Feasibility of Running Singularity Containers with Hybrid MPI on NASA High-End Computing Resources
Event Type
Workshop
Tags
Applications
Emerging Technologies
Reproducibility and Transparency
Software Engineering
System Software and Runtime Systems
Workflows
Registration Categories
W
Time
Sunday, 14 November 2021, 2pm - 2:30pm CST
Location
231-232
Description
This work investigates the feasibility of using Singularity to run MPI applications in “hybrid” MPI mode on NASA’s HECC resources. Two types of applications were tested: HPC and AI/ML. On the HPC side, two JEDI containers built with Intel MPI for Earth science modeling were tested on both HECC in-house and HECC AWS Cloud CPU resources. On the AI/ML side, an NVIDIA TensorFlow container built with OpenMPI was tested with an NCF recommender system and the ResNet-50 image classification model on the HECC in-house V100 GPUs. Our exercises demonstrate that porting containers to run on a single node using just the container MPI is quite straightforward, whereas running across multiple nodes in hybrid MPI mode requires knowledge of Singularity, MPI libraries, the operating system image, and the communication infrastructure, such as the transport and network layers.