SC21 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Feasibility of Running Singularity Containers with Hybrid MPI on NASA High-End Computing Resources


Workshop: CANOPIE-HPC: Containers and New Orchestration Paradigms for Isolated Environments in HPC

Authors: Yan-Tyng (Sherry) Chang, Steve Heistand, Robert Hood, and Henry Jin (NASA Ames Research Center)


Abstract: This work investigates the feasibility of a Singularity solution to support running MPI applications in “hybrid” MPI mode on NASA’s HECC resources. Two types of applications were tested: HPC and AI/ML. On the HPC side, two JEDI containers built with Intel MPI for Earth science modeling were tested on both HECC in-house and HECC AWS Cloud CPU resources. On the AI/ML side, an NVIDIA TensorFlow container built with OpenMPI was tested with an NCF recommender system and the ResNet-50 computer image system on the HECC in-house V100 GPUs. Our exercises demonstrate that, although porting containers to run on a single node using just the container MPI is quite straightforward, running across multiple nodes in hybrid MPI mode requires knowledge of Singularity, MPI libraries, the operating system image, and the communication infrastructure, such as the transport and network layers.




