Remote OpenMP Offloading

SC21 Proceedings

Workshop: LLVM-HPC2021: The Seventh Workshop on the LLVM Compiler Infrastructure in HPC

Authors: Atmn Patel (University of Waterloo) and Johannes Doerfert (Argonne National Laboratory (ANL))

Abstract: In this work, we show that the OpenMP accelerator offloading model is sufficient to seamlessly and efficiently utilize more than a single compute node and its connected accelerators.

Without source code or compiler modifications, we run an OpenMP offload-capable program on a remote CPU or remote accelerator (e.g., a GPU) as if it were a local one. For applications that support multi-device offloading, any combination of local and remote CPUs and accelerators can be utilized simultaneously, fully transparently to the user. Our low-overhead implementation is integrated into the LLVM/OpenMP compiler infrastructure as a plugin and is publicly available (in parts) with LLVM 12 and later.

To evaluate our work, we provide detailed scaling studies for two HPC proxy applications. We show perfect scaling across dozens of GPUs in multiple hosts, with effectiveness proportional to the ratio of computation time to memory transfer time.
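
For readers unfamiliar with multi-device offloading, the following is a minimal sketch of the kind of program the abstract refers to. It is not taken from the paper: the function names `visible_devices` and `saxpy_multi_device` are hypothetical, and the example only uses standard OpenMP constructs (`target`, `device`, `map`, `nowait`). The assumption, based on the abstract's claim of transparency, is that a remote device would show up to such code as one more device number, so the source would not need to change.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: number of offload devices the OpenMP runtime reports.
   Returns at least 1 so the loop below still runs (on the host) when no
   device is present or when the file is compiled without OpenMP. */
static int visible_devices(void) {
#ifdef _OPENMP
    int n = omp_get_num_devices();
    return n > 0 ? n : 1;
#else
    return 1;
#endif
}

/* Hypothetical example kernel: y[i] += a * x[i], with the index range
   split into one contiguous chunk per visible device. */
void saxpy_multi_device(float a, const float *x, float *y, size_t n) {
    int ndev = visible_devices();
    size_t chunk = (n + (size_t)ndev - 1) / (size_t)ndev;

    for (int d = 0; d < ndev; ++d) {
        size_t lo = (size_t)d * chunk;
        if (lo >= n) break;
        size_t len = (n - lo < chunk) ? n - lo : chunk;
        const float *xs = x + lo;
        float *ys = y + lo;

        /* Launch this chunk on device d without waiting, so that the
           chunks for different devices can be in flight at the same time. */
        #pragma omp target teams distribute parallel for device(d) \
                map(to: xs[0:len]) map(tofrom: ys[0:len]) nowait
        for (size_t i = 0; i < len; ++i)
            ys[i] += a * xs[i];
    }

    /* Wait for all outstanding target tasks before returning. */
    #pragma omp taskwait
}
```

Calling `saxpy_multi_device(2.0f, x, y, n)` updates `y` in place. The only device-specific parts are the `device(d)` clause and the `map` clauses, which state what each device needs to receive and send back. This also illustrates the abstract's closing remark: each chunk pays for transferring `xs` and `ys`, so the benefit of adding devices depends on how much computation is done per transferred byte.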