Evaluation of Performance Portability of Applications and Mini-Apps across AMD, Intel, and NVIDIA GPUs
Parallel Programming Languages and Models
Time: Sunday, 14 November 2021, 2pm - 2:30pm CST
Description: This paper evaluates the progress made toward performance portability by ECP applications, or their proxy applications, across a diverse spectrum of application domains and approaches to achieving performance portability. The applications or proxy apps evaluated are AMR-Wind, HACC, SW4, GAMESS RI-MP2, XSBench, and TestSNAP. These codes are being redeveloped using the SYCL, OpenMP, RAJA, or Kokkos programming models, or the AMReX framework. In this paper we assess their performance portability across the AMD MI100, Intel Gen9, and NVIDIA A100 GPUs. Since each GPU has different performance characteristics, we use the roofline performance model to compute the performance efficiency on each platform and evaluate performance portability across the platforms. The merits of different metrics for quantifying performance portability are considered, and a metric based on the standard deviation of roofline efficiencies is proposed as the preferred metric. Finally, observations on developer productivity are made based on the experience gained working with these applications.
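To make the roofline-efficiency idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes the textbook roofline bound, min(peak FLOP rate, arithmetic intensity x peak memory bandwidth), and defines efficiency as achieved performance divided by that bound. The platform names and all numbers are hypothetical placeholders, not measurements from the paper. The abstract does not give the exact form of the proposed standard-deviation metric, so the sketch only reports the mean and standard deviation of the efficiencies, alongside the harmonic mean as one commonly used alternative.

```python
from statistics import harmonic_mean, mean, pstdev


def roofline_bound(arithmetic_intensity, peak_flops, peak_bandwidth):
    """Attainable FLOP/s under the textbook roofline model."""
    return min(peak_flops, arithmetic_intensity * peak_bandwidth)


def roofline_efficiency(achieved_flops, arithmetic_intensity,
                        peak_flops, peak_bandwidth):
    """Fraction of the roofline bound that the kernel actually achieves."""
    bound = roofline_bound(arithmetic_intensity, peak_flops, peak_bandwidth)
    return achieved_flops / bound


# Hypothetical platforms and numbers, chosen only for illustration.
# Units: FLOP/s, FLOP/byte, FLOP/s, byte/s.
platforms = {
    "gpu_a": dict(achieved_flops=1.5e12, arithmetic_intensity=2.0,
                  peak_flops=10.0e12, peak_bandwidth=1.0e12),
    "gpu_b": dict(achieved_flops=0.03e12, arithmetic_intensity=2.0,
                  peak_flops=0.4e12, peak_bandwidth=0.03e12),
    "gpu_c": dict(achieved_flops=1.0e12, arithmetic_intensity=20.0,
                  peak_flops=4.0e12, peak_bandwidth=1.0e12),
}

efficiencies = {name: roofline_efficiency(**p) for name, p in platforms.items()}
values = list(efficiencies.values())

summary = {
    "mean_efficiency": mean(values),
    # Spread of efficiencies across platforms; a smaller spread means
    # more uniform behavior. How the paper turns this into a final
    # score is not stated in the abstract.
    "stdev_efficiency": pstdev(values),
    # Harmonic mean of efficiencies, shown only for comparison.
    "harmonic_mean_efficiency": harmonic_mean(values),
}

if __name__ == "__main__":
    for name, eff in efficiencies.items():
        print(f"{name}: roofline efficiency = {eff:.2f}")
    for key, value in summary.items():
        print(f"{key}: {value:.3f}")
```

Normalizing by the roofline bound rather than by raw peak FLOP rate lets memory-bound and compute-bound codes be compared on the same 0 to 1 scale, which is what makes a cross-platform spread meaningful.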