SC21 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Measurement and Analysis of GPU-Accelerated OpenCL Computations on Intel GPUs


Workshop: ProTools 2021: Workshop on Programming and Performance Visualization Tools

Authors: Aaron Cherian, Keren Zhou, Dejan Grubisic, Xiaozhu Meng, and John Mellor-Crummey (Rice University)


Abstract: In this paper, we describe extensions to Rice University's HPCToolkit performance tools that support measurement and analysis of Intel's DPC++ programming model for GPU-accelerated systems atop an implementation of the industry-standard OpenCL framework for heterogeneous parallelism on Intel GPUs. HPCToolkit supports three techniques for performance analysis of programs atop OpenCL on Intel GPUs. First, HPCToolkit supports profiling and tracing of OpenCL kernels. Second, HPCToolkit supports CPU-GPU blame shifting for OpenCL kernel executions, a profiling technique that can identify code that executes on one or more CPUs while GPUs are idle. Third, HPCToolkit supports fine-grained measurement, analysis, and attribution of OpenCL GPU kernels, including instruction counts, execution latency, and SIMD waste. The paper describes these capabilities and then illustrates their application in case studies with two applications that offload computations onto Intel GPUs.







