SC21 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

An Interactive GPU Metric Dashboard for HPC clusters

Authors: Wentao Shi (Louisiana State University); Brandon Cook, Arghya Chatterjee, and Johannes Blaschke (Lawrence Berkeley National Laboratory (LBNL))

Abstract: Scaling a program to run across a modern high-performance computing (HPC) center can be a daunting task. One of the first questions developers ask is: “Is my program using all the hardware available?” Many tools can extract detailed performance data from applications, but that level of detail comes at a cost: significant resources and time must be invested in collecting and analyzing the data. For a question like “Am I using all four GPUs per node?”, this level of detail is overkill. This project therefore aims to give developers a user-friendly application for a quick look at their program’s GPU utilization.

Best Poster Finalist (BP): no

