SC21 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

An Extended Roofline Performance Model with PCI-E and Network Ceilings


Workshop: PMBS21: The 12th International Workshop on Performance Modeling, Benchmarking and Simulation of High-Performance Computer Systems

Authors: Amanda Dufek, Jack Deslippe, and Paul Lin (Lawrence Berkeley National Laboratory (LBNL)); Charlene Yang (NVIDIA Corporation); Brandon Cook (Lawrence Berkeley National Laboratory (LBNL)); and Jonathan Madsen (Advanced Micro Devices (AMD) Inc)


Abstract: In this work, we evaluate the utility of adding two new diagonal ceilings to the roofline model, one for PCI-E bandwidth and one for effective network bandwidth, to provide insight into how communication impacts the performance of large-scale parallel applications. The roofline performance analysis is based on two benchmark problems: scalar dense matrix addition and a dense symmetric eigen-problem with a complex matrix. The experiments were conducted on the NERSC Cori supercomputer at Lawrence Berkeley National Laboratory, on both the CPU-only and CPU+GPU compute nodes. The study reveals the value of incorporating these two new ceilings into the roofline model, in addition to the existing memory bandwidth and compute ceilings: they ease the identification of performance bottlenecks and better guide the performance optimization process, particularly in the limit of diminishing strong and weak scaling. We highlight the importance of comparing measured application roofline points to ceilings customized for the communication and data-access patterns present. In this way, the effects of both throughput and latency can be captured in the model.
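The idea of extra diagonal ceilings can be illustrated with a minimal sketch. It assumes the standard roofline bound (attainable performance is the minimum of the compute peak and arithmetic intensity times bandwidth) and applies it once per data path, with a separate arithmetic intensity for memory, PCI-E, and network traffic. The function name, the dictionary layout, and all bandwidth and peak figures below are hypothetical placeholders, not values or code from the paper.

```python
def attainable_gflops(peak_gflops, bandwidths_gbs, flops, bytes_moved):
    """Return (bound in GFLOP/s, name of the limiting ceiling).

    peak_gflops    -- horizontal compute ceiling (GFLOP/s)
    bandwidths_gbs -- dict: data path name -> bandwidth (GB/s)
    flops          -- floating-point operations performed
    bytes_moved    -- dict: data path name -> bytes moved over that path
    """
    bounds = {"compute": peak_gflops}
    for path, bandwidth in bandwidths_gbs.items():
        moved = bytes_moved.get(path, 0)
        if moved > 0:
            # Arithmetic intensity for this path, in FLOP per byte.
            intensity = flops / moved
            # Diagonal ceiling: (FLOP/byte) * (GB/s) = GFLOP/s.
            bounds[path] = intensity * bandwidth
    limiter = min(bounds, key=bounds.get)
    return bounds[limiter], limiter


# Hypothetical node: all numbers are illustrative assumptions.
bandwidths = {"memory": 100.0, "pcie": 16.0, "network": 10.0}

# Matrix addition in double precision: one FLOP per element,
# 24 bytes per element (two reads and one write), all of it
# assumed to also cross PCI-E, with no network traffic.
n = 10**9
perf, limiter = attainable_gflops(
    peak_gflops=1000.0,
    bandwidths_gbs=bandwidths,
    flops=n,
    bytes_moved={"memory": 24 * n, "pcie": 24 * n},
)
print(limiter, perf)
```

Under these assumed numbers the PCI-E ceiling lies below the memory ceiling, so a kernel that looks memory-bound in the classic model is in fact limited by host-device transfers. A fixed bandwidth per path captures throughput only; modeling latency would require an effective bandwidth that depends on message size.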




