SC21 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

On The Experience of Long Time Data Collection on SuperMUC to Drive Energy Efficiency


Workshop: First International Symposium on Quantitative Codesign of Supercomputers

Authors: Martin Schulz (Technical University Munich)


Abstract: On our path to exascale, HPC systems are starting to push the limits of what is technically, economically, and politically feasible in energy and power consumption. Increasing energy efficiency is therefore one of the most important design goals of HPC architectures and a central goal of most HPC system co-design efforts. However, energy efficiency can only be achieved, and later evaluated, if we first understand the energy and power consumption of both current and past architectures and their applications. For the SuperMUC series of HPC systems at the Leibniz Supercomputing Centre in Garching, Germany, we therefore implemented a comprehensive monitoring system (the Data Center Data Base and the PerSyst framework) as well as a matching analytics engine (Wintermute). Together with comprehensive system instrumentation, these enable us to track the energy efficiency of all SuperMUC systems over the long term. In this talk, I will discuss our infrastructure, the data collected, the insights we were able to gain, and how we were able to use the data to affect machine design and operation.

