On The Experience of Long Time Data Collection on SuperMUC to Drive Energy Efficiency
Event Type
Workshop
Tags
Architectures
Data Analytics
Datacenter
Emerging Technologies
Extreme Scale Computing
Heterogeneous Systems
HPC Community Collaboration
Machine Learning and Artificial Intelligence
Performance
Resource Management and Scheduling
System Administration
System Software and Runtime Systems
Registration Categories
W
Time
Friday, 19 November 2021
8:40am - 9:10am CST
Location
230-231-232
Description
On our path to exascale, HPC systems are starting to push the limits of what is technically, economically and politically feasible in energy and power consumption. Increasing energy efficiency is therefore one of the most important design goals of HPC architectures and a central goal of most HPC system co-design efforts. However, energy efficiency can only be achieved, and later evaluated, if we first understand the energy and power consumption of both current and past architectures and their applications. For the SuperMUC series of HPC systems at the Leibniz Supercomputing Centre in Garching, Germany, we therefore implemented a comprehensive monitoring system (the Data Center Data Base and the PerSyst framework) as well as a matching analytics engine (Wintermute). Together with comprehensive system instrumentation, these enable us to track the energy efficiency of all SuperMUC systems over the long term. In this talk, I will discuss our infrastructure, the data collected, the insights we were able to gain, and how we were able to use the data to influence machine design and operation.