SC21 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Beyond the Hype: Is There a Typical AI/ML Storage Workload?

Moderator: Dean Hildebrand (Google LLC)

Panelists: Ana Klimovic (ETH Zürich), Feiyi Wang (Oak Ridge National Laboratory (ORNL)), Anthony Kougkas (Illinois Institute of Technology), Chris (CJ) Newburn (NVIDIA Corporation), Abdulrahman Salem (Google LLC)

Abstract: Is your storage really optimized for AI/ML? What does that even mean? There are many claims about what AI/ML needs from storage, but very few well-defined workload and technology requirements. Traditional HPC parallel file systems are optimized for writing large checkpoints, and cloud object stores are optimized for storing massive datasets, yet both are somehow supporting large-scale AI/ML workloads. Are these two very different storage systems really optimized for an I/O workload that didn’t even exist a few years ago? This panel brings together experienced professionals from academia, the US national labs, and industry to discuss the current state of storage for AI/ML and to pin down the elusive AI/ML storage requirements, drawing on their experience using these systems to support AI/ML workloads. The moderator is Dean Hildebrand (Google) and the deputy moderator is Jay Lofstead (Sandia National Laboratories).

