SC21 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Object Stores for HPC: a Devonian Explosion or an Extinction Event?

Authors: Philippe Deniel (Atomic Energy and Alternative Energies Commission (CEA)), Johann Lombardi (Intel Corporation), John Bent (Seagate Systems), Tiago Quinto (European Centre for Medium-Range Weather Forecasts (ECMWF))

Abstract: The rise of exascale brings major challenges for storage systems, which are reaching their scalability limits with legacy solutions such as POSIX and parallel file systems. As HPC experts rethink data access paths for the data-intensive HPC paradigm, the object store seems a promising path. Both scalable and flexible, it offers new options within the HPC ecosystem and has already become a cornerstone of the cloud. The object storage paradigm is evolving rapidly, but it has many facets. Is it the next stage of evolution for data storage in HPC as we move towards exascale and beyond? Or is it just a passing fad?

Long Description: Exascale supercomputers will bring HPC to a new level: more compute power, higher-precision simulations and more detailed results. The design of such machines will involve tens or even hundreds of thousands of nodes, tens of millions of compute cores and a total RAM of up to several petabytes. From the storage systems point of view, important challenges appear: how to deal with billions or tens of billions of files and directories? How to provide sufficient bandwidth? How to deal with a population of files that will be quite heterogeneous, with a wide range of sizes (a few bytes to several terabytes) and usages (files hosting databases are not to be handled like movie files)? POSIX and the parallel file system model have been used successfully over the past decades, but structural constraints seem to prevent them from scaling to meet exascale requirements.

The object store model looks like a very promising path for efficient access to data. It has very simple semantics that make it both scalable and flexible, making it a base on top of which more complex semantics can be built (such as POSIX itself). The object store was born with the streaming industry and cloud providers. While that community remains tied to S3, tools have appeared in the HPC domain, each with its own specific features: Ceph/RADOS, DAOS, CORTX/MOTR, RED, Scality… Dealing with objects in an exascale context is a complex problem that does not have a unique solution. Each product tackles it from a different angle, which leads to different APIs and behaviors. Object stores for HPC are currently undergoing a plentiful evolution, similar to the evolution of life during the Devonian period. In a world where many de facto standards co-exist, common trends and technical directions have to be identified. Evolution is cruel: only the best-fitted lifeforms survive in the end, and if managed badly the object store model may experience an “extinction event”.
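To illustrate the claim that richer semantics such as a POSIX-like namespace can be layered on top of simple object semantics, here is a minimal sketch in Python. It is not the design of any product named above: the `ObjectStore` and `PathLayer` classes, their method names and the `dir:`/`file:` key prefixes are all hypothetical, the store is an in-memory dictionary, and concurrency, permissions and error handling are ignored.

```python
import json


class ObjectStore:
    """Hypothetical in-memory object store: a flat map from key to bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        del self._objects[key]


class PathLayer:
    """Hypothetical file-system-like namespace built only from put/get."""

    def __init__(self, store):
        self.store = store
        # The root directory is an object holding an empty list of names.
        self._save_dir("/", [])

    def _load_dir(self, path):
        return json.loads(self.store.get("dir:" + path).decode())

    def _save_dir(self, path, names):
        self.store.put("dir:" + path, json.dumps(sorted(names)).encode())

    def _split(self, path):
        parent, _, name = path.rstrip("/").rpartition("/")
        return (parent or "/"), name

    def _link(self, path):
        # Record the entry name in the parent directory object.
        parent, name = self._split(path)
        names = self._load_dir(parent)  # KeyError if the parent is missing
        if name not in names:
            names.append(name)
        self._save_dir(parent, names)

    def mkdir(self, path):
        # Simplification: calling mkdir on an existing directory empties it.
        self._link(path)
        self._save_dir(path, [])

    def write_file(self, path, data):
        self._link(path)
        self.store.put("file:" + path, data)

    def read_file(self, path):
        return self.store.get("file:" + path)

    def listdir(self, path):
        return self._load_dir(path)


fs = PathLayer(ObjectStore())
fs.mkdir("/results")
fs.write_file("/results/run1.dat", b"42")
print(fs.listdir("/results"))
print(fs.read_file("/results/run1.dat"))
```

In this sketch the directory hierarchy exists only as ordinary objects, so the underlying store never needs to know about paths. A real layer would also have to address atomic updates of directory objects, which is one place where the products listed above can differ.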

This BoF session gathers actors involved in the object store model: industry vendors, research institutes that use it in production or develop tools based on it, and owners of large-scale use cases who rely on object stores every day. It will try to identify the major trends and the core requirements that define what an exascale-capable object store should be.

