SC21 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

I/O Bottleneck Detection and Tuning: Connecting the Dots Using Interactive Log Analysis


Workshop: PDSW: Sixth International Parallel Data Systems Workshop

Authors: Jean Luca Bez and Houjun Tang (Lawrence Berkeley National Laboratory (LBNL)), Bing Xie (Oak Ridge National Laboratory (ORNL)), David Williams-Young (Lawrence Berkeley National Laboratory (LBNL)), Rob Latham and Rob Ross (Argonne National Laboratory (ANL)), Sarp Oral (Oak Ridge National Laboratory (ORNL)), and Suren Byna (Lawrence Berkeley National Laboratory (LBNL))


Abstract: Using parallel file systems efficiently is a tricky problem due to inter-dependencies among multiple layers of I/O software, including high-level I/O libraries (HDF5, netCDF, etc.), MPI-IO, POSIX, and file systems (GPFS, Lustre, etc.). Profiling tools such as Darshan collect traces to help understand I/O performance behavior. However, there are significant gaps between analyzing the collected traces and applying the tuning options offered by the various layers of I/O software. Seeking to connect the dots between I/O bottleneck detection and tuning, we propose DXT Explorer, an interactive log analysis tool. In this paper, we present a case study that uses DXT Explorer to identify and apply various I/O optimizations. We evaluate the performance improvements achieved for four I/O kernels extracted from science applications.




