Unbalanced Parallel I/O: An Often-Neglected Side Effect of Lossy Scientific Data Compression
Time: Sunday, 14 November 2021, 2:30pm - 3pm CST
Description: Lossy compression techniques have demonstrated promising results in significantly reducing scientific data size while guaranteeing compression error bounds. However, one important yet often neglected side effect of lossy scientific data compression is its impact on the performance of parallel I/O. Our key observation is that the compressed data size is often highly skewed across processes in lossy scientific compression. To understand this behavior, we apply three lossy compressors, which are specifically designed and optimized for scientific data, to three real-world scientific applications. Our analysis demonstrates that the sizes of compressed data are always skewed even if the original data is evenly decomposed among processes. We then systematically study this side effect and observe that the skewness in the sizes of the compressed data often leads to I/O imbalance, which can significantly reduce the efficiency of I/O bandwidth utilization if not properly handled.
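
As a rough illustration of why skewed compressed sizes waste I/O bandwidth, the sketch below computes two simple quantities from per-process compressed sizes. It is not taken from the talk. It assumes a deliberately simplified model in which every process writes at the same rate and the write phase ends when the slowest process finishes. The function names and the sample sizes are hypothetical.

```python
def imbalance_factor(sizes):
    """Largest per-process size divided by the mean size.

    A value of 1.0 means every process holds the same amount of data.
    """
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean


def bandwidth_efficiency(sizes):
    """Fraction of aggregate bandwidth used under the simplified model.

    Assumption: all processes write at the same rate and the write phase
    lasts as long as the largest write, so efficiency is mean / max.
    """
    mean = sum(sizes) / len(sizes)
    return mean / max(sizes)


if __name__ == "__main__":
    # Hypothetical compressed sizes in MB for four processes that each
    # started with the same amount of uncompressed data.
    compressed_mb = [10, 40, 20, 10]
    print("imbalance factor:", imbalance_factor(compressed_mb))
    print("bandwidth efficiency:", bandwidth_efficiency(compressed_mb))
```

Under this model, the process holding the largest compressed block determines the length of the write phase, so the more skewed the sizes, the longer the other processes sit idle.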