Pilgrim: Scalable and (Near) Lossless MPI Tracing
Time: Wednesday, 17 November 2021, 1:30pm - 2pm CST
Description: Traces of MPI communications are used by many performance analysis and visualization tools. Storing exhaustive traces of large-scale MPI applications is infeasible because of their volume. Aggregated or lossy MPI traces are smaller but provide much less information. In this paper, we present Pilgrim, a near-lossless MPI tracing tool that uses sophisticated compression techniques to generate small trace files at large scales while incurring moderate overheads. Furthermore, for codes with regular communication patterns, Pilgrim can store their traces in constant space regardless of the problem size, the number of processors, and the number of iterations. In comparison with existing tools, Pilgrim preserved more information in less space for all programs we tested.