
Toward Access Pattern Aware Checkpointing for Kokkos Applications
Event Type
ACM Student Research Competition: Graduate Poster
ACM Student Research Competition: Undergraduate Poster
Posters
In-Person Only
TP
XO / EX
Time
Thursday, 18 November 2021, 8:30am - 5pm CST
Location
Second Floor Atrium
Description
The common checkpointing philosophy, checkpoint everything as frequently as possible, is becoming ineffective as we progress toward exascale machines, where the time between failures is shrinking. This makes portability and resilience vital for the future of HPC. This poster demonstrates the need for, and lays the foundation for, enhancing checkpointing to take advantage of application properties. Specifically, we show how access pattern aware checkpointing improves performance, using incremental checkpoints of sparsely updated data as an example. We also define how the portable checkpointing abstractions in Kokkos Resilience can be modified to support such an enhancement transparently.
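
As a rough illustration of the incremental checkpointing idea mentioned in the description, the sketch below saves a full baseline once and afterwards records only the blocks of an array that changed. It is a minimal plain C++ sketch under stated assumptions, not the poster's implementation and not the Kokkos or Kokkos Resilience API: the names `IncrementalCheckpointer`, `Delta`, `checkpoint_full`, `checkpoint_incremental`, and `apply_delta` are hypothetical, and changed blocks are found by comparing against a shadow copy, which is a simple stand-in for whatever access pattern tracking the poster actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One changed block: (block index, new contents of that block).
// Hypothetical type for illustration only.
using Delta = std::vector<std::pair<std::size_t, std::vector<double>>>;

// Hypothetical helper, NOT part of Kokkos or Kokkos Resilience.
// Keeps a shadow copy of the last checkpointed state and emits only
// the blocks that differ from it.
class IncrementalCheckpointer {
public:
    explicit IncrementalCheckpointer(std::size_t block_size)
        : block_size_(block_size) {
        assert(block_size > 0);
    }

    // Full checkpoint: remember the entire array as the baseline.
    void checkpoint_full(const std::vector<double>& data) { shadow_ = data; }

    // Incremental checkpoint: compare each block with the shadow copy,
    // record the blocks that changed, and refresh the shadow copy.
    Delta checkpoint_incremental(const std::vector<double>& data) {
        assert(data.size() == shadow_.size());
        Delta delta;
        for (std::size_t begin = 0; begin < data.size(); begin += block_size_) {
            const std::size_t len = std::min(block_size_, data.size() - begin);
            const double* cur = data.data() + begin;
            double* old = shadow_.data() + begin;
            if (!std::equal(cur, cur + len, old)) {
                delta.emplace_back(begin / block_size_,
                                   std::vector<double>(cur, cur + len));
                std::copy(cur, cur + len, old);
            }
        }
        return delta;
    }

private:
    std::size_t block_size_;
    std::vector<double> shadow_;
};

// Recovery: replay a delta on top of a previously restored state.
void apply_delta(std::vector<double>& state, const Delta& delta,
                 std::size_t block_size) {
    for (const auto& entry : delta) {
        const std::size_t begin = entry.first * block_size;
        assert(begin + entry.second.size() <= state.size());
        std::copy(entry.second.begin(), entry.second.end(),
                  state.begin() + static_cast<std::ptrdiff_t>(begin));
    }
}
```

With a 1000-element array split into blocks of 100, updating two elements in different blocks would produce a delta of two blocks (200 values) instead of a full 1000-value checkpoint. The saving grows as updates become sparser relative to the block size.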