Authors: Jiri Jaros (Brno University of Technology, Czech Republic)
Abstract: In C++ applications, error states are handled with exceptions. In distributed applications, the other processes must also be informed that something went wrong, so that the application can either recover from the faulty state or report the error and terminate gracefully. Unfortunately, the MPI standard does not provide any support for distributed error handling.
This poster presents a new approach to exception handling in MPI applications. The goals are to report any faulty state to the user in a nicely formatted way from just a single rank, to ensure the application never deadlocks, and to propose a simple interface and ensure interoperability with other C/C++ libraries. The code was tested with several errors injected into multiple ranks, such as a non-existent input file, an exceeded disk quota, a wrong rank in an MPI call, and standard system exceptions. In all situations the code worked properly.
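The goals above (catch locally, agree globally, report from a single rank) can be illustrated with a minimal sketch. It is not the poster's implementation: the names `RankStatus`, `runGuarded`, `electReporter`, and `formatReport` are hypothetical, and the per-rank statuses are held in a plain vector so the logic runs without an MPI installation. In a real MPI program, the agreement step would presumably be a collective such as `MPI_Allreduce` with `MPI_MIN` over the ranks that failed, which every rank must reach so that none is left waiting.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical per-rank outcome: a failure flag plus the caught message.
struct RankStatus {
    bool failed = false;
    std::string message;
};

// Run one rank's work and turn any exception into a status value,
// so the rank can still take part in the agreement step afterwards.
RankStatus runGuarded(const std::function<void()>& work) {
    RankStatus status;
    try {
        work();
    } catch (const std::exception& e) {
        status.failed = true;
        status.message = e.what();
    } catch (...) {
        status.failed = true;
        status.message = "unknown exception";
    }
    return status;
}

// Stand-in for the collective agreement (assumption: a vector replaces
// the MPI reduction). Returns the lowest failing rank, or -1 if none failed.
int electReporter(const std::vector<RankStatus>& statuses) {
    for (std::size_t rank = 0; rank < statuses.size(); ++rank) {
        if (statuses[rank].failed) {
            return static_cast<int>(rank);
        }
    }
    return -1;
}

// Build the single message that the elected rank would print.
// Returns an empty string when no rank failed.
std::string formatReport(const std::vector<RankStatus>& statuses) {
    const int reporter = electReporter(statuses);
    if (reporter < 0) {
        return "";
    }
    const std::size_t index = static_cast<std::size_t>(reporter);
    assert(index < statuses.size());
    return "Error on rank " + std::to_string(reporter) + ": " +
           statuses[index].message;
}
```

Electing the lowest failing rank is only one possible choice. What matters for the stated goals is that every rank, failed or not, participates in the same agreement step, so that exactly one report is produced and no rank blocks in a communication call its partner never reaches.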
Best Poster Finalist (BP): no
Poster: PDF
Poster summary: PDF