Workshop: WORKS21: 16th Workshop on Workflows in Support of Large-Scale Science
Authors: Devarshi Ghoshal, Ludovico Bianchi, Abdelilah Essiari, Drew Paine, and Sarah S. Poon (Lawrence Berkeley National Laboratory (LBNL)); Michael Beach (University of Washington); and Alpha T. N'Diaye, Patrick Huck, and Lavanya Ramakrishnan (Lawrence Berkeley National Laboratory (LBNL))
Abstract: Workflows are increasingly processing large volumes of data from scientific instruments, experiments, and sensors. These workflows often consist of complex data processing and analysis steps that might involve a human in the loop and use a diverse set of analysis tools. Sharing and reproducing these workflows with collaborators and the larger community is critical but hard to do without the entire context of the workflow, including user notes and the execution environment. In this paper, we introduce Science Capsule, which automatically captures and processes events associated with the execution and data life cycle of workflows and provides ways to enhance that information with user artifacts. It also allows users to create 'workflow snapshots' that keep track of the different versions of a workflow and their lineage, allowing scientists to incrementally share and extend workflows. Our results show that Science Capsule is capable of processing and organizing events in near real time for high-throughput experimental and analysis workflows without incurring significant performance overhead.