An Adaptive Elasticity Policy For Staging Based In-Situ Processing
Cloud and Distributed Computing
TimeMonday, 15 November 20212:25pm - 2:50pm CST
DescriptionIn-situ processing alleviates the gap between computation and I/O capabilities by performing data analysis close to the data source. With simulation data varying in size and content during workflow execution, it becomes necessary for in-situ processing to support resource elasticity, i.e., the ability to change resource configurations such as the number of computing nodes/processes during workflow execution. An elastic job may dynamically adjust resource configurations; it may use a few resources at the beginning but uses more resources towards the end of the job when interesting data appears. However, it is hard to predict a priori how many computing nodes/processes need to be added/removed during the workflow execution to adapt to changing workflow needs. How to efficiently guide elasticity operations, such as growing or shrinking process quantity used for in-situ processing during workflow execution, is an open-ended research question. In this paper, we present an adaptive elasticity policy that adopts workflow runtime information collected online to predict how to trigger the addition and removal of processes in order to minimize in-situ processing overheads. We integrate the presented elasticity policy into a staging-based elastic workflow and evaluate its efficiency in multiple elasticity scenarios. The results indicate that an adaptive elasticity policy can save overhead in finding a proper resource configuration when compared with a static policy that uses a fixed number of processes for each rescaling operation. Finally, we discuss multiple existing research opportunities of elastic in-situ processing from different aspects.