Abstract
Scientists in fields such as high-energy physics, earth science, and astronomy are developing large-scale workflow applications. In many use cases, a complete scientific analysis requires running a set of interrelated but independent workflows (i.e., a workflow ensemble). Because a workflow ensemble usually contains many sub-workflows, each with hundreds or thousands of jobs subject to precedence constraints, executing such an ensemble raises serious cost concerns even on elastic, pay-as-you-go cloud resources. In this paper, we address two main challenges in executing large-scale workflow ensembles in public clouds under both cost and deadline constraints: (1) execution coordination and (2) resource provisioning. To this end, we develop a new pulling-based workflow execution system with a profiling-based resource provisioning strategy. The idea is that homogeneity in both scientific workflows and cloud resources can be exploited to remove scheduling overhead (in execution coordination) and to minimize cost while meeting the deadline. Our results show that, by removing scheduling overhead, our system achieves an 80% speed-up over the well-known Pegasus workflow management system when running scientific workflow ensembles. In addition, our evaluation using Montage (an astronomical image mosaic engine) workflow ensembles on Amazon EC2 clusters of around 1000 cores demonstrates the cost effectiveness of our resource provisioning strategy within the deadline.
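As a rough illustration of the pull model named in the abstract, the sketch below simulates idle workers pulling ready jobs from a shared queue instead of waiting for a central scheduler to assign them. It is not the system described in the paper: the function name `run_pull_based`, the job names, and the round-based simulation are all hypothetical and chosen only to show how precedence constraints can be respected under pulling.

```python
from collections import deque


def run_pull_based(jobs, deps, num_workers=2):
    """Simulate workers pulling ready jobs from a shared queue.

    jobs: iterable of job names (hypothetical, for illustration only).
    deps: dict mapping a job to the set of parent jobs it must wait for;
          every parent is assumed to appear in `jobs`.
    Returns a list of rounds; each round lists the jobs pulled in that step.
    """
    # Unfinished parents of each job.
    remaining = {job: set(deps.get(job, ())) for job in jobs}

    # Reverse edges: parent -> children waiting on it.
    children = {job: [] for job in remaining}
    for job, parents in remaining.items():
        for parent in parents:
            children[parent].append(job)

    # Jobs with no parents are ready immediately.
    ready = deque(sorted(j for j, parents in remaining.items() if not parents))

    rounds = []
    while ready:
        # Each idle worker pulls at most one ready job per round.
        pulled = [ready.popleft() for _ in range(min(num_workers, len(ready)))]
        rounds.append(pulled)

        # Finishing a job may release children whose parents are all done.
        for job in pulled:
            for child in children[job]:
                remaining[child].discard(job)
                if not remaining[child]:
                    ready.append(child)
    return rounds


# A tiny, made-up four-job workflow: two independent jobs feed a third,
# which feeds a final job.
example_jobs = ["project_a", "project_b", "diff", "mosaic"]
example_deps = {"diff": {"project_a", "project_b"}, "mosaic": {"diff"}}
print(run_pull_based(example_jobs, example_deps, num_workers=2))
```

In this toy model no component decides which worker runs which job; workers simply take whatever is ready, which is the property the abstract credits with removing scheduling overhead. A real deployment would also need job runtimes, failures, and data movement, none of which are modeled here.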
Original language | English |
---|---|
Title of host publication | Proceedings - 2015 44th International Annual Conference on Parallel Processing, ICPP 2015 |
Place of Publication | Piscataway, NJ |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Pages | 520-529 |
Number of pages | 10 |
ISBN (Electronic) | 9781467375870 |
ISBN (Print) | 9781467375887 |
DOIs | |
Publication status | Published - 2015 |
Event | 44th International Conference on Parallel Processing, ICPP 2015 - Beijing, China; Duration: 1 Sept 2015 → 4 Sept 2015 |
Other
Other | 44th International Conference on Parallel Processing, ICPP 2015 |
---|---|
Country/Territory | China |
City | Beijing |
Period | 1/09/15 → 4/09/15 |