Rescheduling for reliable job completion with the support of clouds

Young Choon Lee, Albert Y. Zomaya

Research output: Contribution to journal › Article

56 Citations (Scopus)


A major performance issue in large-scale decentralized distributed systems, such as grids, is how to ensure that jobs finish within their estimated completion times despite fluctuations in resource performance. Several techniques, including advance reservation, rescheduling and migration, have been adopted to resolve or alleviate this issue; however, they face non-negligible practical hurdles. Clouds may be an attractive alternative, since cloud resources are much more reliable than grid resources. This paper investigates the effectiveness of rescheduling using cloud resources to increase the reliability of job completion. Specifically, schedules are initially generated using grid resources, and the relatively costlier cloud resources are used only for rescheduling to cope with a delay in job completion. A job in our study is a bag-of-tasks (BoT) application that consists of a large number of independent tasks; this job model is common in many science and engineering applications. We have devised a novel rescheduling technique, called rescheduling using clouds for reliable completion (RC2), and applied it to three well-known existing heuristics. Our experimental results reveal that RC2 significantly reduces delay in job completion.
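
The grid-first, cloud-on-delay idea in the abstract can be illustrated with a minimal Python sketch. This is not the RC2 algorithm itself, whose details are not given here. The `Task` fields, the `slowdown` factor, the `deadline` parameter, and the assumption that a cloud resource runs a task in exactly its estimated time are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    estimated: float  # estimated run time on its grid resource (hypothetical units)
    slowdown: float   # assumed fluctuation factor: actual grid time = estimated * slowdown


def reschedule(tasks, deadline):
    """Decide, per independent task, whether it stays on the grid or moves to the cloud.

    Assumption (not from the paper): a cloud resource runs a task in exactly
    its estimated time, while a grid resource may be slowed by `slowdown`.
    """
    placement = {}
    for task in tasks:
        grid_time = task.estimated * task.slowdown
        # Use the costlier cloud only when the grid would miss the deadline.
        placement[task.name] = "grid" if grid_time <= deadline else "cloud"
    return placement


def makespan(tasks, placement):
    """Completion time of the whole bag of tasks: the slowest task decides."""
    return max(
        t.estimated * (t.slowdown if placement[t.name] == "grid" else 1.0)
        for t in tasks
    )


bag = [Task("t1", 10, 1.0), Task("t2", 10, 1.5), Task("t3", 8, 1.2)]
all_grid = {t.name: "grid" for t in bag}
mixed = reschedule(bag, deadline=10)
print(makespan(bag, all_grid), makespan(bag, mixed), mixed)
```

In this toy bag only `t2` would overrun on the grid, so it is the only task moved to the cloud. That reflects the cost argument in the abstract: cloud resources are reserved for tasks whose delay threatens job completion.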

Original language: English
Pages (from-to): 1192-1199
Number of pages: 8
Journal: Future Generation Computer Systems
Issue number: 8
Publication status: Published - Oct 2010
Externally published: Yes
