TY - JOUR
T1 - Rescheduling for reliable job completion with the support of clouds
AU - Lee, Young Choon
AU - Zomaya, Albert Y.
PY - 2010/10
Y1 - 2010/10
N2 - A major performance issue in large-scale decentralized distributed systems, such as grids, is how to ensure that jobs finish their execution within the estimated completion times in the presence of resource performance fluctuations. Previously, several techniques including advance reservation, rescheduling and migration have been adopted to resolve/relieve this issue; however, they have some non-negligible practicality hurdles. The use of clouds may be an attractive alternative, since resources in clouds are much more reliable than those in grids. This paper investigates the effectiveness of rescheduling using cloud resources to increase the reliability of job completion. Specifically, schedules are initially generated using grid resources, and cloud resources (relatively costlier) are used only for rescheduling to cope with a delay in job completion. A job in our study refers to a bag-of-tasks (BoT) application that consists of a large number of independent tasks; this job model is common in many science and engineering applications. We have devised a novel rescheduling technique, called rescheduling using clouds for reliable completion (RC2), and applied it to three well-known existing heuristics. Our experimental results reveal that RC2 significantly reduces delay in job completion.
AB - A major performance issue in large-scale decentralized distributed systems, such as grids, is how to ensure that jobs finish their execution within the estimated completion times in the presence of resource performance fluctuations. Previously, several techniques including advance reservation, rescheduling and migration have been adopted to resolve/relieve this issue; however, they have some non-negligible practicality hurdles. The use of clouds may be an attractive alternative, since resources in clouds are much more reliable than those in grids. This paper investigates the effectiveness of rescheduling using cloud resources to increase the reliability of job completion. Specifically, schedules are initially generated using grid resources, and cloud resources (relatively costlier) are used only for rescheduling to cope with a delay in job completion. A job in our study refers to a bag-of-tasks (BoT) application that consists of a large number of independent tasks; this job model is common in many science and engineering applications. We have devised a novel rescheduling technique, called rescheduling using clouds for reliable completion (RC2), and applied it to three well-known existing heuristics. Our experimental results reveal that RC2 significantly reduces delay in job completion.
UR - http://www.scopus.com/inward/record.url?scp=77955514224&partnerID=8YFLogxK
U2 - 10.1016/j.future.2010.02.010
DO - 10.1016/j.future.2010.02.010
M3 - Article
AN - SCOPUS:77955514224
SN - 0167-739X
VL - 26
SP - 1192
EP - 1199
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
IS - 8
ER -