Data sharing pattern aware scheduling on grids

Choon Lee Young*, Albert Y. Zomaya

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

9 Citations (Scopus)


An increasing number of applications, especially in science and engineering, deal with massive amounts of data and are therefore data-intensive. Bioinformatics, data mining and image processing are typical areas of data-intensive applications. Such applications tend to be deployed on grids, which provide powerful processing capabilities at reasonable cost. A fundamental scheduling issue that arises when running these applications on grids is the minimization of data transfer. An efficient scheduling scheme that takes data transfers into account is therefore essential to achieve both a shorter application completion time and efficient system utilization. In this paper, a novel scheduling algorithm, called the Shared Input data based Listing (SIL) algorithm, is proposed for data-intensive bag-of-tasks (DBoT) applications in grid environments. The algorithm uses a set of task lists that are constructed taking the data sharing pattern into account and that are reorganized dynamically, based on the performance of resources, during the execution of the application. The primary goal of this dynamic listing is to minimize data transfer and thereby shorten the overall completion time of DBoT applications. SIL further attempts to reduce serious increases in schedule length by adopting task duplication. In our evaluation study, extensive simulation tests with three different types of the DBoT application model have been conducted. Based on the experimental results, SIL noticeably outperforms two previously proposed algorithms in schedule length.
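To illustrate why awareness of shared input data reduces transfers, here is a minimal sketch. It is not the SIL algorithm from the paper: it omits the dynamically reorganized task lists and task duplication, and it assumes each resource caches every file it has received. All names (`assign_tasks`, `round_robin`, `task_inputs`, `resources`) and the sample data are hypothetical.

```python
def assign_tasks(task_inputs, resources):
    """Greedy placement: send each task to the resource that is missing
    the fewest of its input files (ties: lower load, then name)."""
    cache = {r: set() for r in resources}   # files already held per resource
    load = {r: 0 for r in resources}        # number of tasks assigned so far
    assignment = {}
    transfers = 0
    for task, files in task_inputs.items():
        best = min(resources, key=lambda r: (len(files - cache[r]), load[r], r))
        transfers += len(files - cache[best])  # only missing files are sent
        cache[best] |= files
        load[best] += 1
        assignment[task] = best
    return assignment, transfers


def round_robin(task_inputs, resources):
    """Baseline that ignores data sharing; returns the transfer count."""
    cache = {r: set() for r in resources}
    transfers = 0
    for i, files in enumerate(task_inputs.values()):
        r = resources[i % len(resources)]
        transfers += len(files - cache[r])
        cache[r] |= files
    return transfers


# Hypothetical bag of tasks: t1/t2 share files a and b, t3/t4 share c and d.
task_inputs = {
    "t1": {"a", "b"},
    "t2": {"a", "b"},
    "t3": {"c", "d"},
    "t4": {"c", "d"},
}
resources = ["r1", "r2"]

assignment, shared_aware = assign_tasks(task_inputs, resources)
baseline = round_robin(task_inputs, resources)
print(assignment, shared_aware, baseline)
```

In this toy input, the sharing-aware placement co-locates tasks with common inputs and needs half as many file transfers as the round-robin baseline. Real grids add resource heterogeneity and limited storage, which this sketch does not model.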

Original language: English
Title of host publication: ICPP 2006: Proceedings of the 2006 International Conference on Parallel Processing
Place of Publication: Los Alamitos, CA
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 8
ISBN (Print): 0769526365, 9780769526362
Publication status: Published - 2006
Externally published: Yes
Event: ICPP 2006: 2006 International Conference on Parallel Processing - Columbus, OH, United States
Duration: 14 Aug 2006 to 18 Aug 2006


Other: ICPP 2006: 2006 International Conference on Parallel Processing
Country: United States
City: Columbus, OH

