Workload characteristic oriented scheduler for MapReduce

Peng Lu*, Young Choon Lee, Chen Wang, Bing Bing Zhou, Junliang Chen, Albert Y. Zomaya

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

28 Citations (Scopus)

Abstract

Applications in many areas are increasingly developed with, or ported to, the MapReduce framework (more specifically, Hadoop) to exploit data parallelism. The application scope of MapReduce has thus extended beyond its original design goal of large-scale data processing. This extension creates a need for the scheduler to take job characteristics explicitly into account in pursuit of two main goals: efficient resource use and performance improvement. In this paper, we study MapReduce scheduling strategies that deal effectively with two different workload characteristics: CPU-intensive and I/O-intensive. We present the Workload Characteristic Oriented Scheduler (WCO), which strives to co-locate tasks of possibly different MapReduce jobs that have complementary resource usage characteristics. WCO makes dynamic and adaptive scheduling decisions using information obtained from its characteristic estimator. Workload characteristics of tasks are estimated primarily by sampling, with the help of static task selection strategies such as Java bytecode analysis. Results from extensive experiments using 11 benchmarks on a 4-node local cluster and a 51-node Amazon EC2 cluster show an average throughput improvement of 17% when diverse workloads co-exist.
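
The co-location idea in the abstract can be illustrated with a short sketch. This is not the WCO algorithm from the paper, only a minimal illustration under stated assumptions: each task carries a sampled CPU utilisation and I/O wait fraction (hypothetical metrics), a task is labelled CPU-intensive when its CPU utilisation is at least its I/O wait, and each node's slots are filled by alternating between the two classes. The names `Task`, `classify`, and `co_locate` are invented for this example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    job: str
    cpu_util: float  # sampled CPU utilisation in [0, 1] (hypothetical metric)
    io_wait: float   # sampled I/O wait fraction in [0, 1] (hypothetical metric)


def classify(task: Task) -> str:
    """Label a task from its sampled profile (a simplistic rule, assumed here)."""
    return "cpu" if task.cpu_util >= task.io_wait else "io"


def co_locate(tasks, nodes, slots_per_node=2):
    """Fill each node's slots by alternating between CPU- and I/O-intensive tasks.

    Tasks that do not fit in the available slots are left unplaced.
    """
    queues = {"cpu": deque(), "io": deque()}
    for task in tasks:
        queues[classify(task)].append(task)

    placement = {node: [] for node in nodes}
    for node in nodes:
        want = "cpu"
        while len(placement[node]) < slots_per_node and (queues["cpu"] or queues["io"]):
            other = "io" if want == "cpu" else "cpu"
            # Prefer the class we want next; fall back to the other if it is empty.
            kind = want if queues[want] else other
            placement[node].append(queues[kind].popleft())
            want = other
    return placement
```

With two slots per node, a node in this sketch receives one CPU-intensive and one I/O-intensive task whenever both classes are available, so the two tasks contend less for the same resource than two tasks of the same class would.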

Original language: English
Title of host publication: Proceedings of the 2012 IEEE 18th International Conference on Parallel and Distributed Systems, ICPADS 2012
Place of Publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 156-163
Number of pages: 8
ISBN (Print): 9780769549033
Publication status: Published - 2012
Externally published: Yes
Event: 18th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2012 - Singapore, Singapore
Duration: 17 Dec 2012 to 19 Dec 2012
