Non-intrusive slot layering in Hadoop

Peng Lu, Young Choon Lee, Albert Y. Zomaya

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

8 Citations (Scopus)

Abstract

Hadoop, an open source implementation of MapReduce, uses slots to represent resource sharing. The number of slots in a Hadoop cluster node specifies the concurrency of task execution. Thus, the slot configuration has a significant impact on performance. The number of slots is by default hand-configured (static) and slots share resources "fairly". As resource capacity (e.g., the number of cores) continues to increase and application dynamics become increasingly diverse, the current practices of static slot configuration and fair resource sharing may not utilize resources efficiently. Moreover, such fair sharing works against priority-based scheduling when high-priority jobs share resources with lower-priority jobs. In this paper we study the optimization of resource utilization in Hadoop, focusing on these two issues of current practice, and present a non-intrusive slot layering solution. In essence, our approach uses two tiers of slots (Active and Passive) to increase the degree of concurrency with minimal performance interference between them. Tasks in the Passive slots proceed with their execution when tasks in the Active slots are not fully using the (CPU) resource, and tasks/slots in these tiers are managed dynamically and adaptively. To leverage the effectiveness of slot layering, we develop a layering-aware task scheduler. Our non-intrusive slot layering approach is unique in that (1) it is a generic way to manage resource sharing for parallel and distributed computing models (e.g., MPI and cloud computing) and (2) both overall throughput and high-priority job performance are improved. Our experimental results with 6 representative jobs show a 3%-34% improvement in overall throughput and a 13%-48% decrease in the execution time of high-priority jobs compared with static configurations.

Original language: English
Title of host publication: Proceedings - 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2013
Editors: Pavan Balaji, Dick Epema, Thomas Fahringer
Place of Publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 253-260
Number of pages: 8
DOIs
Publication status: Published - 2013
Externally published: Yes
Event: 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2013 - Delft, Netherlands
Duration: 13 May 2013 to 16 May 2013

