Abstract
Graph processing has attracted much attention in recent years, particularly since the development of Google's Pregel, which was followed by the open-source counterparts Apache Giraph and GraphLab. These systems enable the distributed processing of large and complex graphs, such as web graphs and social networks. However, the efficacy of such distributed processing depends heavily on resource provisioning, even in clouds with increasingly abundant resources. In this paper, we present resource provisioning models for memory-intensive graph processing applications. In particular, we profile their memory usage patterns, taking their types and sizes into account. This profiling model makes it possible to determine the "right" number of resources and workers (or containers in a graph processing framework). Because the appropriate provisioning level depends on the user's objective, we further provide a model that identifies the Pareto frontier of resource provisioning trade-offs between performance and cost. We use a graph drawing application (GILA [4]), implemented on Apache Giraph and Hadoop YARN, as a case study. Experimental results demonstrate a performance increase of 15% to 35%, with a cost trade-off, through the optimization of worker count and Pareto-optimal resource selection.
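To illustrate the notion of a Pareto frontier between performance and cost, the following is a minimal sketch, not the model from the paper. It assumes each provisioning option has already been measured as a (runtime, cost) pair; the `Config` type, the `pareto_frontier` function, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """Hypothetical provisioning option with measured outcomes."""
    name: str         # illustrative label, e.g. a worker count
    runtime_s: float  # job runtime in seconds (lower is better)
    cost: float       # monetary cost of the run (lower is better)


def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Return the options that no other option beats on both runtime and cost."""
    frontier = []
    best_cost = float("inf")
    # Sweep in order of increasing runtime (ties broken by cost). An option is
    # kept only if it is strictly cheaper than every faster-or-equal option.
    for c in sorted(configs, key=lambda c: (c.runtime_s, c.cost)):
        if c.cost < best_cost:
            frontier.append(c)
            best_cost = c.cost
    return frontier


# Made-up measurements, for illustration only.
configs = [
    Config("4 workers", 900.0, 4.0),
    Config("8 workers", 500.0, 5.0),
    Config("12 workers", 520.0, 7.5),
    Config("16 workers", 400.0, 8.0),
    Config("20 workers", 410.0, 10.0),
]

for c in pareto_frontier(configs):
    print(c.name, c.runtime_s, c.cost)
```

Options left off the frontier are dominated: some other option is at least as fast and at least as cheap. Which frontier point to choose then depends on the user's objective, for example the fastest option within a budget.
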
Original language | English |
---|---|
Title of host publication | ACSW '18: Proceedings of the Australasian Computer Science Week Multiconference |
Place of Publication | New York |
Publisher | Association for Computing Machinery |
Number of pages | 7 |
ISBN (Electronic) | 9781450354363 |
Publication status | Published - 29 Jan 2018 |
Event | 2018 Australasian Computer Science Week Multiconference, ACSW 2018 - Brisbane, Australia. Duration: 29 Jan 2018 → 2 Feb 2018 |
Conference
Conference | 2018 Australasian Computer Science Week Multiconference, ACSW 2018 |
---|---|
Country/Territory | Australia |
City | Brisbane |
Period | 29/01/18 → 2/02/18 |