A spatiotemporal compression based approach for efficient big data processing on Cloud

Chi Yang*, Xuyun Zhang, Changmin Zhong, Chang Liu, Jian Pei, Kotagiri Ramamohanarao, Jinjun Chen

*Corresponding author for this work

Research output: Contribution to journal (Article)

52 Citations (Scopus)
3 Downloads (Pure)

Abstract

It is well known that processing big graph data on Cloud can be costly. Such processing involves multiple complex iterations, which raise challenges such as parallel memory bottlenecks, deadlocks, and inefficiency. To tackle these challenges, we propose a novel technique for effectively processing big graph data on Cloud. Specifically, the big data is compressed according to its spatiotemporal features. By exploring spatial data correlation, we partition a graph data set into clusters. Within a cluster, the workload can be shared through inference based on time series similarity. By exploiting temporal correlation, temporal data compression is conducted on each time series, that is, on each single graph edge. A novel data-driven scheduling approach is also developed to optimise data processing. The experimental results demonstrate that the spatiotemporal compression and scheduling achieve significant performance gains in terms of data size and data fidelity loss.
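The two compression ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the greedy correlation-based grouping, the tolerance-based point dropping, the function names, and the threshold values are all assumptions chosen only to show what "clustering by time series similarity" and "temporal compression of a single edge's series" might look like in their simplest form.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def cluster_by_similarity(series, threshold=0.9):
    """Hypothetical spatial step: greedily group edges whose series correlate.

    Each series joins the first cluster whose representative (its first
    member) it correlates with at or above `threshold`; otherwise it
    starts a new cluster.
    """
    clusters = []
    for key, values in series.items():
        for cluster in clusters:
            if pearson(series[cluster[0]], values) >= threshold:
                cluster.append(key)
                break
        else:
            clusters.append([key])
    return clusters


def compress_temporal(values, tolerance=0.5):
    """Hypothetical temporal step: keep a point only when it moves more than
    `tolerance` away from the last kept value. Returns (index, value) pairs."""
    if not values:
        return []
    kept = [(0, values[0])]
    for i, v in enumerate(values[1:], start=1):
        if abs(v - kept[-1][1]) > tolerance:
            kept.append((i, v))
    return kept


# Illustrative data only: three graph edges, each carrying a short time series.
edges = {"e1": [1, 2, 3, 4], "e2": [2, 4, 6, 8], "e3": [4, 3, 2, 1]}
clusters = cluster_by_similarity(edges)
compressed = compress_temporal([10.0, 10.1, 10.2, 12.0, 12.1, 9.0])
```

In this sketch, a larger `tolerance` shrinks the stored series further at the cost of fidelity, which mirrors the trade-off between data size and data fidelity loss mentioned in the abstract.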

Original language: English
Pages (from-to): 1563-1583
Number of pages: 21
Journal: Journal of Computer and System Sciences
Volume: 80
Issue number: 8
DOIs
Publication status: Published - Dec 2014
Externally published: Yes

Bibliographical note

Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Keywords

  • Big data
  • Graph data
  • Spatiotemporal compression
  • Cloud computing
  • Scheduling
  • Data sets
  • MapReduce
