Abstract
It is well known that processing big graph data on the Cloud can be costly. Such processing involves multiple complex iterations, which raise challenges such as parallel memory bottlenecks, deadlocks, and inefficiency. To tackle these challenges, we propose a novel technique for effectively processing big graph data on the Cloud. Specifically, the big data is compressed on the Cloud according to its spatiotemporal features. By exploring spatial data correlation, we partition a graph data set into clusters. Within a cluster, the workload can be shared through inference based on time series similarity. By exploiting temporal correlation, temporal data compression is conducted in each time series or single graph edge. A novel data-driven scheduling method is also developed for data processing optimisation. The experimental results demonstrate that the spatiotemporal compression and scheduling achieve significant performance gains in terms of data size and data fidelity loss.
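The two compression ideas named in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the Pearson-correlation grouping, the dead-band compression rule, the function names, the thresholds, and the sample data are all assumptions chosen only to make the idea concrete. Here each graph edge is assumed to carry one time series; similar series are grouped so that a cluster can be represented by one member, and each series is then thinned along the time axis.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def cluster_by_similarity(series, threshold=0.9):
    """Greedy grouping (an assumed stand-in for the paper's clustering):
    a series joins the first cluster whose representative, i.e. its first
    member, correlates with it at or above `threshold`."""
    clusters = []  # each cluster is a list of edge keys; index 0 is the representative
    for key, values in series.items():
        for members in clusters:
            if pearson(series[members[0]], values) >= threshold:
                members.append(key)
                break
        else:
            clusters.append([key])
    return clusters


def compress_temporal(values, tolerance=0.5):
    """Dead-band compression (an assumed rule): keep (index, value) only when
    the value moves more than `tolerance` away from the last kept value."""
    kept = [(0, values[0])]
    for i, v in enumerate(values[1:], start=1):
        if abs(v - kept[-1][1]) > tolerance:
            kept.append((i, v))
    return kept


# Hypothetical per-edge time series.
edges = {
    "e1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "e2": [2.0, 4.1, 6.0, 8.2, 10.0],
    "e3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
clusters = cluster_by_similarity(edges)
compressed = compress_temporal([1.0, 1.1, 1.2, 3.0, 3.1, 5.0])
```

In this sketch, `tolerance` trades data size against fidelity loss: a larger value keeps fewer samples but reconstructs the series less accurately, which mirrors the two metrics the abstract reports.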
Original language | English |
---|---|
Pages (from-to) | 1563-1583 |
Number of pages | 21 |
Journal | Journal of Computer and System Sciences |
Volume | 80 |
Issue number | 8 |
DOIs | |
Publication status | Published - Dec 2014 |
Externally published | Yes |
Bibliographical note
Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Keywords
- Big data
- Graph data
- Spatiotemporal compression
- Cloud computing
- Scheduling
- Data sets
- MapReduce