SaC-FRAPP: a scalable and cost-effective framework for privacy preservation over big data on cloud

Xuyun Zhang*, Chang Liu, Surya Nepal, Chi Yang, Wanchun Dou, Jinjun Chen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

18 Citations (Scopus)


Big data and cloud computing are two disruptive trends, providing numerous opportunities for the current information technology industry and research communities while also posing significant challenges to them. Cloud computing provides powerful and economical infrastructural resources for cloud users to handle ever-increasing data sets in big data applications. However, processing or sharing privacy-sensitive data sets on cloud is likely to raise severe privacy concerns because of multi-tenancy. Data encryption and anonymization are two widely adopted ways to combat privacy breaches. However, encryption is not suitable for data that are processed and shared frequently, and anonymizing big data and managing numerous anonymized data sets remain challenges for traditional anonymization approaches. As such, we propose a scalable and cost-effective framework for privacy preservation over big data on cloud in this paper. The key idea of the framework is that it leverages cloud-based MapReduce to conduct data anonymization and manage anonymous data sets before releasing data to others. The framework provides a holistic conceptual foundation for privacy preservation over big data. Further, a corresponding proof-of-concept prototype system is implemented. Empirical evaluations demonstrate that the proposed framework can anonymize large-scale data sets and manage anonymous data sets in a highly flexible, scalable, efficient, and cost-effective fashion.

Original language: English
Pages (from-to): 2561-2576
Number of pages: 16
Journal: Concurrency Computation Practice and Experience
Issue number: 18
Publication status: Published - 25 Dec 2013
Externally published: Yes
Event: International Conference on Cloud and Green Computing (CGC) - Sydney, Australia
Duration: 11 Dec 2011 to 13 Dec 2011


  • big data
  • cloud
  • privacy preservation
  • framework
  • anonymization
