Combining top-down and bottom-up: scalable sub-tree anonymization over big data using MapReduce on cloud

Xuyun Zhang*, Chang Liu, Surya Nepal, Chi Yang, Wanchun Dou, Jinjun Chen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

20 Citations (Scopus)

Abstract

In big data applications, data privacy is one of the most pressing concerns, because processing large-scale privacy-sensitive data sets often requires the computation power of public cloud services. Sub-tree data anonymization, which achieves a good trade-off between data utility and distortion, is a widely adopted scheme for anonymizing data sets for privacy preservation. Top-Down Specialization (TDS) and Bottom-Up Generalization (BUG) are two ways to fulfill sub-tree anonymization. However, existing approaches to sub-tree anonymization lack parallelization capability and therefore do not scale to big data on cloud. Moreover, both TDS and BUG perform poorly for certain values of the k-anonymity parameter when used individually. In this paper, we propose a hybrid approach that combines TDS and BUG for efficient sub-tree anonymization over big data. Further, we design MapReduce-based algorithms for the two components (TDS and BUG) to gain high scalability by exploiting the powerful computation capability of cloud. Experimental evaluations demonstrate that the hybrid approach significantly improves the scalability and efficiency of the sub-tree anonymization scheme over existing approaches.
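
As a rough illustration of the terms used in the abstract, the following minimal Python sketch shows one bottom-up, sub-tree generalization step and a k-anonymity check phrased as a map phase and a reduce phase. It is not the algorithm from the paper: the taxonomy tree, the sample records, and the function names (`generalize_subtree`, `map_phase`, `reduce_phase`, `is_k_anonymous`) are all hypothetical, and the map/reduce steps run in a single process rather than on a MapReduce cluster.

```python
from collections import Counter

# Hypothetical taxonomy tree for a single attribute, stored as child -> parent.
PARENT = {
    "Engineer": "Professional",
    "Lawyer": "Professional",
    "Dancer": "Artist",
    "Writer": "Artist",
    "Professional": "Any",
    "Artist": "Any",
}

def generalize_subtree(values, parent_node):
    """Replace every direct child of parent_node with parent_node.

    In the sub-tree scheme, generalizing one child value forces all of
    its siblings up to the same parent, which is what this step does.
    """
    return [parent_node if PARENT.get(v) == parent_node else v for v in values]

def map_phase(values):
    """Emit one (value, 1) pair per record, as a mapper would."""
    return [(v, 1) for v in values]

def reduce_phase(pairs):
    """Sum the counts per value, as a reducer would."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts

def is_k_anonymous(values, k):
    """True if every distinct value is shared by at least k records."""
    counts = reduce_phase(map_phase(values))
    return all(n >= k for n in counts.values())

records = ["Engineer", "Lawyer", "Lawyer", "Dancer", "Writer", "Writer"]

# Bottom-up: start from the raw values and generalize until k = 2 holds.
step1 = generalize_subtree(records, "Professional")
step2 = generalize_subtree(step1, "Artist")
print(is_k_anonymous(records, 2), is_k_anonymous(step1, 2), is_k_anonymous(step2, 2))
```

In this toy data set the raw values and the first generalization both leave a group of size one, so two generalization steps are needed before the records are 2-anonymous. A top-down approach would instead start from the root value and specialize while the k-anonymity condition still holds.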

Original language: English
Title of host publication: 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom 2013)
Place of Publication: Los Alamitos
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 501-508
Number of pages: 8
ISBN (Electronic): 9780769550220
DOIs
Publication status: Published - 2013
Externally published: Yes
Event: 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2013 - Melbourne, VIC, Australia
Duration: 16 Jul 2013 → 18 Jul 2013

Publication series

Name: IEEE International Conference on Trust Security and Privacy in Computing and Communications
Publisher: IEEE
ISSN (Print): 2324-898X

Other

Other: 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2013
Country: Australia
City: Melbourne, VIC
Period: 16/07/13 → 18/07/13

Keywords

  • Big data
  • cloud computing
  • data anonymization
  • privacy preservation
  • MapReduce
