Hashing-based distributed multi-party blocking for privacy-preserving record linkage

Thilina Ranbaduge*, Dinusha Vatsalan, Peter Christen, Vassilios S. Verykios

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

13 Citations (Scopus)

Abstract

In many application domains, organizations require information from multiple sources to be integrated. Due to privacy and confidentiality concerns, these organizations are often unwilling or not allowed to reveal their sensitive and personal data to other database owners or to any external party. This has led to the emerging research discipline of privacy-preserving record linkage (PPRL). We propose a novel blocking approach for multi-party PPRL that efficiently and effectively prunes the record sets that are unlikely to match. Our approach allows each database owner to perform blocking independently, except for the initial agreement on parameter settings and a final central hashing-based clustering step. We provide an analysis of our technique in terms of complexity, quality, and privacy, and conduct an empirical study with large datasets. The results show that our approach scales with the size of the datasets and the number of parties, while providing better quality and privacy than previous multi-party private blocking approaches.
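To make the idea of hashing-based blocking more concrete, the following is a minimal sketch of how Bloom filter encodings and locality sensitive hashing can be combined to group similar records. It is not the authors' algorithm, whose details the abstract does not give. The function names, the parameter values (`size`, `num_hashes`, `num_tables`, `bits_per_key`), the use of character bigrams, and the bit-sampling (Hamming) LSH variant are all assumptions chosen for illustration. The shared `seed` stands in for the parameter settings that the parties agree on up front.

```python
import hashlib
import random


def qgrams(value, q=2):
    """Split a string into its set of overlapping character q-grams."""
    value = value.lower()
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bloom_filter(value, size=64, num_hashes=3):
    """Encode the q-grams of a value as a list of bits (a Bloom filter).

    Each q-gram sets up to num_hashes bit positions, derived from
    seeded SHA-256 digests (an illustrative choice of hash function).
    """
    bits = [0] * size
    for gram in qgrams(value):
        for seed in range(num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).hexdigest()
            bits[int(digest, 16) % size] = 1
    return bits


def lsh_keys(bits, num_tables=4, bits_per_key=8, seed=42):
    """Bit-sampling (Hamming) LSH: one block key per table.

    The shared seed makes every party sample the same bit positions,
    so similar Bloom filters are likely to collide on some key.
    """
    rng = random.Random(seed)
    keys = []
    for table in range(num_tables):
        positions = rng.sample(range(len(bits)), bits_per_key)
        keys.append((table, "".join(str(bits[p]) for p in positions)))
    return keys


def build_blocks(records):
    """Group record identifiers by the LSH keys of their encoded values."""
    blocks = {}
    for rec_id, value in records.items():
        for key in lsh_keys(bloom_filter(value)):
            blocks.setdefault(key, set()).add(rec_id)
    return blocks


if __name__ == "__main__":
    # Hypothetical records held by three different parties.
    records = {"A1": "smith", "B7": "Smith", "C3": "nguyen"}
    for key, ids in sorted(build_blocks(records).items()):
        print(key, sorted(ids))
```

In this sketch, each party could compute `bloom_filter` and `lsh_keys` locally, and only the resulting keys would need to be compared centrally. Only records that share at least one key would then be passed on as candidates for detailed comparison.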
Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining
Subtitle of host publication: 20th Pacific-Asia Conference, PAKDD 2016 Auckland, New Zealand, April 19–22, 2016 Proceedings, Part II
Editors: James Bailey, Latifur Khan, Takashi Washio, Gillian Dobbie, Joshua Zhexue Huang, Ruili Wang
Place of Publication: Cham, Switzerland
Publisher: Springer, Springer Nature
Pages: 415-427
Number of pages: 13
ISBN (Electronic): 9783319317502
ISBN (Print): 9783319317496
Publication status: Published - 2016
Externally published: Yes
Event: 20th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2016 - Auckland, New Zealand
Duration: 19 Apr 2016 to 22 Apr 2016

Conference

Conference: 20th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2016
Country/Territory: New Zealand
City: Auckland
Period: 19/04/16 to 22/04/16

Keywords

  • Locality sensitive hashing
  • Clustering
  • Bloom filters
