Abstract
In many application domains, organizations need to integrate information from multiple sources. Because of privacy and confidentiality concerns, these organizations are often unwilling or not allowed to reveal their sensitive and personal data to other database owners or to any external party. This has led to the emerging research discipline of privacy-preserving record linkage (PPRL). We propose a novel blocking approach for multi-party PPRL that efficiently and effectively prunes the record sets that are unlikely to match. Our approach allows each database owner to perform blocking independently, except for an initial agreement on parameter settings and a final central hashing-based clustering step. We provide an analysis of our technique in terms of complexity, quality, and privacy, and conduct an empirical study with large datasets. The results show that our approach scales with the size of the datasets and the number of parties, while providing better quality and privacy than previous multi-party private blocking approaches.
Original language | English |
---|---|
Title of host publication | Advances in Knowledge Discovery and Data Mining |
Subtitle of host publication | 20th Pacific-Asia Conference, PAKDD 2016, Auckland, New Zealand, April 19–22, 2016, Proceedings, Part II |
Editors | James Bailey, Latifur Khan, Takashi Washio, Gillian Dobbie, Joshua Zhexue Huang, Ruili Wang |
Place of Publication | Cham, Switzerland |
Publisher | Springer, Springer Nature |
Pages | 415-427 |
Number of pages | 13 |
ISBN (Electronic) | 9783319317502 |
ISBN (Print) | 9783319317496 |
Publication status | Published - 2016 |
Externally published | Yes |
Event | 20th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2016 - Auckland, New Zealand. Duration: 19 Apr 2016 → 22 Apr 2016 |
Conference
Conference | 20th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2016 |
---|---|
Country/Territory | New Zealand |
City | Auckland |
Period | 19/04/16 → 22/04/16 |
Keywords
- Locality sensitive hashing
- Clustering
- Bloom filters
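
As a rough illustration of how the keyword techniques can fit together in private blocking, the sketch below encodes a string attribute as a Bloom filter of character q-grams and then derives blocking keys with bit-sampling (Hamming) locality sensitive hashing. It is not the protocol described in the abstract: the central hashing-based clustering step is not shown, and every name and parameter here (`bloom_filter`, `lsh_keys`, `build_blocks`, the filter length, the shared secret, the seed) is a hypothetical choice made only for this example.

```python
import hashlib
import random
from collections import defaultdict


def qgrams(value, q=2):
    """Split a string into its set of overlapping character q-grams."""
    value = value.lower()
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bloom_filter(value, length=64, num_hashes=2, secret="shared-secret"):
    """Encode the q-grams of a value as a bit list using keyed hashing.

    The secret stands in for a parameter the parties would agree on up front
    (an assumption of this sketch).
    """
    bits = [0] * length
    for gram in qgrams(value):
        for k in range(num_hashes):
            digest = hashlib.sha256(f"{secret}|{k}|{gram}".encode()).hexdigest()
            bits[int(digest, 16) % length] = 1
    return bits


def lsh_keys(bits, num_tables=4, bits_per_key=8, seed=42):
    """Hamming LSH: sample the same bit positions per table for every record."""
    rng = random.Random(seed)  # shared seed, so all parties sample identically
    keys = []
    for table in range(num_tables):
        positions = rng.sample(range(len(bits)), bits_per_key)
        keys.append((table, "".join(str(bits[p]) for p in positions)))
    return keys


def build_blocks(records):
    """Group record identifiers that share at least one LSH key."""
    blocks = defaultdict(set)
    for rec_id, value in records.items():
        for key in lsh_keys(bloom_filter(value)):
            blocks[key].add(rec_id)
    return blocks


if __name__ == "__main__":
    blocks = build_blocks({"a1": "smith", "b1": "smyth", "c1": "nguyen"})
    for key, ids in sorted(blocks.items()):
        print(key, sorted(ids))
```

Records whose Bloom filters differ in only a few bit positions are likely to agree on at least one sampled key and so fall into a common block, while dissimilar records rarely do. Using more tables raises the chance that true matches share a block; using more bits per key makes each block smaller.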