TY - GEN
T1 - An overview of big data issues in privacy-preserving record linkage
AU - Vatsalan, Dinusha
AU - Karapiperis, Dimitrios
AU - Gkoulalas-Divanis, Aris
PY - 2019
Y1 - 2019
AB - Nearly 90% of today's data have been produced only in the last two years! These data come from a multitude of human activities, including social networking sites, mobile phone applications, electronic medical records systems, e-commerce sites, etc. Integrating and analyzing this wealth and volume of data offers remarkable opportunities in sectors that are of high interest to businesses, governments, and academia. Given that the majority of the data are proprietary and may contain personal or business sensitive information, Privacy-Preserving Record Linkage (PPRL) techniques are essential to perform data integration. In this paper, we review existing work in PPRL, focusing on the computational aspect of the proposed algorithms, which is crucial when dealing with Big data. We propose an analysis tool for the computational aspects of PPRL, and characterize existing PPRL techniques along five dimensions. Based on our analysis, we identify research gaps in current literature and promising directions for future work.
KW - Privacy-Preserving Record Linkage
KW - Entity resolution
UR - http://www.scopus.com/inward/record.url?scp=85065757150&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-19759-9_8
DO - 10.1007/978-3-030-19759-9_8
M3 - Conference proceeding contribution
SN - 9783030197582
T3 - Lecture Notes in Computer Science
SP - 118
EP - 136
BT - Algorithmic Aspects of Cloud Computing
A2 - Disser, Yann
A2 - Verykios, Vassilios S.
PB - Springer, Springer Nature
CY - Cham, Switzerland
T2 - 4th International Symposium on Algorithmic Aspects of Cloud Computing, ALGOCLOUD 2018
Y2 - 20 August 2018 through 21 August 2018
ER -