
Fairness-aware Privacy-Preserving Record Linkage

Dinusha Vatsalan*, Joyce Wang, Wilko Henecka, Brian Thorne

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding (conference proceeding contribution, peer-reviewed)

Abstract

Record linkage aims to identify records from different databases that correspond to the same real-world entity, while Privacy-Preserving Record Linkage (PPRL) conducts the linkage in a privacy-preserving context where private and sensitive information about individuals is not compromised. Linking records is treated as a classification task in which pairs of records from different databases are classified as matches (i.e. they refer to the same entity) or non-matches (i.e. they refer to different entities). In the absence of unique entity identifiers across databases, commonly available quasi-identifiers (QIDs), such as name, gender, address, and date of birth, are used to determine the linkage. The values in such QIDs are often prone to data errors and variations, making the linkage task challenging. Fairness in classification is an emerging concept that measures how far a classifier deviates from producing correct predictions with equal probability for individuals across protected groups defined by sensitive features (e.g. gender or race). Developing classifiers that are fair with respect to such sensitive features is an important problem for classification in general, and specifically for PPRL, to mitigate bias against sensitive and/or minority groups: for example, against the female group, which has a higher likelihood of variations in QIDs such as last name and address. While there has been increasing interest in this field, fairness specifically for PPRL has not yet been studied in the literature. Fairness for PPRL brings specific challenges and requirements. In this paper, we study fairness for PPRL classifiers, analyse appropriate fairness criteria/metrics for PPRL, study different forms of fairness bias in PPRL, and investigate the effectiveness of fairness-aware PPRL. Our experimental results on real and synthetically biased datasets show the efficacy and significance of incorporating fairness constraints into the linkage, leading to higher linkage quality in terms of both correctness and fairness.
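To make the fairness notion in the abstract concrete, the sketch below measures one commonly used fairness criterion for a binary linkage classifier: equality of opportunity, i.e. equal true-positive rates (the rate at which true matches are correctly classified) across protected groups. This is an illustrative example only, not the metric or method used in the paper; the toy record-pair labels, predictions, and group values are invented for the demonstration.

```python
# Illustrative sketch (not the paper's method): equality-of-opportunity
# gap for a record-linkage classifier. Each element represents one
# candidate record pair; y_true = 1 means the pair is a true match,
# y_pred is the classifier's decision, and groups holds the value of a
# sensitive feature (here a toy gender attribute) for the pair.

def true_positive_rate(y_true, y_pred):
    """Fraction of true matches that the classifier labels as matches."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    positives = sum(y_true)
    return tp / positives if positives else 0.0

def equal_opportunity_gap(y_true, y_pred, groups):
    """Absolute TPR difference between the two protected groups,
    plus the per-group TPRs."""
    tprs = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        tprs[g] = true_positive_rate([y_true[i] for i in idx],
                                     [y_pred[i] for i in idx])
    vals = list(tprs.values())
    return abs(vals[0] - vals[1]), tprs

# Toy data: the classifier misses more true matches in group 'F',
# mimicking the bias from higher QID variation described in the abstract.
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 0, 1]
groups = ['F', 'F', 'F', 'F', 'F', 'M', 'M', 'M', 'M', 'M']
y_pred = [1, 0, 0, 1, 0, 0, 1, 1, 0, 1]

gap, tprs = equal_opportunity_gap(y_true, y_pred, groups)
# TPR is 0.5 for group 'F' and 1.0 for group 'M', so the gap is 0.5.
```

A gap of zero would mean both groups have their true matches found at the same rate; a fairness-aware linkage classifier would aim to shrink this gap while keeping overall correctness high.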
Original language: English
Title of host publication: Data Privacy Management, Cryptocurrencies and Blockchain Technology
Subtitle of host publication: ESORICS 2020 International Workshops, DPM 2020 and CBT 2020, Guildford, UK, September 17–18, 2020, Revised Selected Papers
Editors: Joaquin Garcia-Alfaro, Guillermo Navarro-Arribas, Jordi Herrera-Joancomarti
Place of publication: Cham, Switzerland
Publisher: Springer, Springer Nature
Pages: 3-18
Number of pages: 16
ISBN (Electronic): 9783030661724
ISBN (Print): 9783030661717
DOIs
Publication status: Published - 2020
Externally published: Yes
Event: 15th Data Privacy Management International Workshop (DPM 2020) - Guildford, United Kingdom
Duration: 17 Sept 2020 – 18 Sept 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12484 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th Data Privacy Management International Workshop (DPM 2020)
Country/Territory: United Kingdom
City: Guildford
Period: 17/09/20 – 18/09/20

Keywords

  • Classification
  • Correctness
  • Entity resolution
  • Fairness
  • Privacy
