A weighted K-member clustering algorithm for K-anonymization

Yan Yan, Eyeleko Anselme Herman*, Adnan Mahmood, Tao Feng, Pengshou Xie

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

1 Citation (Scopus)

Abstract

As a representative model for privacy preserving data publishing, K-anonymity has raised a considerable number of questions for researchers over the past few decades. Chief among them are how to release data without sacrificing users’ privacy and how to maximize the availability of the published data, which together form the ultimate goal of privacy preserving data publishing. To enhance the clustering effect and reduce unnecessary computation, this paper proposes a weighted K-member clustering algorithm. A series of weight indicators is designed to evaluate the outlyingness of records, the distance between records, and the information loss of the published data. The proposed algorithm can reduce the influence of outliers on the clustering effect and preserve the availability of data as far as possible during the clustering process. Experimental analysis suggests that the proposed method yields lower information loss, improves the clustering effect, and is less sensitive to outliers than some existing methods.
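The abstract does not define the paper's weight indicators, so the following is only a minimal sketch of a generic greedy K-member clustering pass with per-attribute weights, not the authors' algorithm. The function names, the range-normalised L1 distance, the spread-based information loss, and the seed-selection rule are all assumptions made for illustration, and the outlyingness weighting described in the abstract is not modelled.

```python
from typing import List, Sequence

Record = Sequence[float]


def weighted_distance(a: Record, b: Record,
                      weights: Sequence[float],
                      ranges: Sequence[float]) -> float:
    """Weighted, range-normalised L1 distance (an assumed metric)."""
    return sum(w * abs(x - y) / r
               for x, y, w, r in zip(a, b, weights, ranges))


def information_loss(cluster: List[Record],
                     weights: Sequence[float],
                     ranges: Sequence[float]) -> float:
    """Cluster size times the weighted, normalised spread of each attribute.

    This is an assumed loss measure for numeric quasi-identifiers only.
    """
    if not cluster:
        return 0.0
    spread = 0.0
    for j, (w, r) in enumerate(zip(weights, ranges)):
        values = [rec[j] for rec in cluster]
        spread += w * (max(values) - min(values)) / r
    return len(cluster) * spread


def k_member_clustering(records: List[Record], k: int,
                        weights: Sequence[float]) -> List[List[int]]:
    """Greedy clustering in which every cluster holds at least k records.

    Returns clusters as lists of indices into `records`.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")

    # Attribute ranges over the whole table; constant columns fall back to 1.0.
    ranges = [(max(col) - min(col)) or 1.0 for col in zip(*records)]

    def loss(indices: List[int]) -> float:
        return information_loss([records[i] for i in indices], weights, ranges)

    remaining = list(range(len(records)))
    clusters: List[List[int]] = []
    seed = remaining[0]

    while len(remaining) >= k:
        # Next seed: the remaining record furthest from the previous seed.
        seed = max(remaining,
                   key=lambda i: weighted_distance(records[i], records[seed],
                                                   weights, ranges))
        remaining.remove(seed)
        cluster = [seed]
        # Grow the cluster with whichever record adds the least loss.
        while len(cluster) < k:
            best = min(remaining, key=lambda i: loss(cluster + [i]))
            remaining.remove(best)
            cluster.append(best)
        clusters.append(cluster)

    # Fewer than k records are left: attach each to the cluster whose
    # information loss grows the least.
    for i in remaining:
        target = min(clusters, key=lambda c: loss(c + [i]) - loss(c))
        target.append(i)

    return clusters
```

Calling `k_member_clustering(records, k=3, weights=(1.0, 1.0))` on a list of two-attribute numeric tuples returns index groups of size at least 3. Generalising each group to its per-attribute minimum and maximum would then give a K-anonymous release under these assumptions.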

Original language: English
Number of pages: 23
Journal: Computing
Early online date: 20 Feb 2021
Publication status: E-pub ahead of print - 20 Feb 2021

Bibliographical note

Funding Information:
This research is supported by the National Natural Science Foundation of China (Nos. 61762059, 61762060, and 61862040).

Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH, AT part of Springer Nature.

Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.

Keywords

  • Clustering
  • Information loss
  • K-anonymity
  • Outliers
  • Privacy preserving data publishing
