As a representative model for privacy-preserving data publishing, K-anonymity has raised a considerable number of research questions over the past few decades. Chief among them is the central goal of privacy-preserving data publishing: releasing data without sacrificing users' privacy while maximizing the availability of the published data. To improve clustering quality and reduce unnecessary computation, this paper proposes a weighted K-member clustering algorithm. A series of weight indicators is designed to evaluate the outlyingness of records, the distance between records, and the information loss of the published data. The proposed algorithm reduces the influence of outliers on the clustering result and preserves data availability as far as possible during clustering. Experimental analysis suggests that, compared with some existing methods, the proposed method incurs lower information loss, improves clustering quality, and is less sensitive to outliers.
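For readers unfamiliar with K-member clustering, the following is a minimal sketch of the generic greedy scheme on numeric records. It is not the algorithm proposed in the paper. The function names, the range-normalized distance, the range-based information loss, and the per-attribute `weights` vector are all illustrative assumptions, and the paper's outlyingness indicator is not modeled here.

```python
from typing import List, Optional, Sequence

Record = Sequence[float]


def attribute_ranges(records: Sequence[Record]) -> List[float]:
    """Range of each attribute over the whole table (1.0 if constant)."""
    return [(max(col) - min(col)) or 1.0 for col in zip(*records)]


def distance(a: Record, b: Record, ranges, weights) -> float:
    """Weighted, range-normalized L1 distance (illustrative choice)."""
    return sum(w * abs(x - y) / r for x, y, r, w in zip(a, b, ranges, weights))


def information_loss(cluster: Sequence[Record], ranges, weights) -> float:
    """Cluster size times the weighted normalized spread of each attribute."""
    if not cluster:
        return 0.0
    spread = sum(
        w * (max(col) - min(col)) / r
        for col, r, w in zip(zip(*cluster), ranges, weights)
    )
    return len(cluster) * spread


def k_member_clustering(
    records: Sequence[Record], k: int, weights: Optional[Sequence[float]] = None
) -> List[List[int]]:
    """Greedy K-member clustering; returns clusters as lists of record indices."""
    if k < 1 or len(records) < k:
        raise ValueError("need 1 <= k <= number of records")
    ranges = attribute_ranges(records)
    weights = list(weights) if weights is not None else [1.0] * len(ranges)

    def loss(indices: List[int]) -> float:
        return information_loss([records[j] for j in indices], ranges, weights)

    remaining = list(range(len(records)))
    clusters: List[List[int]] = []
    seed = remaining[0]
    while len(remaining) >= k:
        # New seed: the remaining record furthest from the previous seed.
        seed = max(
            remaining,
            key=lambda i: distance(records[i], records[seed], ranges, weights),
        )
        remaining.remove(seed)
        cluster = [seed]
        while len(cluster) < k:
            # Add the record that yields the smallest information loss.
            best = min(remaining, key=lambda i: loss(cluster + [i]))
            remaining.remove(best)
            cluster.append(best)
        clusters.append(cluster)
    # Fewer than k records are left: attach each to the cheapest cluster.
    for i in remaining:
        target = min(clusters, key=lambda c: loss(c + [i]) - loss(c))
        target.append(i)
    return clusters
```

Every returned cluster holds at least `k` records, so generalizing each cluster to its attribute ranges gives a K-anonymous table under these assumptions.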
Bibliographical note
Funding Information:
This research is supported by the National Natural Science Foundation of China (Nos. 61762059, 61762060, and 61862040).
© 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH, AT part of Springer Nature.
- Information loss
- Privacy preserving data publishing