As a representative model for privacy-preserving data publishing, K-anonymity has raised a considerable number of questions for researchers over the past few decades. The ultimate goal of privacy-preserving data publishing is to release data without sacrificing users’ privacy while maximizing the availability of the published data. To enhance the clustering effect and reduce unnecessary computation, this paper proposes a weighted K-member clustering algorithm. A series of weight indicators is designed to evaluate the outlyingness of records, the distance between records, and the information loss of the published data. The proposed algorithm reduces the influence of outliers on the clustering effect and maintains the availability of data to the best possible extent during the clustering process. Experimental analysis suggests that the proposed method generates lower information loss, improves the clustering effect, and is less sensitive to outliers than some existing methods.
- Privacy preserving data publishing
- Information loss
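
For orientation, the general K-member clustering idea that the abstract builds on can be sketched as follows. This is a minimal, unweighted baseline and not the weighted algorithm proposed in this paper: the weight indicators for outlyingness, distance, and information loss are not specified in the abstract and are therefore omitted. The function names `info_loss` and `k_member_clustering`, the restriction to numeric attributes, the range-based loss measure, and the choice of the first remaining record as the seed are all assumptions made for illustration.

```python
from typing import List, Sequence

Record = Sequence[float]


def info_loss(cluster: List[Record], ranges: List[float]) -> float:
    """Range-based information loss of one cluster (illustrative measure).

    Each attribute contributes (max - min) / domain_range, and the sum is
    scaled by the cluster size, since every record in the cluster would be
    generalized to the same interval.
    """
    if not cluster:
        return 0.0
    total = 0.0
    for j, domain_range in enumerate(ranges):
        if domain_range > 0:  # skip attributes that are constant in the data
            values = [rec[j] for rec in cluster]
            total += (max(values) - min(values)) / domain_range
    return len(cluster) * total


def k_member_clustering(records: List[Record], k: int) -> List[List[Record]]:
    """Greedy, unweighted K-member clustering: every cluster gets >= k records."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")

    dims = len(records[0])
    # Domain range of each attribute over the whole data set.
    ranges = [
        max(r[j] for r in records) - min(r[j] for r in records)
        for j in range(dims)
    ]

    remaining = list(records)
    clusters: List[List[Record]] = []

    while len(remaining) >= k:
        # Assumption: seed with the first remaining record for simplicity.
        cluster = [remaining.pop(0)]
        while len(cluster) < k:
            # Add the record that keeps the cluster's information loss lowest.
            best = min(
                range(len(remaining)),
                key=lambda i: info_loss(cluster + [remaining[i]], ranges),
            )
            cluster.append(remaining.pop(best))
        clusters.append(cluster)

    # Fewer than k records are left: attach each one to the cluster whose
    # information loss grows the least.
    for rec in remaining:
        target = min(
            clusters,
            key=lambda c: info_loss(c + [rec], ranges) - info_loss(c, ranges),
        )
        target.append(rec)

    return clusters
```

Calling `k_member_clustering(records, k)` on a list of numeric tuples returns groups of at least `k` records each. Generalizing every group to its per-attribute value range would then yield a K-anonymous release under this simplified model. Because the greedy step treats all records alike, a baseline of this kind can be distorted by outliers, which is the sensitivity the weighted indicators described above are intended to reduce.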