Sequence data matching and beyond: new privacy-preserving primitives based on Bloom filters

Wanli Xue*, Dinusha Vatsalan, Wen Hu, Aruna Seneviratne

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

27 Citations (Scopus)

Abstract

Bloom filter encoding has been widely used as an efficient masking technique for privacy-preserving matching functions. Existing matching techniques, however, are limited to relatively simple data types such as string, categorical and signal numerical values. In this paper, we propose a new scheme that significantly extends the class of matching primitives based on the privacy-preserving Bloom filter mechanism. These primitives include sequence data matching and popular distance-based machine learning algorithms such as KNN and SVM. Our scheme hash-maps a sequence data vector into the Bloom filter space, adding a timestamp (bit) for each element in the data, which is represented together with its neighboring values, so that the similarity of data points can be checked efficiently with negligible utility loss. Furthermore, it applies a Laplace-like perturbation method to the constructed Bloom filters to address the weakness of determinism introduced by encoding techniques. As a result, the proposed work guarantees that private data records are difficult to distinguish, owing to collisions and differential privacy. The experimental results on three real-scenario based datasets illustrate that our method achieves a significantly better trade-off between utility and privacy than the state-of-the-art differential privacy-based method, which adds Laplace noise to the data directly.
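The general idea of position-aware Bloom filter encoding can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the parameters (`num_bits`, `num_hashes`, `neighbourhood`), the use of SHA-256, the restriction to integer values, and the Dice coefficient as similarity measure are all assumptions made for illustration. In particular, the random bit flipping shown here is only a simple stand-in for the Laplace-like perturbation described in the abstract.

```python
import hashlib
import random


def encode_sequence(values, num_bits=256, num_hashes=4, neighbourhood=1):
    """Map a sequence of integers to a Bloom filter (a list of 0/1 bits).

    Hypothetical encoding: each element is represented by itself and its
    neighbouring integer values, tagged with its position in the sequence.
    """
    bits = [0] * num_bits
    for position, value in enumerate(values):
        for neighbour in range(value - neighbourhood, value + neighbourhood + 1):
            token = f"{position}:{neighbour}"
            # Derive num_hashes bit positions from seeded SHA-256 digests.
            for seed in range(num_hashes):
                digest = hashlib.sha256(f"{seed}|{token}".encode()).digest()
                bits[int.from_bytes(digest[:8], "big") % num_bits] = 1
    return bits


def dice_similarity(a, b):
    """Dice coefficient between two equal-length bit lists."""
    common = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2 * common / total if total else 1.0


def flip_bits(bits, flip_probability, rng):
    """Flip each bit independently (illustrative stand-in for perturbation)."""
    return [bit ^ 1 if rng.random() < flip_probability else bit for bit in bits]


# Two similar sequences and one unrelated sequence (made-up values).
similar_a = encode_sequence([10, 12, 15, 11])
similar_b = encode_sequence([10, 13, 15, 11])
unrelated = encode_sequence([40, 2, 33, 25])

print(dice_similarity(similar_a, similar_b))
print(dice_similarity(similar_a, unrelated))

# Perturb a filter before release; a higher flip probability hides more
# of the deterministic encoding but also lowers matching utility.
noisy_a = flip_bits(similar_a, 0.05, random.Random(0))
print(dice_similarity(noisy_a, similar_b))
```

Because neighbouring values are hashed alongside each element, sequences that differ slightly still share most of their set bits, so the similarity of the encoded filters tracks the closeness of the underlying sequences without the raw values being exchanged.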

Original language: English
Pages (from-to): 2973-2987
Number of pages: 15
Journal: IEEE Transactions on Information Forensics and Security
Volume: 15
Early online date: 16 Mar 2020
Publication status: Published - 2020
Externally published: Yes

Keywords

  • Sequence data matching
  • Bloom filter
  • differential privacy
  • encoding
  • privacy-preserving data publishing
