TY - GEN
T1 - Not one but many tradeoffs: privacy vs. utility in differentially private machine learning
T2 - 11th ACM SIGSAC Conference on Cloud Computing Security Workshop, CCSW 2020
AU - Zhao, Benjamin Zi Hao
AU - Kaafar, Mohamed Ali
AU - Kourtellis, Nicolas
PY - 2020
Y1 - 2020
AB - Data holders are increasingly seeking to protect their users' privacy while still maximizing their ability to produce machine learning (ML) models with high-quality predictions. In this work, we empirically evaluate various implementations of differential privacy (DP) and measure their ability to fend off real-world privacy attacks, in addition to measuring their core goal of providing accurate classifications. We establish an evaluation framework to ensure each of these implementations is evaluated fairly. The DP implementations we select add noise at different positions within the framework: at the point of data collection/release, during model training updates, or after training by perturbing learned model parameters. We evaluate each implementation across a range of privacy budgets and datasets, with each implementation providing the same mathematical privacy guarantees. By measuring the models' resistance to real-world membership and attribute inference attacks, together with their classification accuracy, we determine which implementations provide the most desirable tradeoff between privacy and utility. We found that the number of classes in a given dataset is unlikely to influence where the privacy-utility tradeoff occurs, a counter-intuitive finding given the known relationship between a larger number of classes and increased privacy vulnerability. Additionally, when strong privacy constraints are required, perturbing the input training data before ML modeling sacrifices less utility than adding noise later in the ML process.
KW - attribute inference attack
KW - differential privacy
KW - machine learning
KW - membership inference attack
KW - privacy
KW - privacy attack
KW - tradeoff
KW - utility
UR - http://www.scopus.com/inward/record.url?scp=85097421931&partnerID=8YFLogxK
U2 - 10.1145/3411495.3421352
DO - 10.1145/3411495.3421352
M3 - Conference proceeding contribution
AN - SCOPUS:85097421931
T3 - CCSW 2020 - Proceedings of the 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop
SP - 15
EP - 26
BT - CCSW 2020 - Proceedings of the 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop
PB - Association for Computing Machinery, Inc
CY - New York, NY
Y2 - 9 November 2020 through 9 November 2020
ER -