ACM Home Page
Please provide us with feedback. Feedback
Attacks on privacy and deFinetti's theorem
Full text PdfPdf (478 KB)
Source
International Conference on Management of Data archive
Proceedings of the 35th SIGMOD international conference on Management of data table of contents
Providence, Rhode Island, USA
SESSION: Research session 4: security II table of contents
Pages 127-138  
Year of Publication: 2009
ISBN:978-1-60558-551-2
Author
Daniel Kifer  The Pennsylvania State University, University Park, PA, USA
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 45,   Downloads (12 Months): 208,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1559845.1559861
What is a DOI?

ABSTRACT

In this paper we present a method for reasoning about privacy using the concepts of exchangeability and deFinetti's theorem. We illustrate the usefulness of this technique by using it to attack a popular data sanitization scheme known as Anatomy. We stress that Anatomy is not the only sanitization scheme that is vulnerable to this attack. In fact, any scheme that uses the random worlds model, i.i.d. model, or tuple-independent model needs to be re-evaluated.

The difference between the attack presented here and others that have been proposedin the past is that we do not need extensive background knowledge. An attacker only needs to know the nonsensitive attributes of one individual in the data, and can carry out this attack just by building a machine learning model over the sanitized data. The reason this attack is successful is that it exploits a subtle flaw in the way prior work computed the probability of disclosure of a sensitive attribute. We demonstrate this theoretically, empirically, and with intuitive examples. We also discuss how this generalizes to many other privacy schemes.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
 
3
A. Asuncion and D.J. Newman. UCI machine learning repository, 2007.
 
4
F. Bacchus, A. J. Grove, J. Y. Halpern, and D. Koller. From statistics to beliefs. In AAAI, 1992.
5
 
6
Michael Barbaro and Tom Zeller. A face is exposed for AOL searcher no. 4417749. New York Times, August 9 2006.
 
7
 
8
Yvonne M. Bishop, Stephen E. Fienberg, and Paul W. Holland. Discrete Multivariate Analysis: Theory and Practice. Springer, 2007.
 
9
 
10
 
11
 
12
Nando de Freitas and Hendrik Kück. Learning about individuals from group statistics. In UAI, pages 332--339, 2005.
 
13
A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the em algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1--38, 1977.
 
14
Persi Diaconis and David Freedman. De finetti's generalizations of exchangeability. In Richard C. Jeffrey, editor, Studies in Inductive Logic and Probability, Volume II, pages 233--249. University of California Press, 1980.
 
15
16
 
17
Cynthia Dwork. Differential privacy. In ICALP, 2006.
 
18
19
 
20
 
21
Srivatsava Ranjit Ganta, Shiva Prasad Kasiviswanathan, and Adam Smith. Composition attacks and auxiliary information in data privacy. In KDD, 2008.
 
22
 
23
24
 
25
George H. John and Pat Langley. Estimating continuous distributions in bayesian classifiers. In UAI, 1995.
 
26
27
 
28
29
30
31
 
32
 
33
N. Li, T. Li, and S. Venkatasubramanian. t-Closeness: Privacy beyond k-anonymity and l-diversity. In ICDE, 2007.
 
34
35
 
36
David Martin, Daniel Kifer, Ashwin Machanavajjhala, Johannes Gehrke, and Joseph Halpern. Worst case background knowledge for privacy preserving data publishing. In ICDE, 2007.
37
38
 
39
Arvind Narayanan and Vitaly Shmatikov. How to break anonymity of the net ix prize dataset. http://arxiv.org/abs/cs/0610105, 2006.
 
40
41
 
42
 
43
Jason Rennie, Lawrence Shih, Jaime Teevan, and David R. Karger. Tackling the poor assumptions of naive bayes text classifiers. In ICML, 2003.
 
44
 
45
Mark J. Schervish. Theory of Statistics. Springer, 1995.
 
46
 
47
L. Valiant. The complexity of enumeration and reliability problems. SIAM Journal on Computing, 1979.
 
48
 
49
 
50