ACM Home Page
Please provide us with feedback. Feedback
Composition attacks and auxiliary information in data privacy
Full text PdfPdf (530 KB)
Source
International Conference on Knowledge Discovery and Data Mining archive
Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining table of contents
Las Vegas, Nevada, USA
SESSION: Research papers table of contents
Pages 265-273  
Year of Publication: 2008
ISBN:978-1-60558-193-4
Authors
Srivatsava Ranjit Ganta  Pennsylvania State University, University Park, PA, USA
Shiva Prasad Kasiviswanathan  Pennsylvania State University, University Park, PA, USA
Adam Smith  Pennsylvania State University, University Park, PA, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 23,   Downloads (12 Months): 318,   Citation Count: 3
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1401890.1401926
What is a DOI?

ABSTRACT

Privacy is an increasingly important aspect of data publishing. Reasoning about privacy, however, is fraught with pitfalls. One of the most significant is the auxiliary information (also called external knowledge, background knowledge, or side information) that an adversary gleans from other channels such as the web, public records, or domain knowledge. This paper explores how one can reason about privacy in the face of rich, realistic sources of auxiliary information. Specifically, we investigate the effectiveness of current anonymization schemes in preserving privacy when multiple organizations independently release anonymized data about overlapping populations.

1. We investigate composition attacks, in which an adversary uses independent anonymized releases to breach privacy. We explain why recently proposed models of limited auxiliary information fail to capture composition attacks. Our experiments demonstrate that even a simple instance of a composition attack can breach privacy in practice, for a large class of currently proposed techniques. The class includes k-anonymity and several recent variants.

2. On a more positive note, certain randomization-based notions of privacy (such as differential privacy) provably resist composition attacks and, in fact, the use of arbitrary side information.This resistance enables "stand-alone" design of anonymization schemes, without the need for explicitly keeping track of other releases.

We provide a precise formulation of this property, and prove that an important class of relaxations of differential privacy also satisfy the property. This significantly enlarges the class of protocols known to enable modular design.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
J. Official Statistics, special issue on disclosure limitation methods, Vol. 14, No. 4, 1998.
2
 
3
4
 
5
M. Barbaro and T. Zeller. A face is exposed for AOL searcher no. 4417749. The New York Times, Aug. 2006.
6
7
 
8
J.-W. Byun, Y. Sohn, E. Bertino, and N. Li. Secure anonymization for incremental datasets. In Secure Data Management, pages 48--63. Springer, 2006.
 
9
K. Chaudhuri and N. Mishra. When random sampling preserves privacy. In CRYPTO, pages 198--213. Springer, 2006.
 
10
11
 
12
 
13
C. Dwork. Differential privacy. In ICALP, pages 1--12. Springer, 2006.
 
14
C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, pages 486--503. Springer, 2006.
 
15
C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, pages 486--503. Springer, 2006.
 
16
C. Dwork and K. Nissim. Privacy-preserving datamining on vertically partitioned databases. In CRYPTO, pages 528--544. Springer, 2004.
17
18
19
 
20
S. R. Ganta, S. P. Kasiviswanathan, and A. Smith. Composition attacks and auxiliary information in data privacy. CoRR, arXiv:0803.0032v1 {cs.DB}, 2008.
 
21
S. Goldwasser and S. Micali. Probabilistic encryption. Journal of Computer and System Sciences, 28(2):270--299, 1984.
 
22
S. P. Kasiviswanathan and A. Smith. A note on differential privacy: Defining resistance to arbitrary side information. CoRR, arXiv:0803.39461 {cs.CR}, 2008.
 
23
24
 
25
N. Li, T. Li, and S. Venkatasubramanian. t-closeness: Privacy beyond k-anonymity and l-diversity. In ICDE, pages 106--115. IEEE Computer Society, 2007.
 
26
 
27
A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, and L. Vilhuber. Privacy: From theory to practice on the map. In ICDE, pages 277--286. IEEE Computer Society, 2008.
28
 
29
D. J. Martin, D. Kifer, A. Machanavajjhala, J. Gehrke, and J. Y. Halpern. Worst-case background knowledge for privacy-preserving data publishing. In ICDE, pages 126--135. IEEE Computer Society, 2007.
 
30
31
 
32
 
33
 
34
UCI machine learning repository, http://www.ics.uci.edu/ mlearn/databases/.
 
35
A. van den Hout and P. van der Heijden. Randomized response, statistical disclosure control and misclassification: A review. International Statistical Review, 70:269--288, 2002.
36
 
37
S. L. Warner. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60(309):63--69, 1965.
 
38
 
39
40
41
 
42


Collaborative Colleagues:
Srivatsava Ranjit Ganta: colleagues
Shiva Prasad Kasiviswanathan: colleagues
Adam Smith: colleagues