ACM Home Page
Please provide us with feedback. Feedback
Digital Library logoTake a look at the new version of this page: [ beta version ]. Tell us what you think.
Fast data anonymization with low information loss
Full text PdfPdf (704 KB)
Source
Very Large Data Bases archive
Proceedings of the 33rd international conference on Very large data bases table of contents
Vienna, Austria
SESSION: Research sessions: data privacy, anonymization, and outsourcing table of contents
Pages: 758-769  
Year of Publication: 2007
ISBN:978-1-59593-649-3
Authors
Gabriel Ghinita  National University of Singapor
Panagiotis Karras  University of Hong Kong
Panos Kalnis  National University of Singapor
Nikos Mamoulis  University of Hong Kong
Sponsors
ORACLE : ORACLE
: HP invent
: Yahoo! Research
IBM : IBM
: Intel
: Connex.cc
: SAP
Microsoft Research : Microsoft Research
: WKO
: Google
Publisher
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 139,   Citation Count: 10
Additional Information:

abstract   references   cited by   collaborative colleagues  

Tools and Actions: Review this Article  

Warning: The download time has expired please click on the item to try again.


ABSTRACT

Recent research studied the problem of publishing microdata without revealing sensitive information, leading to the privacy preserving paradigms of k-anonymity and l-diversity. k-anonymity protects against the identification of an individual's record. l-diversity, in addition, safeguards against the association of an individual with specific sensitive information. However, existing approaches suffer from at least one of the following drawbacks: (i) The information loss metrics are counter-intuitive and fail to capture data inaccuracies inflicted for the sake of privacy. (ii) l-diversity is solved by techniques developed for the simpler k-anonymity problem, which introduces unnecessary inaccuracies. (iii) The anonymization process is inefficient in terms of computation and I/O cost.

In this paper we propose a framework for efficient privacy preservation that addresses these deficiencies. First, we focus on one-dimensional (i.e., single attribute) quasi-identifiers, and study the properties of optimal solutions for k-anonymity and l-diversity, based on meaningful information loss metrics. Guided by these properties, we develop efficient heuristics to solve the one-dimensional problems in linear time. Finally, we generalize our solutions to multi-dimensional quasi-identifiers using space-mapping techniques. Extensive experimental evaluation shows that our techniques clearly outperform the state-of-the-art, in terms of execution time and information loss.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
G. Aggarwal, T. Feder, K. Kenthapadi, R. Motwani, R. Panigrahy, D. Thomas, and A. Zhu. Approximation Algorithms for k-Anonymity. Journal of Privacy Technology, (Paper number: 20051120001), 2005.
 
3
4
 
5
A. Froomkin. The Death of Privacy. Stanford Law Review, 52(5):1461--1543, 2000.
6
7
8
9
 
10
11
 
12
N. Li, T. Li, and S. Venkatasubramanian. t-Closeness: Privacy Beyond k-Anonymity and l-Diversity. In Proc. of ICDE, pp 106--115, 2007.
 
13
14
 
15
16
 
17
18
 
19
R. Wong, A. Fu, J. Pei, K. Wang, S. Wan, and C. Lo. Multidimensional k-anonymization by Linear Clustering Using Space-filling Curves. TR 2006-27, Simon Fraser University, March 2006.
 
20
21
 
22
Q. Zhang, N. Koudas, D. Srivastava, and T. Yu. Aggregate Query Answering on Anonymized Tables. In Proc. of ICDE, pp 116--125, 2007.
23

CITED BY  10
Collaborative Colleagues:
Gabriel Ghinita: colleagues
Panagiotis Karras: colleagues
Panos Kalnis: colleagues
Nikos Mamoulis: colleagues