Privacy-preserving data mining
Source: International Conference on Management of Data
Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data
Dallas, Texas, United States
Pages: 439-450
Year of Publication: 2000
ISBN: 1-58113-217-4
Authors
Rakesh Agrawal, IBM Almaden Research Center, 650 Harry Road, San Jose, CA
Ramakrishnan Srikant, IBM Almaden Research Center, 650 Harry Road, San Jose, CA
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/342009.335438

ABSTRACT

A fruitful direction for future data mining research will be the development of techniques that incorporate privacy concerns. Specifically, we address the following question. Since the primary task in data mining is the development of models about aggregated data, can we develop accurate models without access to precise information in individual data records? We consider the concrete case of building a decision-tree classifier from training data in which the values of individual records have been perturbed. The resulting data records look very different from the original records and the distribution of data values is also very different from the original distribution. While it is not possible to accurately estimate original values in individual data records, we propose a novel reconstruction procedure to accurately estimate the distribution of original data values. By using these reconstructed distributions, we are able to build classifiers whose accuracy is comparable to the accuracy of classifiers built with the original data.
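
The idea in the abstract, perturbing individual values and then recovering only their distribution, can be illustrated with a minimal sketch. The abstract does not spell out the reconstruction procedure, so everything below is an assumption made for illustration: the noise is taken to be additive Gaussian with a known standard deviation, the distribution is estimated over equal-width bins with an iterative Bayesian reweighting, and the names `perturb` and `reconstruct` are hypothetical. It should not be read as the authors' exact algorithm.

```python
import math
import random


def perturb(values, noise_sd, rng):
    """Return each value plus independent Gaussian noise (assumed noise model)."""
    return [v + rng.gauss(0.0, noise_sd) for v in values]


def gaussian_pdf(x, sd):
    """Density of a zero-mean Gaussian with standard deviation sd at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def reconstruct(perturbed, noise_sd, lo, hi, bins=20, iterations=30):
    """Estimate the distribution of the original values over bins on [lo, hi].

    Starts from a uniform guess and repeatedly reweights each bin by how well
    it explains the perturbed observations under the known noise density.
    """
    width = (hi - lo) / bins
    mids = [lo + (k + 0.5) * width for k in range(bins)]
    probs = [1.0 / bins] * bins
    # like[i][k]: density of observation i if the true value were at midpoint k.
    like = [[gaussian_pdf(w - m, noise_sd) for m in mids] for w in perturbed]
    for _ in range(iterations):
        new = [0.0] * bins
        for row in like:
            denom = sum(l * p for l, p in zip(row, probs))
            if denom == 0.0:
                continue  # observation too far from every bin to be informative
            for k in range(bins):
                new[k] += row[k] * probs[k] / denom
        total = sum(new)
        probs = [x / total for x in new]
    return mids, probs


# Illustrative data only: two clusters of values, then noisy copies of them.
rng = random.Random(0)
original = ([rng.uniform(20, 40) for _ in range(500)]
            + [rng.uniform(60, 80) for _ in range(500)])
noisy = perturb(original, noise_sd=10.0, rng=rng)
mids, probs = reconstruct(noisy, noise_sd=10.0, lo=0.0, hi=100.0)
for m, p in zip(mids, probs):
    print(f"{m:5.1f}  {p:.3f}")
```

Only the perturbed values and the noise parameters are passed to `reconstruct`; the original values are never consulted, which mirrors the setting described in the abstract. A classifier would then be trained against the estimated distribution rather than against individual records.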


