Anonymizing transaction databases for publication
Source
International Conference on Knowledge Discovery and Data Mining
Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Las Vegas, Nevada, USA
SESSION: Research papers
Pages 767-775
Year of Publication: 2008
ISBN: 978-1-60558-193-4
Authors
Yabo Xu, Simon Fraser University, Burnaby, BC, Canada
Ke Wang, Simon Fraser University, Burnaby, BC, Canada
Ada Wai-Chee Fu, The Chinese University of Hong Kong, Hong Kong, Hong Kong
Philip S. Yu, University of Illinois at Chicago, Chicago, IL, USA
Bibliometrics
Downloads (6 Weeks): 20, Downloads (12 Months): 296, Citation Count: 3
ABSTRACT
This paper considers the problem of publishing "transaction data" for research purposes. Each transaction is an arbitrary set of items chosen from a large universe. Detailed transaction data provides an electronic image of one's life, which has two implications. First, transaction data is an excellent candidate for data mining research. Second, the use of transaction data raises serious concerns over individual privacy. Therefore, before transaction data is released for data mining, it must be anonymized so that data subjects cannot be re-identified. The challenge is that transaction data has no structure and can be extremely high dimensional, so traditional anonymization methods lose too much information on such data. To date, no satisfactory privacy notion or solution has been proposed for anonymizing transaction data. This paper proposes one way to address this issue.
CITED BY 3
Graham Cormode, Divesh Srivastava, Anonymized data: generation, models, usage, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA