ABSTRACT
Privacy and security concerns can prevent sharing of data, derailing data-mining projects. Distributed knowledge discovery, if done correctly, can alleviate this problem. We introduce a generalized privacy-preserving variant of the ID3 algorithm for vertically partitioned data distributed over two or more parties. Along with a proof of security, we discuss what would be necessary to make the protocols completely secure. We also provide experimental results, giving a first demonstration of the practical complexity of secure multiparty computation-based data mining.
REVIEW
"Richard CHBEIR : Reviewer"
Iterative dichotomiser 3 (ID3) is a classification algorithm that uses a fixed set of examples to build a decision tree. This paper presents an interesting variant of the ID3 algorithm that can be used to classify vertically partitioned data while