Privacy-preserving SVM using nonlinear kernels on horizontally partitioned data
Source: Symposium on Applied Computing
Proceedings of the 2006 ACM symposium on Applied computing
Dijon, France
Session: Data mining (DM)
Pages: 603 - 610
Year of Publication: 2006
ISBN:1-59593-108-2
Authors
Hwanjo Yu, University of Iowa, Iowa City
Xiaoqian Jiang, University of Iowa, Iowa City
Jaideep Vaidya, Rutgers University, Newark
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1141277.1141415

ABSTRACT

Traditional Data Mining and Knowledge Discovery algorithms assume free access to data, either at a centralized location or in federated form. Increasingly, privacy and security concerns restrict this access, thus derailing data mining projects. What we need is distributed knowledge discovery that is sensitive to this problem. The key is to obtain valid results, while providing guarantees on the non-disclosure of data. Support vector machine classification is one of the most widely used classification methodologies in data mining and machine learning. It is based on solid theoretical foundations and has wide practical application. This paper proposes a privacy-preserving solution for support vector machine (SVM) classification, PP-SVM for short. Our solution constructs the global SVM classification model from the data distributed at multiple parties, without disclosing the data of each party to others. We assume that data is horizontally partitioned -- each party collects the same features of information for different data objects. We quantify the security and efficiency of the proposed method, and highlight future challenges.
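
To make the horizontally partitioned setting concrete, the following minimal Python sketch shows how a global kernel matrix splits into blocks when two parties hold different data objects over the same features. It is an illustration only and not the PP-SVM protocol: everything is computed in the clear, and the party names, toy data, RBF kernel choice, and `gamma` value are all assumptions introduced here. The point is that the diagonal blocks can be computed locally by each party, while the cross-party block is the part that a privacy-preserving method would have to obtain without exposing either party's rows.

```python
import math


def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma=0.5 is an arbitrary illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram(rows_a, rows_b, kernel=rbf):
    """Kernel matrix between two lists of feature vectors."""
    return [[kernel(x, y) for y in rows_b] for x in rows_a]


# Hypothetical toy data: both parties record the same two features,
# but for different data objects (horizontal partitioning).
party_a = [(0.0, 1.0), (1.0, 1.0)]
party_b = [(2.0, 0.0), (0.5, 0.5), (1.5, 2.0)]

# Diagonal blocks: each party can compute these from its own rows alone.
k_aa = gram(party_a, party_a)
k_bb = gram(party_b, party_b)

# Cross-party block: needs rows from both parties. Here it is computed
# in the clear, which offers NO privacy; it only shows what is needed.
k_ab = gram(party_a, party_b)
k_ba = [list(col) for col in zip(*k_ab)]  # transpose, since the kernel is symmetric

# Assemble the global kernel matrix from the four blocks.
global_k = [top + right for top, right in zip(k_aa, k_ab)]
global_k += [left + bottom for left, bottom in zip(k_ba, k_bb)]

# Reference: the kernel matrix a single centralized holder would compute.
reference = gram(party_a + party_b, party_a + party_b)
```

In this sketch `global_k` matches `reference`, so a classifier trained on the assembled blocks would see the same kernel matrix as one trained on pooled data.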



