ABSTRACT
Advances in computer networking and database technologies have enabled the collection and storage of vast quantities of data. Data mining can extract valuable knowledge from this data, and organizations have realized that they can often obtain better results by pooling their data together. However, the collected data may contain sensitive or private information about the organizations or their customers, and privacy concerns are exacerbated if data is shared between multiple organizations.

Distributed data mining is concerned with the computation of models from data that is distributed among multiple participants. Privacy-preserving distributed data mining seeks to allow for the cooperative computation of such models without the cooperating parties revealing any of their individual data items. Our paper makes two contributions in privacy-preserving data mining. First, we introduce the concept of arbitrarily partitioned data, which is a generalization of both horizontally and vertically partitioned data. Second, we provide an efficient privacy-preserving protocol for k-means clustering in the setting of arbitrarily partitioned data.
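To make the notion of arbitrarily partitioned data concrete, the following is a minimal sketch for two parties. It assumes that each individual cell of the data table is held by exactly one party, and that horizontal and vertical partitioning are the special cases in which ownership is constant along each record or along each attribute. The party labels `"A"` and `"B"` and the helpers `classify_partition` and `split` are hypothetical names introduced only for illustration. The sketch shows the data layout only; it is not the paper's protocol and provides no privacy protection.

```python
def classify_partition(owner):
    """Classify a two-party ownership table.

    owner[i][j] is "A" or "B": the party holding attribute j of record i.
    (Illustrative assumption: every cell has exactly one owner.)
    """
    # Horizontal: every record (row) is held entirely by one party.
    rows_uniform = all(len(set(row)) == 1 for row in owner)
    # Vertical: every attribute (column) is held entirely by one party.
    cols_uniform = all(len(set(col)) == 1 for col in zip(*owner))
    if rows_uniform:
        return "horizontal"
    if cols_uniform:
        return "vertical"
    return "arbitrary"


def split(data, owner):
    """Return each party's local view; cells held by the other party are None."""
    alice = [[v if o == "A" else None for v, o in zip(drow, orow)]
             for drow, orow in zip(data, owner)]
    bob = [[v if o == "B" else None for v, o in zip(drow, orow)]
           for drow, orow in zip(data, owner)]
    return alice, bob


# Two records with two attributes each, with ownership mixed cell by cell.
data = [[1.0, 2.0], [3.0, 4.0]]
owner = [["A", "B"], ["B", "A"]]
alice_view, bob_view = split(data, owner)
```

In this example neither party holds a complete record or a complete attribute, which is the situation that horizontal and vertical partitioning alone cannot describe.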