k-means projective clustering

Source: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Paris, France
Session: Clustering, data mining, approximations
Pages: 155-165
Year of Publication: 2004
ISBN: 158113858X
ABSTRACT
In many applications it is desirable to cluster high-dimensional data along various subspaces, a task we refer to as projective clustering. We propose a new objective function for projective clustering that takes into account the inherent trade-off between the dimension of a subspace and the induced clustering error. We then present an extension of the k-means clustering algorithm for projective clustering in arbitrary subspaces, and also propose techniques to avoid local minima. Unlike previous algorithms, ours can choose the dimension of each cluster independently and automatically. Furthermore, experimental results show that our algorithm is significantly more accurate than previous approaches.
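To make the idea of a k-means-style iteration over subspaces concrete, the following is a minimal sketch and not the authors' algorithm. It assumes NumPy, fits each cluster's affine subspace by PCA, and takes the per-cluster dimensions `dims` as a fixed input, whereas the paper selects them automatically through its objective function. The names `fit_flat`, `residual_sq`, and `k_flats` are hypothetical, and the paper's techniques for avoiding local minima are not modeled here.

```python
import numpy as np

def fit_flat(points, dim):
    """Fit an affine subspace of dimension `dim` by PCA; return (mean, basis)."""
    mean = points.mean(axis=0)
    # Rows of vt are orthonormal principal directions, largest variance first.
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vt[:dim]

def residual_sq(points, mean, basis):
    """Squared distance from each point to the flat mean + span(basis)."""
    centered = points - mean
    projected = centered @ basis.T @ basis
    return np.sum((centered - projected) ** 2, axis=1)

def k_flats(data, dims, n_iter=50, seed=0):
    """Alternate between fitting one flat per cluster and reassigning points.

    Assumption: dims[j] is the (fixed) subspace dimension of cluster j.
    """
    rng = np.random.default_rng(seed)
    k = len(dims)
    labels = rng.integers(k, size=len(data))  # random initial assignment
    for _ in range(n_iter):
        flats = []
        for j in range(k):
            members = data[labels == j]
            if len(members) <= dims[j]:
                # Too few points to define the flat: refit on a random sample.
                idx = rng.choice(len(data), size=dims[j] + 1, replace=False)
                members = data[idx]
            flats.append(fit_flat(members, dims[j]))
        # Assign every point to the flat with the smallest squared residual.
        dist = np.stack([residual_sq(data, m, b) for m, b in flats], axis=1)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels, flats
```

A call such as `labels, flats = k_flats(data, dims=[1, 2])` would cluster the rows of `data` around one line and one plane. Like ordinary k-means, this iteration depends on its random start and can stop in a poor local minimum.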