ABSTRACT
In recent years, document clustering has received increasing attention as an important and fundamental technique for unsupervised document organization, automatic topic extraction, and fast information retrieval or filtering. In this paper, we propose a novel method for clustering documents using regularization. Unlike traditional globally regularized clustering methods, our method first constructs a local regularized linear label predictor for each document vector, and then combines all those local regularizers with a global smoothness regularizer. We therefore call our algorithm Clustering with Local and Global Regularization (CLGR). We show that the cluster memberships of the documents can be obtained by eigenvalue decomposition of a sparse symmetric matrix, which can be computed efficiently by iterative methods. Finally, experimental evaluations on several datasets demonstrate the superiority of CLGR over traditional document clustering methods.
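The final step mentioned above, reading cluster memberships off the eigenvectors of a sparse symmetric matrix computed by an iterative method, can be illustrated with a minimal sketch. This is not the CLGR algorithm: the abstract does not specify the CLGR matrix, so the sketch assumes a plain unnormalized graph Laplacian as a stand-in for the global smoothness term, and uses shifted power iteration as the iterative solver. All function names and the toy graph are hypothetical.

```python
import math

def laplacian(edges, n):
    """Sparse unnormalized graph Laplacian L = D - W, stored as one {col: value} dict per row.
    Assumption: this stands in for the sparse symmetric matrix; it is NOT the CLGR matrix."""
    rows = [dict() for _ in range(n)]
    for i, j, w in edges:
        rows[i][j] = rows[i].get(j, 0.0) - w
        rows[j][i] = rows[j].get(i, 0.0) - w
        rows[i][i] = rows[i].get(i, 0.0) + w
        rows[j][j] = rows[j].get(j, 0.0) + w
    return rows

def matvec(rows, x):
    """Sparse matrix-vector product; cost is proportional to the number of nonzeros."""
    return [sum(v * x[j] for j, v in row.items()) for row in rows]

def fiedler_vector(rows, iters=500):
    """Eigenvector of the second-smallest eigenvalue of L via shifted power iteration."""
    n = len(rows)
    # Gershgorin bound: every eigenvalue of L is at most 2 * (max weighted degree).
    shift = 2.0 * max(row.get(i, 0.0) for i, row in enumerate(rows))
    x = [math.sin(i + 1.0) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]            # project out the constant eigenvector
        y = matvec(rows, x)
        x = [shift * xi - yi for xi, yi in zip(x, y)]  # apply (shift * I - L)
        norm = math.sqrt(sum(xi * xi for xi in x))
        x = [xi / norm for xi in x]
    return x

def cluster_two_way(edges, n):
    """Two-way cluster labels from the sign pattern of the Fiedler vector."""
    return [0 if v < 0 else 1 for v in fiedler_vector(laplacian(edges, n))]

# Toy similarity graph (hypothetical): two tight triangles joined by one weak edge.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
         (2, 3, 0.1)]
labels = cluster_two_way(edges, 6)
print(labels)
```

The two triangles should end up with different labels; which triangle receives label 0 depends on the sign of the computed eigenvector, which is arbitrary. For large document collections a Lanczos-type solver would normally replace the plain power iteration used here.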
CITED BY
Fei Wang, Changshui Zhang, Tao Li, Clustering with local and global regularization, Proceedings of the 22nd national conference on Artificial intelligence, p.657-662, July 22-26, 2007, Vancouver, British Columbia, Canada