|
ABSTRACT
Evolutionary clustering is an emerging research area essential to important applications such as clustering dynamic Web and blog contents and clustering data streams. In evolutionary clustering, a good clustering result should fit the current data well, while simultaneously not deviate too dramatically from the recent history. To fulfill this dual purpose, a measure of temporal smoothness is integrated in the overall measure of clustering quality. In this paper, we propose two frameworks that incorporate temporal smoothness in evolutionary spectral clustering. For both frameworks, we start with intuitions gained from the well-known k-means clustering problem, and then propose and solve corresponding cost functions for the evolutionary spectral clustering problems. Our solutions to the evolutionary spectral clustering problems provide more stable and consistent clustering results that are less sensitive to short-term noises while at the same time are adaptive to long-term cluster drifts. Furthermore, we demonstrate that our methods provide the optimal solutions to the relaxed versions of the corresponding evolutionary k-means clustering problems. Performance experiments over a number of real and synthetic data sets illustrate our evolutionary spectral clustering methods provide more robust clustering results that are not sensitive to noise and can adapt to data drifts.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Charu C. Aggarwal , Jiawei Han , Jianyong Wang , Philip S. Yu, A framework for clustering evolving data streams, Proceedings of the 29th international conference on Very large data bases, p.81-92, September 09-12, 2003, Berlin, Germany
|
| |
2
|
|
 |
3
|
|
 |
4
|
Moses Charikar , Chandra Chekuri , Tomás Feder , Rajeev Motwani, Incremental clustering and dynamic information retrieval, Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, p.626-635, May 04-06, 1997, El Paso, Texas, United States
[doi> 10.1145/258533.258657]
|
| |
5
|
C. Chatfield. The Analysis of Time Series: An Introduction. Chapman & Hall/CRC.
|
| |
6
|
F. R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1997.
|
| |
7
|
|
 |
8
|
|
 |
9
|
|
| |
10
|
K. Fan. On a theorem of Weyl concerning eigenvalues of linear transformations. In Proc. Natl. Acad. Sci., 1949.
|
| |
11
|
G. Golub and C. V. Loan. Matrix Computations. Johns Hopkins University Press, third edition, 1996.
|
| |
12
|
|
| |
13
|
C. Gupta and R. Grossman. Genic: A single pass generalized incremental algorithm for clustering. In SIAM Int. Conf. on Data Mining, 2004.
|
| |
14
|
L. J. Hubert and P. Arabie. Comparing partitions. Journal of Classification, 2, 1985.
|
 |
15
|
|
 |
16
|
|
| |
17
|
A. Ng, M. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In NIPS,2001.
|
| |
18
|
H. Ning, W. Xu, Y. Chi, Y. Gong, and T. Huang. Incremental spectral clustering with application to monitoring of evolving blog communities. In SIAM Int. Conf. on Data Mining, 2007.
|
| |
19
|
|
| |
20
|
|
| |
21
|
|
| |
22
|
H. Zha, X. He, C. H. Q. Ding, M. Gu, and H. D. Simon. Spectral relaxation for k-means clustering. In NIPS, 2001.
|
CITED BY 6
|
|
Yu-Ru Lin , Yun Chi , Shenghuo Zhu , Hari Sundaram , Belle L. Tseng, Facetnet: a framework for analyzing communities and their evolutions in dynamic networks, Proceeding of the 17th international conference on World Wide Web, April 21-25, 2008, Beijing, China
|
|
|
Lei Tang , Huan Liu , Jianping Zhang , Zohreh Nazeri, Community evolution in dynamic multi-mode networks, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
|
|
|
Hanghang Tong , Spiros Papadimitriou , Jimeng Sun , Philip S. Yu , Christos Faloutsos, Colibri: fast mining of large static and dynamic graphs, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
|
|
|
Hanghang Tong , Yasushi Sakurai , Tina Eliassi-Rad , Christos Faloutsos, Fast mining of complex time-stamped events, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
|
|
|
Xiuyao Song , Chris Jermaine , Sanjay Ranka , John Gums, A bayesian mixture model with linear regression mixing proportions, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
|
|
|
|
|