ABSTRACT
Heterogeneous data co-clustering has attracted growing attention in recent years because of its impact on a range of applications. Co-clustering algorithms for two types of heterogeneous data (pair-wise co-clustering), such as documents and terms, have been well studied in the literature, but work on more than two types (high-order co-clustering) is still very limited. As an attempt in this direction, this paper addresses a specific case of high-order co-clustering in which a central type of object connects all the other types, so that the inter-relationships form a star structure. This case is a good abstraction of many real-world applications, such as the co-clustering of categories, documents and terms in text mining. We treat such problems as the fusion of multiple pair-wise co-clustering sub-problems under the constraint of the star structure. Accordingly, we propose the concept of consistent bipartite graph co-partitioning and develop an algorithm based on semi-definite programming (SDP) for efficient computation of the clustering results. Experiments on toy problems and real data both verify the effectiveness of the proposed method.
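To make the notion of a consistent co-partition concrete, the following is a minimal sketch and not the SDP-based algorithm of the paper. It assumes a star structure with documents as the central type, a document-term weight matrix `A` and a document-category weight matrix `B`, and it scores a candidate partition by a weighted sum of the two bipartite cut weights. The function names, the toy matrices and the weight `beta` are all hypothetical choices made for illustration; the key point is only that the same document labels are used in both bipartite graphs.

```python
import numpy as np


def bipartite_cut(W, row_labels, col_labels):
    """Sum of edge weights whose two endpoints lie in different clusters."""
    W = np.asarray(W, dtype=float)
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    assert W.shape == (row_labels.size, col_labels.size)
    # crossing[i, j] is True when row object i and column object j disagree.
    crossing = row_labels[:, None] != col_labels[None, :]
    return float(W[crossing].sum())


def consistent_cut(A, B, doc_labels, term_labels, cat_labels, beta=0.5):
    """Weighted sum of the cuts of the two bipartite graphs.

    The shared doc_labels vector is what makes the partition consistent:
    documents (the central type) get one assignment used by both graphs.
    beta is an illustrative trade-off weight, not a value from the paper.
    """
    cut_terms = bipartite_cut(A, doc_labels, term_labels)
    cut_cats = bipartite_cut(B, doc_labels, cat_labels)
    return beta * cut_terms + (1.0 - beta) * cut_cats


# Toy data (assumed): 4 documents, 4 terms, 2 categories.
A = np.array([[2, 1, 0, 0],
              [1, 2, 0, 0],
              [0, 0, 3, 1],
              [0, 1, 1, 2]])
B = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1]])

term_labels = [0, 0, 1, 1]
cat_labels = [0, 1]

good = consistent_cut(A, B, [0, 0, 1, 1], term_labels, cat_labels)
bad = consistent_cut(A, B, [0, 1, 0, 1], term_labels, cat_labels)
print(good, bad)
```

On this toy input the document partition that agrees with both the term and category structure gets the lower score. A practical method would search over partitions (and would typically normalize the cuts) rather than score hand-picked ones.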