Ranking-based clustering of heterogeneous information networks with star network schema
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Paris, France
SESSION: Research track papers
Pages 797-806
Year of Publication: 2009
ISBN: 978-1-60558-495-9
Authors
Yizhou Sun, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
Yintao Yu, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
Jiawei Han, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
ABSTRACT
A heterogeneous information network is an information network composed of multiple types of objects. Clustering on such a network may lead to a better understanding of both the hidden structures of the network and the individual role played by every object in each cluster. However, although clustering on homogeneous networks has been studied for decades, clustering on heterogeneous networks has not been addressed until recently. A recent study proposed a new algorithm, RankClus, for clustering on bi-typed heterogeneous networks. However, a real-world network may consist of more than two types, and the interactions among multi-typed objects play a key role in disclosing the rich semantics that a network carries. In this paper, we study clustering of multi-typed heterogeneous networks with a star network schema and propose a novel algorithm, NetClus, that utilizes links across multi-typed objects to generate high-quality net-clusters. An iterative enhancement method is developed that leads to effective ranking-based clustering in such heterogeneous networks. Our experiments on DBLP data show that NetClus generates more accurate clustering results than the baseline topic model algorithm PLSA and the recently proposed algorithm, RankClus. Further, NetClus generates informative clusters, presenting good ranking and cluster membership information for each attribute object in each net-cluster.
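To make the idea of a star network schema concrete, the following is a minimal sketch, not the NetClus algorithm itself. It assumes a bibliographic network in which each paper is the center object linked to attribute objects of three types (author, venue, term). The toy records, the `PAPERS` dictionary, and the `simple_rank` function are all hypothetical names introduced for illustration, and the ranking shown is a plain normalized link count within one cluster of center objects rather than the ranking functions used in the paper.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each paper (center object) links to attribute
# objects of three types, forming a star around the paper.
PAPERS = {
    "p1": {"author": ["A", "B"], "venue": ["KDD"], "term": ["clustering", "network"]},
    "p2": {"author": ["A"], "venue": ["KDD"], "term": ["ranking", "network"]},
    "p3": {"author": ["C"], "venue": ["SIGMOD"], "term": ["query", "index"]},
}


def simple_rank(papers, cluster):
    """Rank attribute objects of each type inside one cluster of papers.

    The score of an attribute object is its share of the links of its type
    within the cluster (an illustrative stand-in, not the paper's ranking).
    """
    counts = defaultdict(Counter)
    for paper_id in cluster:
        for attr_type, objects in papers[paper_id].items():
            counts[attr_type].update(objects)

    ranks = {}
    for attr_type, counter in counts.items():
        total = sum(counter.values())
        ranks[attr_type] = {obj: n / total for obj, n in counter.items()}
    return ranks


# Rank attribute objects within a candidate cluster containing p1 and p2.
ranks = simple_rank(PAPERS, ["p1", "p2"])
for attr_type, scores in ranks.items():
    print(attr_type, sorted(scores.items(), key=lambda kv: -kv[1]))
```

Under these assumptions, each attribute type gets its own ranking distribution conditioned on the cluster, which is the kind of per-cluster ranking information the abstract says a net-cluster presents.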