Ranking-based clustering of heterogeneous information networks with star network schema
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Paris, France
SESSION: Research track papers
Pages 797-806
Year of Publication: 2009
ISBN: 978-1-60558-495-9
Authors
Yizhou Sun, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
Yintao Yu, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
Jiawei Han, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557019.1557107

ABSTRACT

A heterogeneous information network is an information network composed of multiple types of objects. Clustering on such a network may lead to a better understanding of both the hidden structures of the network and the individual role played by every object in each cluster. However, although clustering on homogeneous networks has been studied for decades, clustering on heterogeneous networks has not been addressed until recently.

A recent study proposed a new algorithm, RankClus, for clustering on bi-typed heterogeneous networks. However, a real-world network may consist of more than two types, and the interactions among multi-typed objects play a key role in disclosing the rich semantics that a network carries. In this paper, we study clustering of multi-typed heterogeneous networks with a star network schema and propose a novel algorithm, NetClus, that utilizes links across multi-typed objects to generate high-quality net-clusters. An iterative enhancement method is developed that leads to effective ranking-based clustering in such heterogeneous networks. Our experiments on DBLP data show that NetClus generates more accurate clustering results than the baseline topic model algorithm PLSA and the recently proposed algorithm, RankClus. Further, NetClus generates informative clusters, presenting good ranking and cluster membership information for each attribute object in each net-cluster.
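
To make the notion of a star network schema concrete, the following is a minimal sketch of how such a network might be stored and how attribute objects in one sub-network could be scored. It is not NetClus: the toy data, the choice of papers as the central type with authors, venues, and terms around it, the name `attribute_degree_ranking`, and the plain link-count scoring are all assumptions made here for illustration and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy star-schema network: every link goes from a central
# object (here a paper) to an attribute object of some type.
papers = {
    "p1": {"author": ["A", "B"], "venue": ["KDD"], "term": ["clustering", "network"]},
    "p2": {"author": ["A"], "venue": ["KDD"], "term": ["ranking", "network"]},
    "p3": {"author": ["C"], "venue": ["SIGMOD"], "term": ["query"]},
}


def attribute_degree_ranking(papers, paper_ids, attr_type):
    """Score attribute objects of one type by their share of links
    inside the sub-network induced by the given central objects.

    This simple counting is an illustrative stand-in, not the ranking
    function used by NetClus.
    """
    counts = defaultdict(int)
    for pid in paper_ids:
        for obj in papers[pid].get(attr_type, []):
            counts[obj] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize so the scores of one attribute type sum to 1.
    return {obj: c / total for obj, c in counts.items()}


# Score authors and terms inside a candidate cluster made of p1 and p2.
author_scores = attribute_degree_ranking(papers, ["p1", "p2"], "author")
term_scores = attribute_degree_ranking(papers, ["p1", "p2"], "term")
```

Scoring each attribute type separately within a candidate cluster mirrors the abstract's description of per-cluster ranking information for attribute objects, but the actual ranking and iterative refinement are defined in the full paper.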


