Combining link and content for community detection: a discriminative approach
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Paris, France
SESSION: Research track papers
Pages 927-936
Year of Publication: 2009
ISBN: 978-1-60558-495-9
Authors
Tianbao Yang  Michigan State University, East Lansing, MI, USA
Rong Jin  Michigan State University, East Lansing, MI, USA
Yun Chi  NEC Laboratories America, Cupertino, CA, USA
Shenghuo Zhu  NEC Laboratories America, Cupertino, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557019.1557120
ABSTRACT

In this paper, we consider the problem of combining link and content analysis for community detection from networked data, such as paper citation networks and the World Wide Web. Most existing approaches combine link and content information through a generative model that generates both links and contents via a shared set of community memberships. These generative models have two shortcomings: they fail to consider additional factors that could affect the community memberships, and they fail to isolate the contents that are irrelevant to community memberships. To explicitly address these shortcomings, we propose a discriminative model for combining link and content analysis for community detection. First, we propose a conditional model for link analysis, in which we introduce hidden variables to explicitly model the popularity of nodes. Second, to alleviate the impact of irrelevant content attributes, we develop a discriminative model for content analysis. These two models are unified through the community memberships. We present efficient algorithms, based on bound optimization and alternating projection, to solve the related optimization problems. Extensive experiments with benchmark data sets show that the proposed framework significantly outperforms the state-of-the-art approaches for combining link and content analysis for community detection.
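
The abstract does not give the model's equations, so the following is only an illustrative sketch of how the two ideas could fit together, not the paper's formulation. It assumes that community memberships come from a softmax over content features (a discriminative content model) and that the probability of node i linking to node j mixes, over communities, j's membership weighted by a per-node popularity value. The names `memberships`, `link_probs`, `weights`, and `popularity` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memberships(features, weights):
    """Assumed discriminative content model.

    features: n x d list of content vectors, weights: K x d list of
    per-community weight vectors. Returns an n x K list whose rows sum to 1.
    """
    return [
        softmax([sum(w * x for w, x in zip(wk, xi)) for wk in weights])
        for xi in features
    ]


def link_probs(gamma, popularity, i):
    """Assumed conditional link model: P(j | i) for every node j.

    Within community k, node j is chosen in proportion to
    gamma[j][k] * popularity[j]; node i mixes communities by gamma[i][k].
    For simplicity the sketch does not exclude the self-link j == i.
    """
    n, num_comm = len(gamma), len(gamma[0])
    # Per-community normaliser so that each community's choice sums to 1.
    norm = [
        sum(gamma[j][k] * popularity[j] for j in range(n))
        for k in range(num_comm)
    ]
    return [
        sum(
            gamma[i][k] * gamma[j][k] * popularity[j] / norm[k]
            for k in range(num_comm)
        )
        for j in range(n)
    ]


# Toy data (made up): nodes 0 and 1 share content, node 1 is more popular.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = [[2.0, 0.0], [0.0, 2.0]]
popularity = [1.0, 3.0, 1.0]

gamma = memberships(features, weights)
probs = link_probs(gamma, popularity, 0)
```

In this toy setup, nodes 0 and 1 have identical memberships, so the link probabilities from node 0 differ between them only through the popularity term. Content features enter solely through the memberships, which is one way the link and content parts can share a single set of community memberships.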

