ABSTRACT
The increasing amount of communication between individuals in electronic formats (e.g., email, instant messaging, and the Web) has motivated computational research in social network analysis (SNA). Previous work in SNA has emphasized the social network (SN) topology measured by communication frequencies while ignoring the semantic information in SNs. In this paper, we propose two generative Bayesian models for semantic community discovery in SNs, combining probabilistic modeling with community detection in SNs. To simulate the generative models, we propose an EnF-Gibbs sampling algorithm that addresses the efficiency and performance problems of traditional methods. Experimental studies on the Enron email corpus show that our approach successfully detects the communities of individuals and, in addition, provides semantic topic descriptions of these communities.