ABSTRACT
The blogosphere has unique structural and temporal properties because blogs are typically used as a communication medium among individuals. In this paper, we propose a novel technique that captures the structure and temporal dynamics of blog communities. In our framework, a community is a set of blogs that communicate with each other in response to some event (such as a news article). The community is represented by its structure and its temporal dynamics: a community graph indicates how often one blog communicates with another, and a community intensity indicates the community's activity level, which varies over time. Our method, community factorization, extracts such communities from the blogosphere, where the communication among blogs is observed as a set of subgraphs (i.e., threads of discussion). This community extraction is formulated as a factorization problem in the framework of constrained optimization, in which the objective is to best explain the observed interactions in the blogosphere over time. We further provide a scalable algorithm for computing solutions to the constrained optimization problems. Extensive experimental studies on both synthetic and real blog data demonstrate that our technique is able to discover meaningful communities that are not detectable by traditional methods.
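The factorization idea in the abstract can be illustrated with a rough sketch. This is not the paper's algorithm: it assumes a simplified setting in which the observed interactions are arranged as a nonnegative matrix `X` (one row per directed blog-to-blog edge, one column per time step), and it swaps in plain non-negative matrix factorization with multiplicative updates, without the paper's specific constraints. The function name `factorize_interactions` and all parameters are hypothetical. Under these assumptions, each column of `W` plays the role of a community graph (edge weights) and each row of `H` the role of a community intensity over time.

```python
import numpy as np


def factorize_interactions(X, n_communities, n_iter=1000, seed=0, eps=1e-9):
    """Approximate X (edges x time steps) as W @ H with W, H >= 0.

    Illustrative stand-in only: plain NMF with multiplicative updates
    that reduce the squared reconstruction error. It is not the
    constrained formulation described in the paper.
    """
    rng = np.random.default_rng(seed)
    n_edges, n_steps = X.shape
    # Strictly positive random start so multiplicative updates can move.
    W = rng.random((n_edges, n_communities)) + eps
    H = rng.random((n_communities, n_steps)) + eps

    for _ in range(n_iter):
        # Update intensities with edge weights fixed, then the reverse.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)

    # Rescale so each community's edge weights sum to 1; the overall
    # magnitude moves into the intensity rows, leaving W @ H unchanged.
    scale = W.sum(axis=0, keepdims=True) + eps
    return W / scale, H * scale.T


# Toy data (made up for illustration): 4 edges, 6 time steps,
# one community active early and another active late.
W_true = np.array([[0.7, 0.0],
                   [0.3, 0.0],
                   [0.0, 0.5],
                   [0.0, 0.5]])
H_true = np.array([[5.0, 8.0, 3.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 2.0, 6.0, 9.0]])
X = W_true @ H_true

W, H = factorize_interactions(X, n_communities=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print("relative reconstruction error:", rel_error)
print("community graphs (columns of W):")
print(np.round(W, 2))
print("community intensities (rows of H):")
print(np.round(H, 2))
```

The order of the recovered communities is arbitrary, and the normalization of `W` is only one possible convention for separating structure from intensity.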