|
ABSTRACT
A large body of work has been devoted to identifying community structure in networks. A community is often though of as a set of nodes that has more connections between its members than to the remainder of the network. In this paper, we characterize as a function of size the statistical and structural properties of such sets of nodes. We define the network community profile plot, which characterizes the "best" possible community - according to the conductance measure - over a wide range of size scales, and we study over 70 large sparse real-world networks taken from a wide range of application domains. Our results suggest a significantly more refined picture of community structure in large real-world networks than has been appreciated previously. Our most striking finding is that in nearly every network dataset we examined, we observe tight but almost trivial communities at very small scales, and at larger size scales, the best possible communities gradually "blend in" with the rest of the network and thus become less "community-like." This behavior is not explained, even at a qualitative level, by any of the commonly-used network generation models. Moreover, this behavior is exactly the opposite of what one would expect based on experience with and intuition from expander graphs, from graphs that are well-embeddable in a low-dimensional structure, and from small social networks that have served as testbeds of community detection algorithms. We have found, however, that a generative model, in which new edges are added via an iterative "forest fire" burning process, is able to produce graphs exhibiting a network community structure similar to our observations.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
R. Z. Albert and A.-L. Barabási. Emergence of scaling in random networks. Science, 286(5439):509--512, 1999.
|
| |
2
|
Christopher Allen. Life with alacrity: The Dunbar number as a limit to group sizes, http://www.lifewithalacrity.com/2004/03/the_dunbar_numb.html, 2004.
|
| |
3
|
|
 |
4
|
Sanjeev Arora , Satish Rao , Umesh Vazirani, Expander flows, geometric embeddings and graph partitioning, Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, p.222-231, June 13-16, 2004, Chicago, IL, USA
[doi> 10.1145/1007352.1007355]
|
 |
5
|
Lars Backstrom , Dan Huttenlocher , Jon Kleinberg , Xiangyang Lan, Group formation in large social networks: membership, growth, and evolution, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, August 20-23, 2006, Philadelphia, PA, USA
[doi> 10.1145/1150402.1150412]
|
| |
6
|
F. R. K. Chung. Spectral graph theory, volume 92 of CBMS Regional Conference Series in Mathematics. AMS, 1997.
|
| |
7
|
|
| |
8
|
A. Clauset, M. E. J. Newman, and C. Moore. Finding community structure in very large networks. arXiv:cond-mat/0408187, August 2004.
|
| |
9
|
L. Danon, J. Duch, A. Diaz-Guilera, and A. Arenas. Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 29(09):P09008, 2005.
|
| |
10
|
|
| |
11
|
Robin Dunbar. Grooming, Gossip, and the Evolution of Language. Harvard Univ Press, October 1998.
|
| |
12
|
A. D. Flaxman, A. M. Frieze, and J. Vera. A geometric preferential attachment model of networks. In WAW '04: Proceedings of the 3rd Workshop On Algorithms And Models For The Web-Graph, pages 44--55, 2004.
|
| |
13
|
M. Gaertler. Clustering. In U. Brandes and T. Erlebach, editors, Network Analysis: Methodological Foundations, pages 178--215. Springer, 2005.
|
 |
14
|
|
| |
15
|
S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bulletin of the American Mathematical Society, 43:439--561, 2006.
|
 |
16
|
|
| |
17
|
|
| |
18
|
R. Kumar , P. Raghavan , S. Rajagopalan , D. Sivakumar , A. Tomkins , E. Upfal, Stochastic models for the Web graph, Proceedings of the 41st Annual Symposium on Foundations of Computer Science, p.57, November 12-14, 2000
|
| |
19
|
K. Lang and S. Rao. A flow-based method for improving the expansion or conductance of graph cuts. In IPCO '04: Proceedings of the 10th International Conf. on Integer Programming and Combinatorial Optimization, 2004.
|
 |
20
|
|
 |
21
|
Jure Leskovec , Jon Kleinberg , Christos Faloutsos, Graphs over time: densification laws, shrinking diameters and possible explanations, Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 21-24, 2005, Chicago, Illinois, USA
[doi> 10.1145/1081870.1081893]
|
 |
22
|
|
| |
23
|
J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney. Statistical properties of community structure in large social and information networks. Manuscript.
|
| |
24
|
M. E. J. Newman. Detecting community structure in networks. The European Physical J. B, 38:321--330, 2004.
|
| |
25
|
M. E. J. Newman. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E, 74, 2006.
|
| |
26
|
M. E. J. Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences of the United States of America, 103(23):8577--8582, 2006.
|
| |
27
|
E. Ravasz and A.-L. Barabási. Hierarchical organization in complex networks. Physical Review E, 67:026112, 2003.
|
| |
28
|
M. Richardson, R. Agrawal, and P. Domingos. Trust management for the semantic web. In ISWC ?03: Proceedings of the 2nd International Semantic Web Conference, pages 351--368, 2003.
|
| |
29
|
|
| |
30
|
S. E. Schaeffer. Graph clustering. Computer Science Review, 1(1):27--64, 2007.
|
| |
31
|
|
| |
32
|
|
| |
33
|
S. L. Tauro, C. Palmer, G. Siganos, and M. Faloutsos. A simple conceptual model for the internet topology. In GLOBECOM ?01: Global Telecommunications Conference, pages 1667--1671, 2001.
|
| |
34
|
D. J. Watts and S. H. Strogatz. Collective dynamics of small-world networks. Nature, 393:440--442, 1998.
|
| |
35
|
W. W. Zachary. An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33:452--473, 1977.
|
| |
36
|
|
| |
37
|
|
CITED BY 11
|
|
|
|
|
Qiankun Zhao , Sourav S. Bhowmick , Xin Zheng , Kai Yi, Characterizing and predicting community members from evolutionary and heterogeneous networks, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
|
|
|
Kensuke Onuma , Hanghang Tong , Christos Faloutsos, TANGENT: a novel, 'Surprise me', recommendation algorithm, Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, June 28-July 01, 2009, Paris, France
|
|
|
|
|
|
|
|
|
Nguyen Tran , Bonan Min , Jinyang Li , Lakshminarayanan Subramanian, Sybil-resilient online content voting, Proceedings of the 6th USENIX symposium on Networked systems design and implementation, p.15-28, April 22-24, 2009, Boston, Massachusetts
|
|
|
|
|
|
Lei Guo , Enhua Tan , Songqing Chen , Xiaodong Zhang , Yihong (Eric) Zhao, Analyzing patterns of user content generation in online social networks, Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, June 28-July 01, 2009, Paris, France
|
|
|
|
|
|
|
|
|
|
|