|
ABSTRACT
We define a connection subgraph as a small subgraph of a large graph that best captures the relationship between two nodes. The primary motivation for this work is to provide a paradigm for exploration and knowledge discovery in large social networks graphs. We present a formal definition of this problem, and an ideal solution based on electricity analogues. We then show how to accelerate the computations, to produce approximate, but high-quality connection subgraphs in real time on very large (disk resident) graphs.We describe our operational prototype, and we demonstrate results on a social network graph derived from the World Wide Web. Our graph contains 15 million nodes and 96 million edges, and our system still produces quality responses within seconds.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
R. Albert, H. Jeong, and A.-L. Barabasi. Diameter of the world wide web. Nature, (401):130--131, 1999.
|
| |
2
|
U. Brandes, M. Gaertler, and D. Wagner. Experiments on graph clustering algorithms. In Proc. 11th Europ. Symp. Algorithms (ESA '03), LNCS 2832, pages 568--579. Springer-Verlag, 2003.
|
 |
3
|
A. K. Chandra , P. Raghavan , W. L. Ruzzo , R. Smolensky, The electrical resistance of a graph captures its commute and cover times, Proceedings of the twenty-first annual ACM symposium on Theory of computing, p.574-586, May 14-17, 1989, Seattle, Washington, United States
[doi> 10.1145/73007.73062]
|
 |
4
|
|
 |
5
|
|
| |
6
|
S. Dorogovtsev and J. Mendes. Evolution of networks. Advances in Physics, 51:1079--1187, 2002.
|
| |
7
|
P. Doyle and J. Snell. Random walks and electric networks, volume 22. Mathematical Association America, New York, 1984.
|
 |
8
|
|
| |
9
|
|
| |
10
|
|
 |
11
|
David Gibson , Jon Kleinberg , Prabhakar Raghavan, Inferring Web communities from link topology, Proceedings of the ninth ACM conference on Hypertext and hypermedia : links, objects, time and space---structure in hypermedia systems: links, objects, time and space---structure in hypermedia systems, p.225-234, June 20-24, 1998, Pittsburgh, Pennsylvania, United States
[doi> 10.1145/276627.276652]
|
| |
12
|
M. Girvan and M. E. J. Newman. Community structure is social and biological networks.
|
| |
13
|
M. Grotschel, C. L. Monma, and M. Stoer. Design of survivable networks. In Handbooks in Operations Research and Management Science 7: Network Models. North Holland, 1993.
|
| |
14
|
D. Gruhl , L. Chavet , D. Gibson , J. Meyer , P. Pattanayak , A. Tomkins , J. Zien, How to build a WebFountain: An architecture for very large-scale text analytics, IBM Systems Journal, v.43 n.1, p.64-77, January 2004
|
 |
15
|
|
 |
16
|
|
| |
17
|
|
 |
18
|
|
 |
19
|
|
| |
20
|
M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45:167--256, 2003.
|
| |
21
|
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998.
|
| |
22
|
C. R. Palmer and C. Faloutsos. Electricity based external similarity of categorical attributes. Seoul, South Korea, April-May 2003.
|
| |
23
|
S. van Dongen. Graph Clustering by Flow Simulation. Ph.D. thesis, University of Utrecht, May 2000.
|
| |
24
|
|
CITED BY 33
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yutaka Matsuo , Junichiro Mori , Masahiro Hamasaki , Keisuke Ishida , Takuichi Nishimura , Hideaki Takeda , Koiti Hasida , Mitsuru Ishizuka, POLYPHONET: an advanced social network extraction system from the web, Proceedings of the 15th international conference on World Wide Web, May 23-26, 2006, Edinburgh, Scotland
|
|
|
|
|
|
|
|
|
|
|
|
José F. Rodrigues, Jr. , Hanghang Tong , Agma J. M. Traina , Christos Faloutsos , Jure Leskovec, GMine: a system for scalable, interactive graph visualization and mining, Proceedings of the 32nd international conference on Very large data bases, September 12-15, 2006, Seoul, Korea
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Andy Yoo , Edmond Chow , Keith Henderson , William McLendon , Bruce Hendrickson , Umit Catalyurek, A Scalable Distributed Parallel Breadth-First Search Algorithm on BlueGene/L, Proceedings of the 2005 ACM/IEEE conference on Supercomputing, p.25, November 12-18, 2005
|
|
|
|
|
|
|
|
|
Yutaka Matsuo , Junichiro Mori , Masahiro Hamasaki , Takuichi Nishimura , Hideaki Takeda , Koiti Hasida , Mitsuru Ishizuka, POLYPHONET: An advanced social network extraction system from the Web, Web Semantics: Science, Services and Agents on the World Wide Web, v.5 n.4, p.262-278, December, 2007
|
|
|
|
|
|
|
|
|
Hanghang Tong , Christos Faloutsos , Brian Gallagher , Tina Eliassi-Rad, Fast best-effort pattern matching in large attributed graphs, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA
|
|
|
|
|
|
|
|
|
|
|
|
Hanghang Tong , Yasushi Sakurai , Tina Eliassi-Rad , Christos Faloutsos, Fast mining of complex time-stamped events, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|