|
ABSTRACT
Graphs are an increasingly important data source, with such important graphs as the Internet and the Web. Other familiar graphs include CAD circuits, phone records, gene sequences, city streets, social networks and academic citations. Any kind of relationship, such as actors appearing in movies, can be represented as a graph. This work presents a data mining tool, called ANF, that can quickly answer a number of interesting questions on graph-represented data, such as the following. How robust is the Internet to failures? What are the most influential database papers? Are there gender differences in movie appearance patterns? At its core, ANF is based on a fast and memory-efficient approach for approximating the complete "neighbourhood function" for a graph. For the Internet graph (268K nodes), ANF's highly-accurate approximation is more than 700 times faster than the exact computation. This reduces the running time from nearly a day to a matter of a minute or two, allowing users to perform ad hoc drill-down tasks and to repeatedly answer questions about changing data sources. To enable this drill-down, ANF employs new techniques for approximating neighbourhood-type functions for graphs with distinguished nodes and/or edges. When compared to the best existing approximation, ANF's approach is both faster and more accurate, given the same resources. Additionally, unlike previous approaches, ANF scales gracefully to handle disk resident graphs. Finally, we present some of our results from mining large graphs using ANF.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
| |
3
|
Andrei Broder , Ravi Kumar , Farzin Maghoul , Prabhakar Raghavan , Sridhar Rajagopalan , Raymie Stata , Andrew Tomkins , Janet Wiener, Graph structure in the Web, Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications netowrking, p.309-320, June 2000, Amsterdam, The Netherlands
|
| |
4
|
|
| |
5
|
|
| |
6
|
CORA search engine. http://www.cora.whizbang.com.
|
 |
7
|
|
 |
8
|
Michalis Faloutsos , Petros Faloutsos , Christos Faloutsos, On power-law relationships of the Internet topology, Proceedings of the conference on Applications, technologies, architectures, and protocols for computer communication, p.251-262, August 30-September 03, 1999, Cambridge, Massachusetts, United States
|
| |
9
|
|
| |
10
|
IMDB. http://www.imdb.com.
|
| |
11
|
|
| |
12
|
|
 |
13
|
Ravi Kumar , Prabhakar Raghavan , Sridhar Rajagopalan , D. Sivakumar , Andrew Tompkins , Eli Upfal, The Web as a graph, Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.1-10, May 15-18, 2000, Dallas, Texas, United States
[doi> 10.1145/335168.335170]
|
| |
14
|
|
 |
15
|
Mark H. Nodine , Michael T. Goodrich , Jeffrey Scott Vitter, Blocking for external graph searching, Proceedings of the twelfth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.222-232, May 25-28, 1993, Washington, D.C., United States
[doi> 10.1145/153850.153880]
|
| |
16
|
C. R. Palmer, G. Siganos, M. Faloutsos, C. Faloutsos, and P. Gibbons. The connectivity and fault-tolerance of the Internet topology. In Workshop on Network-Related Data Management (NRDM-2001), 2001.
|
| |
17
|
C. R. Palmer and J. G. Steffan. Generating network toplogies that obey power laws. In IEEE Globecom 2000, 2000.
|
| |
18
|
|
| |
19
|
|
| |
20
|
S. L. Tauro, C. Palmer, G. Siganos, and M. Faloutsos. A simple conceptual model for the Internet topology. In IEEE Globecom 2001, 2001.
|
CITED BY 37
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Suman Nath , Phillip B. Gibbons , Srinivasan Seshan , Zachary R. Anderson, Synopsis diffusion for robust aggregation in sensor networks, Proceedings of the 2nd international conference on Embedded networked sensor systems, November 03-05, 2004, Baltimore, MD, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jure Leskovec , Jon Kleinberg , Christos Faloutsos, Graphs over time: densification laws, shrinking diameters and possible explanations, Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 21-24, 2005, Chicago, Illinois, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Louis Licamele , Mustafa Bilgic , Lise Getoor , Nick Roussopoulos, Capital and benefit in social networks, Proceedings of the 3rd international workshop on Link discovery, p.44-51, August 21-25, 2005, Chicago, Illinois
|
|
|
Boon Thau Loo , Tyson Condie , Minos Garofalakis , David E. Gay , Joseph M. Hellerstein , Petros Maniatis , Raghu Ramakrishnan , Timothy Roscoe , Ion Stoica, Declarative networking: language, execution and optimization, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
Tamás Sarlós , Adrás A. Benczúr , Károly Csalogány , Dániel Fogaras , Balázs Rácz, To randomize or not to randomize: space optimal summaries for hyperlink analysis, Proceedings of the 15th international conference on World Wide Web, May 23-26, 2006, Edinburgh, Scotland
|
|
|
Amit A. Nanavati , Siva Gurumurthy , Gautam Das , Dipanjan Chakraborty , Koustuv Dasgupta , Sougata Mukherjea , Anupam Joshi, On the structural properties of massive telecom call graphs: findings and implications, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
|
|
|
|
|
|
|
|
|
Yong-Yeol Ahn , Seungyeop Han , Haewoon Kwak , Sue Moon , Hawoong Jeong, Analysis of topological characteristics of huge online social networking services, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
|
|
|
Debora Donato , Mario Paniccia , Maddalena Selis , Carlos Castillo , Giovanni Cortese , Stefano Leonardi, New metrics for reputation management in P2P networks, Proceedings of the 3rd international workshop on Adversarial information retrieval on the web, May 08-08, 2007, Banff, Alberta, Canada
|
|
|
|
|
|
|
|
|
|
|
|
Vaishnavi Krishnamurthy , Michalis Faloutsos , Marek Chrobak , Jun-Hong Cui , Li Lao , Allon G. Percus, Sampling large Internet topologies for simulation purposes, Computer Networks: The International Journal of Computer and Telecommunications Networking, v.51 n.15, p.4284-4302, October, 2007
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yu Chen , Tracy Chen , Ming Chen , Zheng Zhang, Islands in the MSN messenger buddy network, Proceedings of the 1st workshop on Social network systems, p.13-18, April 01-01, 2008, Glasgow, Scotland
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|