|
ABSTRACT
Measuring distance or some other form of proximity between objects is a standard data mining tool. Connection subgraphs were recently proposed as a way to demonstrate proximity between nodes in networks. We propose a new way of measuring and extracting proximity in networks called “cycle-free effective conductance” (CFEC). Importantly, the measured proximity is accompanied with a proximity subgraph which allows assessing and understanding measured values. Our proximity calculation can handle more than two endpoints, directed edges, is statistically well behaved, and produces an effectiveness score for the computed subgraphs. We provide an efficient algorithm to measure and extract proximity. Also, we report experimental results and show examples for four large network datasets: a telecommunications calling graph, the IMDB actors graph, an academic coauthorship network, and a movie recommendation system.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Barabasi, A.-L. and Albert, R. 1999. Emergence of scaling in random networks. Sci. 286, 509--512.
|
 |
2
|
|
 |
3
|
|
| |
4
|
Bollobas, B. 1998. Modern Graph Theory. Springer.
|
| |
5
|
Brandes, U. and Fleischer, D. 2005. Centrality measures based on current flow. In Proceedings of the 22nd Symposium on Theoretical Aspects of Computer Science (STACS). Lecture Notes in Computer Science, vol. 3404, Springer, 533--544.
|
 |
6
|
|
| |
7
|
|
| |
8
|
dblp. 1998. DBLP computer science bibliography. dblp.uni-trier.de.
|
| |
9
|
Doyle, P. G. and Snell, J. L. 1984. Random Walks and Electrical Networks. Mathematical Association of America. http://arxiv.org/abs/math.PR/0001057.
|
 |
10
|
|
 |
11
|
Gary William Flake , Steve Lawrence , C. Lee Giles, Efficient identification of Web communities, Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, p.150-160, August 20-23, 2000, Boston, Massachusetts, United States
[doi> 10.1145/347090.347121]
|
| |
12
|
|
 |
13
|
David Gibson , Jon Kleinberg , Prabhakar Raghavan, Inferring Web communities from link topology, Proceedings of the ninth ACM conference on Hypertext and hypermedia : links, objects, time and space---structure in hypermedia systems: links, objects, time and space---structure in hypermedia systems, p.225-234, June 20-24, 1998, Pittsburgh, Pennsylvania, United States
[doi> 10.1145/276627.276652]
|
| |
14
|
Hadjiconstantinou, E. and Christofides, N. 1999. An efficient implementation of an algorithm for finding k shortest simple paths. Netw. 34, 88--101.
|
| |
15
|
imdb. 2007. Internet Movie Database. www.imdb.com.
|
| |
16
|
John Hershberger, J. M. M. and Suri, S. 2003. Finding the k shortest simple paths: A new algorithm and its implementation. In Proceedings of the 5th Workshop on Algorithm Engineering and Experimentation (ALENEX). SIAM, 26--36.
|
| |
17
|
Katoh, N., Ibaraki, T., and Mine, H. 1982. An efficient algorithm for k shortest simple paths. Netw. 12, 411--427.
|
 |
18
|
|
| |
19
|
Lang, K. 2004. Finding good nearly balanced cuts in power law graphs. Tech. Rep., Yahoo Research Labs.
|
 |
20
|
|
| |
21
|
Myers, C. L., Robson, D., Wible, A., Hibbs, M. A., Chiriac, C., Theesfeld, C. L., Dolinski, K., and Troyanskaya, O. G. 2005. Discovery of biological networks from diverse functional genomic data. Genome Biol. 6, R114.
|
| |
22
|
netflix. 2007. Netflix prize. www.netflixprize.com.
|
| |
23
|
Popescul, A. and Ungar, L. H. 2003. Statistical relational learning for link prediction. In Proceedings of the Workshop on Learning Statistical Models from Relational Data (IJCAI).
|
 |
24
|
|
| |
25
|
Tenenbaum, J. B., de Silva, V., and Langford, J. C. 2000. A global geometric framework for nonlinear dimensionality reduction. Sci. 290, 2319--2323.
|
 |
26
|
|
| |
27
|
Winkler, W. E. 1999. The state of record linkage and current research problems. Tech. Rep., Statistical Research Division, U.S. Bureau of the Census.
|
CITED BY
|
|
Luh Yen , Marco Saerens , Amin Mantrach , Masashi Shimbo, A family of dissimilarity measures between nodes generalizing both the shortest-path and the commute-time distances, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
|
|