|
ABSTRACT
We consider the problem of approximating the PageRank of a target node using only local information provided by a link server. This problem was originally studied by Chen, Gan, and Suel (CIKM 2004), who presented an algorithm for tackling it. We prove that local approximation of PageRank, even to within modest approximation factors, is infeasible in the worst-case, as it requires probing the link server for Ω(n) nodes, where n is the size of the graph. The difficulty emanates from nodes of high in-degree and/or from slow convergence of the PageRank random walk. We show that when the graph has bounded in-degree and admits fast PageRank convergence, then local PageRank approximation can be done using a small number of queries. Unfortunately, natural graphs, such as the web graph, are abundant with high in-degree nodes, making this algorithm (or any other local approximation algorithm) too costly. On the other hand, reverse natural graphs tend to have low in-degree while maintaining fast PageRank convergence. It follows that calculating Reverse PageRank locally is frequently more feasible than computing PageRank locally. We demonstrate that Reverse PageRank is useful for several applications, including computation of hub scores for web pages, finding influencers in social networks, obtaining good seeds for crawling, and measurement of semantic relatedness between concepts in a taxonomy.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
A. L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509--512, 1999.
|
| |
3
|
P. Berkhin. A survey on PageRank computing. Internet Mathematics, 2(1):73--120, 2005.
|
| |
4
|
Krishna Bharat , Andrei Broder , Monika Henzinger , Puneet Kumar , Suresh Venkatasubramanian, The connectivity server: fast access to linkage information on the Web, Proceedings of the seventh international conference on World Wide Web 7, p.469-477, April 1998, Brisbane, Australia
|
| |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
 |
9
|
|
| |
10
|
|
 |
11
|
Anirban Dasgupta , Arpita Ghosh , Ravi Kumar , Christopher Olston , Sandeep Pandey , Andrew Tomkins, The discoverability of the web, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
[doi> 10.1145/1242572.1242630]
|
 |
12
|
|
| |
13
|
|
 |
14
|
Stephen Dill , Ravi Kumar , Kevin S. Mccurley , Sridhar Rajagopalan , D. Sivakumar , Andrew Tomkins, Self-similarity in the web, ACM Transactions on Internet Technology (TOIT), v.2 n.3, p.205-223, August 2002
[doi> 10.1145/572326.572328]
|
| |
15
|
D. Fogaras. Where to start browsing the Web? In IICS, pages 65--79, 2003.
|
| |
16
|
E. Gabrilovich and S. Markovitch. Computing semantic relatedness using wikipedia-based explicit semantic analysis. In Proc. 20th IJCAI, pages 250--257, 2007.
|
| |
17
|
|
| |
18
|
|
| |
19
|
T. H. Haveliwala and S. D. Kamvar. The second eigenvalue of the Google matrix. Technical report, Stanford University, 2003.
|
| |
20
|
A. Java, P. Kolari, T. Finin, and T. Oates. Modeling the spread of influence on the Blogosphere. Technical report, University of Maryland, Baltimore County, 2006.
|
 |
21
|
|
| |
22
|
S. Kamvar, H. Haveliwala, and G. Golub. Adaptive methods for the computation of PageRank. Linear Algebra and its Applications, 386:51--65, 2004.
|
| |
23
|
C. Kohlschütter, P. A. Chirita, and W. Nejdl. Efficient parallel computation of PageRank. In Proc. 28th ECIR, pages 241--252, 2006.
|
| |
24
|
G. Kollias and E. Gallopoulos. Asynchronous computation of PageRank computation in an interactive multithreading environment. In Web Information Retrieval and Linear Algebra Algorithms, 2007.
|
| |
25
|
R. Kumar , P. Raghavan , S. Rajagopalan , D. Sivakumar , A. Tomkins , E. Upfal, Stochastic models for the Web graph, Proceedings of the 41st Annual Symposium on Foundations of Computer Science, p.57, November 12-14, 2000
|
| |
26
|
|
| |
27
|
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the Web. Technical report, Stanford Digital Library Technologies Project, 1998.
|
| |
28
|
|
| |
29
|
R. Rada, H. Mili, E. Bicknell, and M. Blettner. Development and application of a metric on semantic nets. IEEE Trans. on Systems, Man and Cybernetics, 19(1):17--30, 1989.
|
 |
30
|
|
| |
31
|
|
| |
32
|
|
| |
33
|
M. Strube and S. P. Ponzetto. WikiRelate! computing semantic relatedness using Wikipedia. In AAAI 2006, pages 1419--1424, 2006.
|
| |
34
|
|
| |
35
|
Y. Wu. Subgraphrank: PageRank approximation for a subgraph or in a decentralized system. VLDB PhD workshop, 2007.
|
|