ACM Home Page
Please provide us with feedback. Feedback
Local methods for estimating pagerank values
Full text PdfPdf (244 KB)
Source Conference on Information and Knowledge Management archive
Proceedings of the thirteenth ACM international conference on Information and knowledge management table of contents
Washington, D.C., USA
SESSION: DB-IR-2 (databases and information retieval): web and XML text search table of contents
Pages: 381 - 389  
Year of Publication: 2004
ISBN:1-58113-874-1
Authors
Yen-Yu Chen  Polytechnic University, Brooklyn, NY
Qingqing Gan  Polytechnic University, Brooklyn, NY
Torsten Suel  Polytechnic University, Brooklyn, NY
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 85,   Citation Count: 8
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1031171.1031248
What is a DOI?

ABSTRACT

The Google search engine uses a method called PageRank, together with term-based and other ranking techniques, to order search results returned to the user. PageRank uses link analysis to assign a global importance score to each web page. The PageRank scores of all the pages are usually determined off-line in a large-scale computation on the entire hyperlink graph of the web, and several recent studies have focused on improving the efficiency of this computation, which may require multiple hours on a workstation.

However, in some scenarios, such as online analysis of link evolution and mining of large web archives such as the Internet Archive, it may be desirable to quickly approximate or update the PageRanks of individual nodes without performing a large-scale computation on the entire graph. We address this problem by studying several methods for efficiently estimating the PageRank score of a particular web page using only a small subgraph of the entire web. In our model, we assume that the graph is accessible remotely via a link database (such as the AltaVista Connectivity Server) or is stored in a relational database that performs lookups on disks to retrieve node and connectivity information. We show that a reasonable estimate of the PageRank value of a node is possible in most cases by retrieving only a moderate number of nodes in the local neighborhood of the node.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
A. Arasu, J. Cho, H. Garcia-Molina, and S. Raghavan. Searching the web. ACM Transactions on Internet Technologies, 1(1), June 2001.
 
3
A. Arasu, J. Novak, Tomkins A, and J. Tomlin. Pagerank computation and the structure of the web: Experiments and algorithms. In Poster presentation at the 11th Int. World Wide Web Conference, May 2002.
 
4
 
5
6
 
7
 
8
 
9
10
 
11
S. Chien, C. Dwork, R. Kumar, D. Simon, and D. Sivakumar. Link evolution: Analysis and algorithms. In Workshop on Algorithms and Models for the Web Graph, 2002.
 
12
 
13
T.H. Haveliwala. Efficient computation of pagerank. Technical report, Stanford University, October 1999. Available at tt http://dbpubs.stanford.edu:8090/pub/1999-31.
14
 
15
16
17
18
 
19
20
21
 
22
L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Computer Science Department, Stanford University, 1999.
 
23
 
24
M. Richardson and P. Domingos. The intelligent surfer: Probabilistic combination of link and content information in pagerank. In Advances in Neural Information Processing Systems, 2002.
 
25

CITED BY  8

Collaborative Colleagues:
Yen-Yu Chen: colleagues
Qingqing Gan: colleagues
Torsten Suel: colleagues