|
ABSTRACT
PageRank is defined as the stationary state of a Markov chain. The chain is obtained by perturbing the transition matrix induced by a web graph with a damping factor α that spreads uniformly part of the rank. The choice of α is eminently empirical, and in most cases the original suggestion α = 0.85 by Brin and Page is still used. Recently, however, the behaviour of PageRank with respect to changes in α was discovered to be useful in link-spam detection[21]. Moreover, an analytical justification of the value chosen for α is still missing. In this paper, we give the first mathematical analysis of PageRank when α changes. In particular, we show that, contrarily to popular belief, for real-world graphs values of α close to 1 do not give a more meaningful ranking. Then, we give closed-form formulae for PageRank derivatives of any order, and an extension of the Power Method that approximates them with convergence O (tk αt) for the k-th derivative. Finally, we show a tight connection between iterated computation and analytical behaviour by proving that the k-th iteration of the Power Method gives exactly the PageRank value obtained using a Maclaurin polynomial of degree k. The latter result paves the way towards the application of analytical methods to the study of PageRank.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
|
| |
3
|
Soumen Chakrabarti, Byron Dom, David Gibson, Jon Kleinberg, S. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, and Andrew Tomkins. Hypersearching the web. Scientific American, June 1999.
|
| |
4
|
Scott C. Deerwester, Susan T. Dumais, Thomas K. Landauer, George W. Furnas, and Richard A. Harshman. Indexing by latent semantic analysis. Journal of the American Society of Information Science, 41(6):391--407, 1990.
|
 |
5
|
|
| |
6
|
Gene H. Golub and Chen Greif. Arnoldi-type algorithms for computing stationary distribution vectors, with application to PageRank. Technical Report SCCM-04-15, Stanford University Technical Report, 2004.
|
| |
7
|
Taher Haveliwala. Efficient computation of PageRank. Technical report, Stanford University Technical Report, October 1999.
|
| |
8
|
Taher Haveliwala and Sepandar Kamvar. The condition number of the PageRank problem. Technical Report~36, Stanford University Technical Report, June 2003.
|
| |
9
|
Taher Haveliwala and Sepandar Kamvar. The second eigenvalue of the Google matrix. Technical Report~20, Stanford University Technical Report, March 2003.
|
| |
10
|
Sepandar Kamvar, Taher Haveliwala, Christopher Manning, and Gene Golub. Exploiting the block structure of the web for computing PageRank. Technical Report 17, Stanford University Technical Report, March 2003.
|
 |
11
|
|
 |
12
|
|
 |
13
|
Ravi Kumar , Prabhakar Raghavan , Sridhar Rajagopalan , D. Sivakumar , Andrew Tompkins , Eli Upfal, The Web as a graph, Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.1-10, May 15-18, 2000, Dallas, Texas, United States
[doi> 10.1145/335168.335170]
|
| |
14
|
Amy N. Langville and Carl D. Meyer. Deeper inside PageRank. Internet Mathematics, 1(3):355--400, 2004.
|
| |
15
|
Chris Pan-Chi Lee, Gene H. Golub, and Stefanos A. Zenios. A fast two-stage algorithm for computing PageRank and its extensions. Technical report, Stanford University Technical Report, 2004.
|
 |
16
|
|
| |
17
|
Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, Stanford University, Stanford, CA, USA, 1998.
|
| |
18
|
|
| |
19
|
Luca Pretto. A theoretical approach to link analysis algorithms, 2002. PhD Thesis.
|
| |
20
|
Eugene Seneta. Non-Negative Matrices and Markov Chains. Springer Series in Statistics. Springer-Verlag, 1981.
|
| |
21
|
Hui Zhang, Ashish Goel, Ramesh Govindan, Kahn Mason, and Benjamin~Van Roy. Making eigenvector-based reputation systems robust to collusion. In Stefano Leonardi, editor, Proceedings WAW 2004, number 3243 in LNCS, pages 92--104. Springer-Verlag, 2004.
|
CITED BY 10
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Mark R. Meiss , Filippo Menczer , Santo Fortunato , Alessandro Flammini , Alessandro Vespignani, Ranking web sites with real user traffic, Proceedings of the international conference on Web search and web data mining, February 11-12, 2008, Palo Alto, California, USA
|
|
|
|
|
|
Yuting Liu , Bin Gao , Tie-Yan Liu , Ying Zhang , Zhiming Ma , Shuyuan He , Hang Li, BrowseRank: letting web users vote for page importance, Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, July 20-24, 2008, Singapore, Singapore
|
|
|
Lei Yang , Lei Qi , Yan-Ping Zhao , Bin Gao , Tie-Yan Liu, Link analysis using time series of web graphs, Proceedings of the sixteenth ACM conference on Conference on information and knowledge management, November 06-10, 2007, Lisbon, Portugal
|
|
|
|
|