Page quality: in search of an unbiased web ranking
Source: International Conference on Management of Data
Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data
Baltimore, Maryland
Session: Research papers: web
Pages: 551-562
Year of Publication: 2005
ISBN: 1-59593-060-4
Authors
Junghoo Cho  UCLA Computer Science
Sourashis Roy  UCLA Computer Science
Robert E. Adams  UCLA Computer Science
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 18,   Downloads (12 Months): 143,   Citation Count: 12
DOI: http://doi.acm.org/10.1145/1066157.1066220

ABSTRACT

In a number of recent studies [4, 8], researchers have found that because search engines repeatedly return currently popular pages at the top of search results, popular pages tend to get even more popular, while unpopular pages get ignored by an average user. This "rich-get-richer" phenomenon is particularly problematic for new and high-quality pages because they may never get a chance to get users' attention, decreasing the overall quality of search results in the long run. In this paper, we propose a new ranking function, called page quality, that can alleviate the problem of popularity-based ranking. We first present a formal framework to study the search engine bias by discussing what would be an "ideal" way to measure the intrinsic quality of a page. We then compare how PageRank, the current ranking metric used by major search engines, differs from this ideal quality metric. This framework will help us investigate the search engine bias in more concrete terms and provide a clear understanding of why PageRank is effective in many cases and exactly when it is problematic. We then propose a practical way to estimate the intrinsic page quality to avoid the inherent bias of PageRank. We derive our proposed quality estimator through a careful analysis of a reasonable web user model, and we present experimental results that show the potential of our proposed estimator. We believe that our quality estimator has the potential to alleviate the rich-get-richer phenomenon and help new and high-quality pages get the attention that they deserve.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

[3] R. Albert, A.-L. Barabasi, and H. Jeong. Diameter of the World Wide Web. Nature, 401(6749):130-131, September 1999.

[5] A. Balmin, V. Hristidis, and Y. Papakonstantinou. ObjectRank: authority-based keyword search in databases. In Proceedings of the International Conference on Very Large Databases (VLDB), August 2004.

[6] A.-L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509-512, October 1999.

[9] J. Cho, S. Roy, and R. E. Adams. Page quality: In search of an unbiased web ranking. Technical report, UCLA Computer Science, 2005.

[11] F. Geerts, H. Mannila, and E. Terzi. Relational link-based ranking. In Proceedings of the International Conference on Very Large Databases (VLDB), August 2004.

[15] S. Kamvar, T. Haveliwala, and G. Golub. Adaptive methods for the computation of PageRank. In Proceedings of International Conference on the Numerical Solution of Markov Chains, September 2003.

[18] S. Mizzaro. Measuring the agreement among relevance judges. In Proceedings of MIRA Conference, April 1999.

[19] Nielsen NetRatings. http://www.nielsen-netratings.com/.

[20] NPD search and portal site study. Available at http://www.npd.com/press/releases/press_000919.htm.

[21] S. Olsen. Does search engine's power threaten web's independence? Available at http://news.com.com/2009-1023-963618.html, October 2002.

[22] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford University Database Group, 1998. Available at http://dbpubs.stanford.edu:8090/pub/1999-66.

[23] D. M. Pennock, G. W. Flake, S. Lawrence, E. J. Glover, and C. L. Giles. Winners don't take all: Characterizing the competition for links on the web. Proceedings of the National Academy of Sciences, 99(8):5207-5211, 2002.

[24] S. E. Robertson and K. Sparck-Jones. Relevance weighting of search terms. Journal of the American Society for Information Science, 27(3):129-146, 1975.

[28] TREC: Text retrieval conference. http://trec.nist.gov.

[31] Y. Wang and D. DeWitt. Computing PageRank in a distributed internet search system. In Proceedings of the International Conference on Very Large Databases (VLDB), August 2004.
