ABSTRACT
Several recent studies [4, 8] have found that because search engines repeatedly return currently popular pages at the top of search results, popular pages tend to become even more popular, while unpopular pages are ignored by the average user. This "rich-get-richer" phenomenon is particularly problematic for new and high-quality pages because they may never get a chance to attract users' attention, which decreases the overall quality of search results in the long run. In this paper, we propose a new ranking function, called page quality, that can alleviate this problem of popularity-based ranking. We first present a formal framework for studying search engine bias by discussing what an "ideal" way to measure the intrinsic quality of a page would be. We then compare PageRank, the current ranking metric used by major search engines, with this ideal quality metric. This framework helps us investigate search engine bias in more concrete terms and provides a clear understanding of why PageRank is effective in many cases and exactly when it is problematic. We then propose a practical way to estimate intrinsic page quality that avoids the inherent bias of PageRank. We derive our proposed quality estimator through a careful analysis of a reasonable web user model, and we present experimental results that show its potential. We believe that our quality estimator can alleviate the rich-get-richer phenomenon and help new and high-quality pages get the attention they deserve.
REFERENCES
[3] R. Albert, A.-L. Barabasi, and H. Jeong. Diameter of the World Wide Web. Nature, 401(6749):130--131, September 1999.
[5] A. Balmin, V. Hristidis, and Y. Papakonstantinou. ObjectRank: authority-based keyword search in databases. In Proceedings of the International Conference on Very Large Databases (VLDB), August 2004.
[6] A.-L. Barabasi and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509--512, October 1999.
[7] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, and J. Wiener. Graph structure in the Web. In Proceedings of the 9th International World Wide Web Conference on Computer Networks: The International Journal of Computer and Telecommunications Networking, pages 309--320, Amsterdam, The Netherlands, June 2000.
[9] J. Cho, S. Roy, and R. E. Adams. Page quality: In search of an unbiased web ranking. Technical report, UCLA Computer Science, 2005.
[11] F. Geerts, H. Mannila, and E. Terzi. Relational link-based ranking. In Proceedings of the International Conference on Very Large Databases (VLDB), August 2004.
[15] S. Kamvar, T. Haveliwala, and G. Golub. Adaptive methods for the computation of PageRank. In Proceedings of the International Conference on the Numerical Solution of Markov Chains, September 2003.
[18] S. Mizzaro. Measuring the agreement among relevance judges. In Proceedings of MIRA Conference, April 1999.
[19] Nielsen NetRatings. http://www.nielsen-netratings.com/.
[20] NPD search and portal site study. Available at http://www.npd.com/press/releases/press_000919.htm.
[21] S. Olsen. Does search engine's power threaten web's independence? Available at http://news.com.com/2009--1023-963618.html, October 2002.
[22] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford University Database Group, 1998. Available at http://dbpubs.stanford.edu:8090/pub/1999--66.
[23] D. M. Pennock, G. W. Flake, S. Lawrence, E. J. Glover, and C. L. Giles. Winners don't take all: Characterizing the competition for links on the web. Proceedings of the National Academy of Sciences, 99(8):5207--5211, 2002.
[24] S. E. Robertson and K. Sparck-Jones. Relevance weighting of search terms. Journal of the American Society for Information Science, 27(3):129--146, 1975.
[28] TREC: Text retrieval conference. http://trec.nist.gov.
[29] A. C. Tsoi, G. Morini, F. Scarselli, M. Hagenbuchner, and M. Maggini. Adaptive ranking of web pages. In Proceedings of the 12th International Conference on World Wide Web, Budapest, Hungary, May 20-24, 2003. doi: 10.1145/775152.775203.
[31] Y. Wang and D. DeWitt. Computing PageRank in a distributed internet search system. In Proceedings of the International Conference on Very Large Databases (VLDB), August 2004.
CITED BY 12
S. Pandey, S. Roy, C. Olston, J. Cho, and S. Chakrabarti. Shuffling a stacked deck: the case for partially randomized ranking of search engine results. In Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30-September 2, 2005.
Y. Yanbe, A. Jatowt, S. Nakamura, and K. Tanaka. Can social bookmarking enhance search in the web? In Proceedings of the 2007 Conference on Digital Libraries, Vancouver, BC, Canada, June 18-23, 2007.