ABSTRACT
The Open Directory Project (ODP) is one of the largest collaborative efforts to manually annotate web pages. It involves over 65,000 editors and has produced metadata specifying topic and importance for more than 4 million web pages. Still, given that this number is only about 0.05 percent of the web pages indexed by Google, is this effort enough to make a difference? In this paper we discuss how these metadata can be exploited to achieve high-quality personalized web search. First, we introduce an additional criterion for web page ranking, namely the distance between a user profile defined using ODP topics and the set of ODP topics covered by each URL returned by regular web search. We empirically show that this enhancement yields better results than current web search using Google. Then, in the second part of the paper, we investigate the boundaries of biasing PageRank on subtopics of the ODP in order to automatically extend these metadata to the whole web.
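The ranking criterion described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the distance measure, so this example assumes a simple one: the number of edges between two topics in the ODP category tree, with the distance between a profile and a URL taken as the minimum over all topic pairs. The function names (`tree_distance`, `profile_distance`, `rerank`) and the slash-separated topic paths are hypothetical and chosen only for illustration.

```python
def topic_path(topic: str) -> list[str]:
    """Split a slash-separated topic such as '/Arts/Music' into its components."""
    return [part for part in topic.strip("/").split("/") if part]


def tree_distance(a: str, b: str) -> int:
    """Number of edges between two topics in the category tree (assumed measure)."""
    pa, pb = topic_path(a), topic_path(b)
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    # Steps up from a to the deepest shared ancestor, then down to b.
    return (len(pa) - common) + (len(pb) - common)


def profile_distance(profile: list[str], url_topics: list[str]) -> float:
    """Minimum tree distance between any profile topic and any topic of the URL."""
    if not profile or not url_topics:
        return float("inf")  # URLs without topic metadata sort last
    return min(tree_distance(p, t) for p in profile for t in url_topics)


def rerank(results: list[tuple[str, list[str]]], profile: list[str]) -> list[str]:
    """Reorder (url, topics) results by distance to the profile.

    sorted() is stable, so results at equal distance keep the
    original search engine order.
    """
    ordered = sorted(results, key=lambda r: profile_distance(profile, r[1]))
    return [url for url, _topics in ordered]
```

Under these assumptions, a user whose profile contains `/Arts/Music` would see a result annotated with `/Arts/Music/Jazz` (distance 1) moved ahead of one annotated with `/Science/Physics` (distance 4), while results without any topic annotation fall to the end of the list.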