Incorporating quality metrics in centralized/distributed information retrieval on the World Wide Web
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Athens, Greece
Pages: 288-295
Year of Publication: 2000
ISBN: 1-58113-226-3
Authors
Xiaolan Zhu  Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS
Susan Gauch  Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS
Sponsors
Athens University of Economics and Business
Greek Computer Society
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/345508.345602

ABSTRACT

Most information retrieval systems on the Internet rely primarily on similarity ranking algorithms based solely on term frequency statistics. Information quality is usually ignored, so documents are retrieved without regard to their quality. We present an approach that combines similarity-based ranking with quality ranking in centralized and distributed search environments. Six quality metrics were investigated: currency, availability, information-to-noise ratio, authority, popularity, and cohesiveness. Search effectiveness improved significantly when the currency, availability, information-to-noise ratio, and page cohesiveness metrics were incorporated in centralized search. The improvement seen when the availability, information-to-noise ratio, popularity, and cohesiveness metrics were incorporated in site selection was also significant. Finally, incorporating the popularity metric in information fusion resulted in a significant improvement. In summary, the results show that incorporating quality metrics can generally improve search effectiveness in both centralized and distributed search environments.
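
The idea of combining a similarity score with quality metrics can be illustrated with a minimal sketch. The abstract does not state how the two rankings are combined, so everything below is an assumption made for illustration: the `Document` class, the `combined_score` and `rank` functions, the choice to scale similarity by a weighted average of quality scores, and the convention that all scores lie in [0, 1] are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical record: a similarity score plus per-metric quality scores."""
    doc_id: str
    similarity: float  # assumed term-based similarity score in [0, 1]
    quality: dict = field(default_factory=dict)  # metric name -> score in [0, 1]


def combined_score(doc, weights):
    """Scale similarity by a weighted average of the selected quality metrics.

    This combination rule is an assumption for illustration only.
    Metrics missing from doc.quality count as 0.0.
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        # No quality metrics selected: fall back to plain similarity ranking.
        return doc.similarity
    quality = sum(
        weight * doc.quality.get(name, 0.0) for name, weight in weights.items()
    ) / total_weight
    return doc.similarity * quality


def rank(docs, weights):
    """Return documents sorted by combined score, best first."""
    return sorted(docs, key=lambda d: combined_score(d, weights), reverse=True)


docs = [
    Document("a", similarity=0.9, quality={"currency": 0.2, "cohesiveness": 0.3}),
    Document("b", similarity=0.7, quality={"currency": 0.9, "cohesiveness": 0.8}),
]
weights = {"currency": 1.0, "cohesiveness": 1.0}
print([d.doc_id for d in rank(docs, weights)])
```

With an empty `weights` dictionary the sketch reduces to similarity-only ranking, which makes it easy to compare the two orderings on the same documents.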


