AllInOneNews: development and evaluation of a large-scale news metasearch engine
Source
International Conference on Management of Data
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Beijing, China
SESSION: Data processing in the large
Pages: 1017 - 1028
Year of Publication: 2007
ISBN:978-1-59593-686-8
Authors
King-Lup Liu (Webscalers, LLC, Lafayette, LA)
Weiyi Meng (Webscalers, LLC, Lafayette, LA)
Jing Qiu (Webscalers, LLC, Lafayette, LA)
Clement Yu (Webscalers, LLC, Lafayette, LA)
Vijay Raghavan (Webscalers, LLC, Lafayette, LA)
Zonghuan Wu (Webscalers, LLC, Lafayette, LA)
Yiyao Lu (Webscalers, LLC, Lafayette, LA)
Hai He (Webscalers, LLC, Lafayette, LA)
Hongkun Zhao (Webscalers, LLC, Lafayette, LA)
ABSTRACT
AllInOneNews is the largest news metasearch engine in the world, connecting to over 1,000 news sites across 150 countries. Implementing a large-scale metasearch engine like AllInOneNews means overcoming challenges that do not arise when building small metasearch engines, such as developing highly scalable search engine selection techniques. In this paper, we discuss these challenges and our solutions to them. We also discuss some novel features of AllInOneNews, such as its highly automated solution and semantic query matching. The paper also reports the results of a comparative evaluation of three commercial news search systems: one search engine (Google News) and two metasearch engines (Mamma News and AllInOneNews). The comparison uses several measures, including effectiveness, diversity, and time-sensitivity. A further contribution is a novel scheme for comparing multiple news search systems with a single combined measure that accounts for both the relevance and the time-sensitivity of the retrieved information.