Classification-based resource selection
Source
Conference on Information and Knowledge Management
Proceedings of the 18th ACM conference on Information and knowledge management
Hong Kong, China
SESSION: IR ranking and retrieval models II
Pages: 1277-1286
Year of Publication: 2009
ISBN: 978-1-60558-512-3
ABSTRACT
In some retrieval situations, a system must search across multiple collections. This task, referred to as federated search, occurs, for example, when searching a distributed index or aggregating content for web search. Resource selection refers to the subtask of deciding, given a query, which collections to search. Most existing resource selection methods rely on evidence found in collection content. We present an approach to resource selection that combines multiple sources of evidence to inform the selection decision. We derive evidence from three different sources: collection documents, the topic of the query, and query click-through data. We combine this evidence by treating resource selection as a multiclass machine learning problem. Although machine-learned approaches often require large amounts of manually generated training data, we present a method for using automatically generated training data. We make use of and compare against prior resource selection work and evaluate across three experimental testbeds.
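The idea of combining several evidence sources through a learned classifier can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name a learner, feature set, or training procedure, so everything below is an assumption made for illustration. The sketch assumes each (query, collection) pair is described by three hypothetical scores in [0, 1] (content-based, topic-based, and click-based), fits a plain logistic regression on pooled toy examples, and ranks candidate collections by predicted confidence. The names train_logistic, select_resources, TRAINING, and CANDIDATES, and all numeric values, are invented.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(examples, epochs=500, lr=0.5):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    examples: list of (features, label) pairs, where features is a list of
    floats and label is 1 (collection should be selected) or 0 (it should not).
    Returns (weights, bias).
    """
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in examples:
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            err = label - p
            for j, x in enumerate(features):
                grad_w[j] += err * x
            grad_b += err
        # Average the gradient over the batch before taking a step.
        weights = [w + lr * g / len(examples) for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / len(examples)
    return weights, bias


def select_resources(candidates, weights, bias, k=1):
    """Return the k collection names with the highest predicted confidence."""
    scores = {
        name: sigmoid(bias + sum(w * x for w, x in zip(weights, feats)))
        for name, feats in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy training data (invented). Feature order, all hypothetical:
# [content-based score, query-topic score, click-through score]
TRAINING = [
    ([0.9, 0.8, 0.7], 1),
    ([0.2, 0.1, 0.0], 0),
    ([0.8, 0.9, 0.6], 1),
    ([0.1, 0.3, 0.2], 0),
    ([0.7, 0.6, 0.9], 1),
    ([0.3, 0.2, 0.1], 0),
]

# Evidence for one unseen query against three hypothetical collections.
CANDIDATES = {
    "news": [0.9, 0.9, 0.8],
    "sports": [0.1, 0.1, 0.1],
    "travel": [0.4, 0.3, 0.2],
}

weights, bias = train_logistic(TRAINING)
selected = select_resources(CANDIDATES, weights, bias, k=1)
print(selected)
```

A pooled binary model is only one way to cast the problem; training a separate classifier per collection is another. The abstract does not say which, if either, the paper adopts.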