Sources of evidence for vertical selection
Source
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval (Annual ACM Conference on Research and Development in Information Retrieval)
Boston, MA, USA
SESSION: Vertical search
Pages 315-322
Year of Publication: 2009
ISBN: 978-1-60558-483-6
Authors
Jaime Arguello, Carnegie Mellon University, Pittsburgh, PA, USA
Fernando Diaz, Yahoo! Labs Montreal, Montreal, PQ, Canada
Jamie Callan, Carnegie Mellon University, Pittsburgh, PA, USA
Jean-Francois Crespo, Yahoo! Labs Montreal, Montreal, PQ, Canada
ABSTRACT
Web search providers often include search services for domain-specific subcollections, called verticals, such as news, images, videos, job postings, company summaries, and artist profiles. We address the problem of vertical selection: predicting which verticals, if any, are relevant to a query issued to the search engine's main web search page. In contrast to prior query classification and resource selection tasks, vertical selection is associated with unique resources that can inform the classification decision. We focus on three sources of evidence: (1) the query string, from which features are derived independently of external resources, (2) logs of queries previously issued directly to the vertical, and (3) corpora representative of vertical content. Our study covers 18 verticals, which differ in semantics, media type, size, and level of query traffic. We compare our method to prior work in federated search and retrieval effectiveness prediction. An in-depth error analysis reveals challenges specific to individual verticals and provides insight into vertical selection for future work.