|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
ABSTRACT
Topical classification of web queries has drawn recent interest because of the promise it offers in improving retrieval effectiveness and efficiency. However, much of this promise depends on whether classification is performed before or after the query is used to retrieve documents. We examine two previously unaddressed issues in query classification: pre versus post-retrieval classification effectiveness and the effect of training explicitly from classified queries versus bridging a classifier trained using a document taxonomy. Bridging classifiers map the categories of a document taxonomy onto those of a query classification problem to provide sufficient training data. We find that training classifiers explicitly from manually classified queries outperforms the bridged classifier by 48% in F1 score. Also, a pre-retrieval classifier using only the query terms performs merely 11% worse than the bridged classifier which requires snippets from retrieved documents. REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references. INDEX TERMS
Primary Classification:
General Terms:
Collaborative Colleagues:
|
|||||||||||||||||||||||