Clustering the topics using TF-IDF for model fusion
Source
Conference on Information and Knowledge Management
Proceeding of the 2nd PhD workshop on Information and knowledge management
Napa Valley, California, USA
Poster session
Pages 97-100
Year of Publication: 2008
ISBN: 978-1-60558-257-3
Authors
Muath Alzghool, University of Ottawa, Ottawa, ON, Canada
Diana Inkpen, University of Ottawa, Ottawa, ON, Canada
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1458550.1458569

ABSTRACT

Users express their queries in various ways: sometimes with more general terms, sometimes with more specific ones. Information retrieval systems need to accommodate this variety. Some retrieval models perform better when the queries are general, others when the queries are more specific, and others when a combination of both is available. In this paper we look for a system that performs well in all these cases. We present a new method for combining the results of different models in order to improve performance on a difficult task: information retrieval from spontaneous speech. Our technique is based on clustering the training topics according to their tf-idf (term frequency-inverse document frequency) properties and selecting the best models for each cluster. When the system runs on a test topic, the cluster of the topic is determined first, and the combination of models selected for that cluster is used. We report improvements on the Malach collection used at CLEF-CLSR 2007.
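
The pipeline described in the abstract (cluster training topics by tf-idf, pick the best models per cluster, route a test topic to its cluster) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the single feature (mean tf-idf of a topic's terms), the 1-D k-means, the toy collection, the model names `model_a` and `model_b`, and their scores. The fusion of the selected models' result lists is left out.

```python
import math
from collections import Counter

def idf_table(docs):
    """Inverse document frequency of each term in a tokenized collection."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}

def topic_feature(topic_terms, idf):
    """Mean tf-idf weight of a topic's terms (hypothetical feature choice)."""
    tf = Counter(topic_terms)
    weights = [tf[t] * idf.get(t, 0.0) for t in tf]
    return sum(weights) / len(weights) if weights else 0.0

def nearest(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))

def kmeans_1d(values, k=2, iters=20):
    """Plain 1-D k-means for k >= 2, seeded with evenly spaced sorted values."""
    s = sorted(values)
    centroids = [s[i * (len(s) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centroids)].append(v)
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids

def best_models_per_cluster(features, scores, centroids, top_n=1):
    """Rank models by mean training score inside each cluster.

    scores[model][i] is the score (e.g. average precision) of `model`
    on training topic i.
    """
    best = {}
    for c in range(len(centroids)):
        members = [i for i, f in enumerate(features)
                   if nearest(f, centroids) == c]
        if not members:
            continue
        ranked = sorted(
            scores,
            key=lambda m: sum(scores[m][i] for i in members) / len(members),
            reverse=True)
        best[c] = ranked[:top_n]
    return best

# Toy collection and training topics (made up for this sketch).
docs = [["war", "camp", "survivor"], ["war", "family"],
        ["war", "school"], ["camp", "liberation"]]
train_topics = [["war"], ["war", "camp"],
                ["survivor", "liberation"], ["family", "school"]]
scores = {"model_a": [0.5, 0.4, 0.1, 0.2],   # better on general topics
          "model_b": [0.2, 0.3, 0.6, 0.5]}   # better on specific topics

idf = idf_table(docs)
features = [topic_feature(t, idf) for t in train_topics]
centroids = kmeans_1d(features, k=2)
best = best_models_per_cluster(features, scores, centroids)

# Route a test topic to its cluster and look up the models chosen for it.
test_cluster = nearest(topic_feature(["school"], idf), centroids)
print(best, best[test_cluster])
```

In this toy setting the low-idf (general) topics fall in one cluster and the high-idf (specific) topics in the other, so each cluster ends up with a different preferred model. A fuller version would fuse the ranked lists of the selected models, for example by summing normalized scores.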


