A combined component approach for finding collection-adapted ranking functions based on genetic programming
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Amsterdam, The Netherlands
SESSION: Learning to rank II
Pages: 399-406
Year of Publication: 2007
ISBN: 978-1-59593-597-7
Authors
Humberto Mossri de Almeida  Federal University of Minas Gerais, Belo Horizonte, Brazil
Marcos André Gonçalves  Federal University of Minas Gerais, Belo Horizonte, Brazil
Marco Cristo  FUCAPI - Analysis, Research and Tech. Innovation Center, Manaus, Brazil
Pável Calado  IST/INESC-ID, Lisboa, Portugal
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1277741.1277810

ABSTRACT

In this paper, we propose a new method to discover collection-adapted ranking functions based on Genetic Programming (GP). Our Combined Component Approach (CCA) is based on the combination of several term-weighting components (i.e., term frequency, collection frequency, normalization) extracted from well-known ranking functions. In contrast to related work, the GP terminals in our CCA are not based on simple statistical information of a document collection, but on meaningful, effective, and proven components. Experimental results show that our approach was able to outperform standard TF-IDF, BM25, and another GP-based approach on two different collections. Over the baseline functions, CCA obtained improvements in mean average precision of up to 40.87% for the TREC-8 collection and 24.85% for the WBR99 collection (a large Brazilian Web collection). The CCA evolution process was also able to reduce overtraining, which is commonly found in machine learning methods and especially in genetic programming, and to converge faster than the other GP-based approach used for comparison.
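The idea of using whole term-weighting components as GP terminals can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy collection `DOCS`, the queries in `QUERIES`, the three terminals (`tf_raw`, `tf_bm25`, `idf`), the operator set, and the use of mean average precision as a fitness measure are all hypothetical choices. The sketch only shows how a candidate ranking function can be represented as an expression tree over components and then scored; it does not implement the evolutionary loop (selection, crossover, mutation).

```python
import math
import operator
import random

# Hypothetical toy collection (not from the paper): doc id -> tokens.
DOCS = {
    "d1": "genetic programming ranking function".split(),
    "d2": "ranking ranking web pages".split(),
    "d3": "cooking pasta at home".split(),
}
# Hypothetical queries with their relevant documents.
QUERIES = {("ranking",): {"d2"}, ("genetic", "programming"): {"d1"}}
N = len(DOCS)
AVG_LEN = sum(len(toks) for toks in DOCS.values()) / N

def df(term):
    """Number of documents containing the term."""
    return sum(1 for toks in DOCS.values() if term in toks)

# Terminals: whole components of well-known ranking functions
# (illustrative picks; the paper's actual component set may differ).
def tf_raw(term, doc):
    return DOCS[doc].count(term)

def tf_bm25(term, doc, k1=1.2, b=0.75):
    tf = DOCS[doc].count(term)
    return tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(DOCS[doc]) / AVG_LEN))

def idf(term, doc):
    n = df(term)
    return math.log(N / n) if n else 0.0

def protected_div(a, b):
    """Division that returns 1.0 on a zero denominator (a common GP convention)."""
    return a / b if abs(b) > 1e-9 else 1.0

TERMINALS = {"tf_raw": tf_raw, "tf_bm25": tf_bm25, "idf": idf}
FUNCTIONS = {"+": operator.add, "*": operator.mul, "/": protected_div}

def random_tree(rng, depth):
    """Grow a random expression tree: a terminal name or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(sorted(TERMINALS))
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, term, doc):
    """Compute the weight the tree assigns to one (term, document) pair."""
    if isinstance(tree, str):
        return TERMINALS[tree](term, doc)
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, term, doc), evaluate(right, term, doc))

def score(tree, query, doc):
    """Sum the tree's weights over query terms that occur in the document."""
    return sum(evaluate(tree, t, doc) for t in query if t in DOCS[doc])

def rank(tree, query):
    """Order documents by descending score, breaking ties by doc id."""
    return sorted(DOCS, key=lambda d: (-score(tree, query, d), d))

def average_precision(ranking, relevant):
    hits, total = 0, 0.0
    for position, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(tree, queries):
    """Candidate fitness (an assumption): MAP of the tree over all queries."""
    aps = [average_precision(rank(tree, q), rel) for q, rel in queries.items()]
    return sum(aps) / len(aps)

if __name__ == "__main__":
    tfidf = ("*", "tf_raw", "idf")  # classic TF-IDF written as a tree
    candidate = random_tree(random.Random(0), depth=3)
    for name, tree in (("tfidf", tfidf), ("random", candidate)):
        print(name, tree, mean_average_precision(tree, QUERIES))
```

In this representation, classic TF-IDF is just one point in the search space, the tree `("*", "tf_raw", "idf")`. A GP run would start from a population of such trees and keep those with better fitness on training queries, which is where collection adaptation would come from.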


