A combined component approach for finding collection-adapted ranking functions based on genetic programming
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Amsterdam, The Netherlands
SESSION: Learning to rank II
Pages: 399 - 406
Year of Publication: 2007
ISBN: 978-1-59593-597-7
ABSTRACT
In this paper, we propose a new method to discover collection-adapted ranking functions based on Genetic Programming (GP). Our Combined Component Approach (CCA) is based on the combination of several term-weighting components (i.e., term frequency, collection frequency, normalization) extracted from well-known ranking functions. In contrast to related work, the GP terminals in our CCA are not based on simple statistical information of a document collection, but on meaningful, effective, and proven components. Experimental results show that our approach was able to outperform standard TF-IDF, BM25, and another GP-based approach in two different collections. CCA obtained improvements in mean average precision of up to 40.87% for the TREC-8 collection, and 24.85% for the WBR99 collection (a large Brazilian Web collection), over the baseline functions. The CCA evolution process was also able to reduce the overtraining commonly found in machine learning methods, especially genetic programming, and to converge faster than the other GP-based approach used for comparison.
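The idea of building a ranking function from term-weighting components, and scoring it by average precision, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the three terminals (`tf`, `idf`, `norm`), the operator set, the tuple encoding of the expression tree, and every function name are hypothetical choices made for this example. The abstract does not list the actual terminals or operators used by CCA.

```python
import math
import operator

# Hypothetical terminals: each maps per-term statistics to a component value.
# These are common textbook forms, not the components used in the paper.
def tf_component(s):
    return 1.0 + math.log(s["tf"])  # assumes tf >= 1

def idf_component(s):
    return math.log(s["num_docs"] / s["doc_freq"])

def norm_component(s):
    return s["doc_len"] / s["avg_doc_len"]

TERMINALS = {"tf": tf_component, "idf": idf_component, "norm": norm_component}

def protected_div(a, b):
    # GP systems commonly guard division so every evolved tree stays evaluable.
    return a / b if abs(b) > 1e-9 else 1.0

FUNCTIONS = {"add": operator.add, "mul": operator.mul, "div": protected_div}

def evaluate(tree, stats):
    """Evaluate an expression tree: a terminal name or an (op, left, right) tuple."""
    if isinstance(tree, str):
        return TERMINALS[tree](stats)
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, stats), evaluate(right, stats))

def average_precision(ranked_doc_ids, relevant):
    """Average precision of one ranked list; its mean over queries gives MAP."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

# One candidate individual: (tf * idf) / norm
candidate = ("div", ("mul", "tf", "idf"), "norm")
stats = {"tf": 1, "num_docs": 100, "doc_freq": 10, "doc_len": 50, "avg_doc_len": 100}
weight = evaluate(candidate, stats)
```

In a GP run, many such trees would be generated, recombined, and mutated, with a measure such as mean average precision on training queries serving as the fitness; that evolutionary loop is omitted here.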
CITED BY 9
Tao Qin, Tie-Yan Liu, Xu-Dong Zhang, De-Sheng Wang, Wen-Ying Xiong, Hang Li, Learning to rank relational objects and its application to web search, Proceeding of the 17th international conference on World Wide Web, April 21-25, 2008, Beijing, China
Jun Xu, Tie-Yan Liu, Min Lu, Hang Li, Wei-Ying Ma, Directly optimizing evaluation measures in learning to rank, Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, July 20-24, 2008, Singapore, Singapore
Fabiano Atalla, Daniel Miranda, Jussara Almeida, Marcos André Gonçalves, Virgilio Almeida, Analyzing the impact of churn and malicious behavior on the quality of peer-to-peer web search, Proceedings of the 2008 ACM symposium on Applied computing, March 16-20, 2008, Fortaleza, Ceara, Brazil
Cristiano D. Ferreira, Ricardo da S. Torres, Marcos André Gonçalves, Weiguo Fan, Image retrieval with relevance feedback based on genetic programming, Proceedings of the 23rd Brazilian symposium on Databases, October 13-17, 2008, Campinas, Sao Paulo, Brazil