Learning effective ranking functions for newsgroup search
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Sheffield, United Kingdom
SESSION: Machine learning for IR
Pages: 394-401
Year of Publication: 2004
ISBN: 1-58113-881-4
Authors
Wensi Xi, Virginia Polytechnic Institute and State University, Blacksburg, VA
Jesper Lind, Microsoft Research, Redmond, WA
Eric Brill, Microsoft Research, Redmond, WA
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1008992.1009060

ABSTRACT

Web communities are virtual broadcasting spaces on the Web where people can freely discuss anything. While such communities function as discussion boards, they have even greater value as large repositories of archived information. To unlock the value of this resource, we need an effective means of searching archived discussion threads. Unfortunately, the techniques that have proven successful for searching document collections and the Web are not ideally suited to the task of searching archived community discussions. In this paper, we explore the problem of creating an effective ranking function that predicts the messages most relevant to a query in community search. We extract a set of predictive features from the thread trees of newsgroup messages, along with features of message authors and of the lexical distribution within a message thread. Our final results indicate that, using linear regression with this feature set, our search system achieved a 28.5% performance improvement over our baseline system.
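The general idea of ranking messages with a linear regression over per-message features can be illustrated with a minimal sketch. This record does not give the paper's actual features, training data, or evaluation measure, so everything concrete below is an assumption made for illustration: the three feature columns (a text-match score, the message's depth in its thread, and the author's post count), the toy relevance labels, and the function names `fit_linear_regression`, `score`, and `rank` are all hypothetical. The fit uses plain ordinary least squares, which may differ from the authors' setup.

```python
from typing import Dict, List, Sequence


def fit_linear_regression(X: Sequence[Sequence[float]],
                          y: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    A constant column of ones is appended to X, so the last returned
    weight is the intercept.
    """
    rows = [list(r) + [1.0] for r in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Predicted relevance: weighted feature sum plus the intercept."""
    return sum(w * f for w, f in zip(weights, features)) + weights[-1]


def rank(weights: Sequence[float],
         candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Return message ids ordered from highest to lowest predicted relevance."""
    return sorted(candidates,
                  key=lambda mid: score(weights, candidates[mid]),
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical training rows: [text_match, depth_in_thread, author_posts].
    train_X = [
        [0.9, 0.0, 5.0],
        [0.2, 3.0, 1.0],
        [0.5, 1.0, 8.0],
        [0.1, 4.0, 2.0],
        [0.7, 2.0, 3.0],
    ]
    # Made-up relevance judgments for those rows (not data from the paper).
    train_y = [1.0, 0.0, 1.0, 0.0, 1.0]
    weights = fit_linear_regression(train_X, train_y)

    # Hypothetical candidate messages retrieved for one query.
    candidates = {
        "msg-a": [0.8, 1.0, 4.0],
        "msg-b": [0.3, 3.0, 2.0],
    }
    print(rank(weights, candidates))
```

The learned weights act as the ranking function: each candidate message is scored by a weighted sum of its features, and the results are sorted by that score. A real system would normalize features and fit on many judged query-message pairs rather than a handful of rows.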


