Retrieval and feedback models for blog feed search
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Singapore, Singapore
SESSION: Web-search--2
Pages 347-354  
Year of Publication: 2008
ISBN: 978-1-60558-164-4
Authors
Jonathan L. Elsas  Carnegie Mellon University, Pittsburgh, PA, USA
Jaime Arguello  Carnegie Mellon University, Pittsburgh, PA, USA
Jamie Callan  Carnegie Mellon University, Pittsburgh, PA, USA
Jaime G. Carbonell  Carnegie Mellon University, Pittsburgh, PA, USA
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1390334.1390394

ABSTRACT

Blog feed search poses challenges that differ from those of traditional ad hoc document retrieval: the units of retrieval, the blogs, are collections of documents, the blog posts. In this work we adapt a state-of-the-art federated search model to the feed retrieval task and show a significant improvement over algorithms based on the best-performing submissions in the TREC 2007 Blog Distillation task [12]. We also show that typical query expansion techniques, such as pseudo-relevance feedback using the blog corpus, do not provide any significant performance improvement and in many cases dramatically hurt performance. We perform an in-depth analysis of the behavior of pseudo-relevance feedback for this task and develop a novel query expansion technique that uses the link structure in Wikipedia. This technique provides significant and consistent performance improvements, yielding improvements in MAP over the unexpanded query of 22% for our baseline algorithm and 14% for our federated algorithm.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
[1] J. Arguello, J. L. Elsas, J. Callan, and J. G. Carbonell. Document representation and query expansion models for blog recommendation. In Proc. of the 2nd Intl. Conf. on Weblogs and Social Media (ICWSM), 2008.
[2]
[3] J. Callan. Distributed information retrieval. In W. Croft, editor, Advances in Information Retrieval, pages 127--150. Kluwer Academic Publishers, 2000.
[4] C. Clarke, N. Craswell, and I. Soboroff. Overview of the TREC 2004 terabyte track. In Proc. of the 2004 Text Retrieval Conf., 2004.
[5] C. Clarke, F. Scholer, and I. Soboroff. Overview of the TREC 2005 terabyte track. In Proc. of the 2005 Text Retrieval Conf., 2005.
[6]
[7]
[8] D. Hannah, C. Macdonald, J. Peng, B. He, and I. Ounis. University of Glasgow at TREC 2007: Experiments with blog and enterprise tracks with Terrier. In Proc. of the 2007 Text Retrieval Conf., 2007.
[9] P. Kolari, A. Java, and T. Finin. Characterizing the splogosphere. In Proc. of the 3rd Annl. Workshop on Weblogging Ecosystem: Aggregation, Analysis and Dynamics, 15th World Wide Web Conf., 2006.
[10]
[11] C. Macdonald and I. Ounis. The TREC blog06 collection: Creating and analysing a blog test collection. Technical Report TR-2006-224, Department of Computing Science, U. of Glasgow, 2006.
[12] C. Macdonald, I. Ounis, and I. Soboroff. Overview of the TREC 2007 blog track. In Proc. of the 2007 Text Retrieval Conf., 2007.
[13]
[14] D. Metzler, T. Strohman, H. Turtle, and W. Croft. Indri at TREC 2004: Terabyte track. In Proc. of the 2004 Text Retrieval Conf., 2004.
[15] D. Metzler, T. Strohman, Y. Zhou, and W. Croft. Indri at TREC 2005: Terabyte track. In Proc. of the 2005 Text Retrieval Conf., 2005.
[16] J. Seo and W. B. Croft. UMass at TREC 2007 blog distillation task. In Proc. of the 2007 Text Retrieval Conf., 2007.
[17]
[18] I. Soboroff, A. de Vries, and N. Craswell. Overview of the TREC 2006 enterprise track. In Proc. of the 2006 Text Retrieval Conf., 2006.
[19]
