Dynamic index pruning for effective caching
Source
Conference on Information and Knowledge Management
Proceedings of the sixteenth ACM conference on information and knowledge management
Lisbon, Portugal
Poster session
Pages 987-990
Year of Publication: 2007
ISBN: 978-1-59593-803-9
Authors
Yohannes Tsegay  RMIT University, Melbourne, Australia
Andrew Turpin  RMIT University, Melbourne, Australia
Justin Zobel  RMIT University, Melbourne, Australia
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1321440.1321592

ABSTRACT

[...] RAM and dynamic pruning schemes to reduce query evaluation times. While only a small portion of lists are processed with dynamic pruning, current systems still store the entire inverted list in cache. In this paper we investigate caching only the pieces of the inverted lists that are actually used to answer a query during dynamic pruning. We examine an LRU cache model, and two recently proposed models. We also introduce a new dynamic pruning scheme for impact-ordered inverted lists.

Using two large web collections and corresponding query logs we show that, using an LRU cache, our new pruning scheme reduces the number of disk accesses during query processing time by 7%-15% over the state-of-the-art impact-ordered baseline, without reducing answer quality. Surprisingly, however, we discover that using our new pruning scheme makes little difference to disk traffic when the more sophisticated caching schemes are employed.
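
The idea of caching only the processed pieces of an inverted list under an LRU policy can be illustrated with a small sketch. This is not the authors' implementation: the class name PrefixLRUCache, the block granularity, and the rule that a list is always read from its start (as with an impact-ordered list cut short by dynamic pruning) are all assumptions made for illustration. Each cache entry records how many leading blocks of a term's list are held in RAM, and a disk read is counted only for blocks beyond the cached prefix.

```python
from collections import OrderedDict


class PrefixLRUCache:
    """Hypothetical LRU cache that stores only the leading blocks of each
    inverted list that pruned query evaluation actually touched."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks   # total blocks the cache may hold
        self.used = 0                     # blocks currently held
        self.entries = OrderedDict()      # term -> cached prefix length (blocks)
        self.disk_reads = 0               # blocks fetched from disk so far

    def access(self, term, blocks_needed):
        """Record that a query processed the first `blocks_needed` blocks
        of the inverted list for `term` before pruning stopped."""
        cached = self.entries.pop(term, 0)
        self.used -= cached
        # Only blocks beyond the cached prefix have to come from disk.
        if blocks_needed > cached:
            self.disk_reads += blocks_needed - cached
        # Keep the longest prefix seen so far, capped at the cache size.
        new_len = min(max(cached, blocks_needed), self.capacity)
        # Evict least recently used lists until the prefix fits.
        while self.entries and self.used + new_len > self.capacity:
            _, evicted = self.entries.popitem(last=False)
            self.used -= evicted
        # Re-inserting places the term at the most recently used end.
        self.entries[term] = new_len
        self.used += new_len
```

In this sketch a whole-list cache corresponds to always passing the full list length as blocks_needed, so the two policies can be compared by replaying the same query log through two instances and comparing their disk_reads counters.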



