The SPIRIT collection: an overview of a large web collection
Source: ACM SIGIR Forum
Volume 38, Issue 2 (December 2004)
Pages: 57-61
Year of Publication: 2004
ISSN: 0163-5840
Authors
Hideo Joho  University of Sheffield
Mark Sanderson  University of Sheffield
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1041394.1041395

ABSTRACT

A large-scale collection of web pages has been essential for research in information retrieval and related areas. This paper provides an overview of a large web collection used in the SPIRIT project for the design and testing of spatially-aware retrieval systems. Several statistics are derived and presented to show the characteristics of the collection.


