| Does WT10g look like the web? |
| Full text |
Pdf
(89 KB)
|
| Source
|
Annual ACM Conference on Research and Development in Information Retrieval
archive
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
table of contents
Tampere, Finland
POSTER SESSION: Poster session
table of contents
Pages: 423 - 424
Year of Publication: 2002
ISBN:1-58113-561-0
|
|
Author
|
|
Ian Soboroff
|
National Institute of Standards and Technology, Gaithersburg, MD
|
|
| Sponsor |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 2, Downloads (12 Months): 43, Citation Count: 10
|
|
|
ABSTRACT
We measure the WT10g test collection, used in the TREC-9 and TREC 2001 Web Tracks, with common measures used in the web topology community, in order to see if WT10g "looks like" the web. This is not an idle question; characteristics of the web, such as power law relationships, diameter, and connected components have all been observed within the scope of general web crawls, constructed by blindly following links. In contrast, WT10g was carved out from a larger crawl specifically to be a web search test collection within the reach of university researchers. Does such a collection retain the properties of the larger web? In the case of WT10g, yes.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
Andrei Broder , Ravi Kumar , Farzin Maghoul , Prabhakar Raghavan , Sridhar Rajagopalan , Raymie Stata , Andrew Tomkins , Janet Wiener, Graph structure in the Web, Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications netowrking, p.309-320, June 2000, Amsterdam, The Netherlands
|
| |
3
|
David M. Pennock, Gary W. Flake, Steve Lawrence, Eric J. Glover, and C. Lee Giles. Winners don't take all: Characterizing the competition for links on the web. Proceedings of the National Academy of Sciences, 99(8):5207--5211, April 2002.
|
 |
4
|
|
|