Effective web crawling
Source: ACM SIGIR Forum
Volume 39, Issue 1 (June 2005)
Column: Dissertation abstracts
Pages: 55-56
Year of Publication: 2005
ISSN: 0163-5840
Author: Carlos Castillo, University of Chile
Publisher: ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1067268.1067287

ABSTRACT

The key factors in the success of the World Wide Web are its large size and the lack of centralized control over its contents. These same two properties are also the most important source of problems in locating information. The Web is a context in which traditional Information Retrieval methods are challenged, and given the volume of the Web and its speed of change, the coverage of modern search engines is relatively small. Moreover, the distribution of quality is very skewed, and interesting pages are scarce in comparison with the rest of the content.

