Computing block importance for searching on web sites
Source
Conference on Information and Knowledge Management
Proceedings of the sixteenth ACM conference on information and knowledge management
Lisbon, Portugal
SESSION: Web retrieval I (IR)
Pages 165-174  
Year of Publication: 2007
ISBN: 978-1-59593-803-9
Authors
David Fernandes  Federal University of Minas Gerais, Belo Horizonte, Brazil
Edleno S. de Moura  Federal University of Amazonas, Manaus, Brazil
Berthier Ribeiro-Neto  Federal University of Minas Gerais, Belo Horizonte, Brazil
Altigran S. da Silva  Federal University of Amazonas, Manaus, Brazil
Marcos André Gonçalves  Federal University of Minas Gerais, Belo Horizonte, Brazil
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1321440.1321466

ABSTRACT

In this paper we consider the problem of using the block structure of a Web page to improve ranking results when searching for information on Web sites. Given the block structure of the Web pages as input, we propose a method for computing the importance of each block (in the form of block weights) in a Web collection. As we show through experiments, the deployment of our method may allow a significant improvement in the quality of search results. We ran experiments to compare the quality of search results when using our method to the quality obtained when using no structure information. When compared to a ranking method that considered pages as monolithic units, our block-based ranking method led to improvements in the quality of search results in experiments with two sites with heterogeneous structures. Further, our method does not increase the cost of processing queries when compared to the systems using no structural information.
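
The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of block-weighted ranking, not the authors' method. It assumes each page is already segmented into labeled blocks, that a weight per block label is available, and that a page's score is the sum of weighted term counts for the query terms. The block labels, the weight values, and the function names `weighted_tf` and `score` are all hypothetical.

```python
from collections import Counter

# Hypothetical block weights. The paper computes block weights from the
# collection; these labels and values are invented for illustration only.
BLOCK_WEIGHTS = {"title": 3.0, "body": 1.0, "menu": 0.1}


def weighted_tf(page, weights):
    """Accumulate term counts over all blocks of a page.

    `page` is a list of (block_label, text) pairs. Each occurrence of a
    term contributes the weight of the block it appears in; labels that
    are missing from `weights` default to 1.0.
    """
    tf = Counter()
    for label, text in page:
        w = weights.get(label, 1.0)
        for term in text.lower().split():
            tf[term] += w
    return tf


def score(page, query, weights):
    """Score a page as the sum of weighted counts of the query terms."""
    tf = weighted_tf(page, weights)
    return sum(tf[term] for term in query.lower().split())


# Two toy pages: one mentions the query term only in a navigation menu,
# the other mentions it in the title and the body.
page_a = [("menu", "cameras lenses tripods"), ("body", "our store opening hours")]
page_b = [("title", "cameras"), ("body", "compare digital cameras by price")]

monolithic = {}  # empty mapping: every block gets weight 1.0

for name, page in (("page_a", page_a), ("page_b", page_b)):
    print(name,
          "monolithic:", score(page, "cameras", monolithic),
          "block-based:", score(page, "cameras", BLOCK_WEIGHTS))
```

In this sketch the weights could be folded into the stored term counts at indexing time, so scoring a query would do the same work as in the monolithic case. That is one plausible reading of the abstract's claim that query processing cost does not increase, stated here as an assumption.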



