Bootstrapping for example-based data extraction
Source: Conference on Information and Knowledge Management
Proceedings of the tenth international conference on Information and knowledge management
Atlanta, Georgia, USA
Session: String Match and Text Extraction
Pages: 371 - 378  
Year of Publication: 2001
ISBN:1-58113-436-3
Authors
Paulo B. Golgher  Federal University of Minas Gerais, Belo Horizonte MG Brazil
Altigran S. da Silva  Federal University of Minas Gerais, Belo Horizonte MG Brazil
Alberto H. F. Laender  Federal University of Minas Gerais, Belo Horizonte MG Brazil
Berthier Ribeiro-Neto  Federal University of Minas Gerais, Belo Horizonte MG Brazil
Sponsors
SIGMIS: ACM Special Interest Group on Management Information Systems
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/502585.502648

ABSTRACT

The effortless generation of wrappers for Web data sources is crucial if the huge amount of semi-structured data on the Web is to be properly accessed. In particular, developing strategies for wrapper generation based on user-given examples is currently one of the most promising research directions in Web data extraction. In this paper we show how to use a pre-existing data repository to automatically generate examples and thus allow fully automated example-based data extraction. To demonstrate the feasibility of our approach, we report results from experiments we carried out and discuss how our ideas can be used to improve extraction rates and to provide resilience and adaptiveness for wrappers generated from examples.


