Structure and content analysis for HTML medical articles: a hidden Markov model approach
Source
Document Engineering archive
Proceedings of the 2007 ACM Symposium on Document Engineering
Winnipeg, Manitoba, Canada
SESSION: Classification and machine learning
Pages: 199-201
Year of Publication: 2007
ISBN: 978-1-59593-776-6
Authors
Jie Zou, Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, MD
Daniel Le, Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, MD
George R. Thoma, Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, MD
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1284420.1284468

ABSTRACT

We describe ongoing research on segmenting and labeling HTML medical journal articles. In contrast to existing approaches in which HTML tags usually serve as strong indicators, we seek to minimize dependence on HTML tags. Designing logical component models for general Web pages is a challenging task. However, in the narrow domain of online journal articles, we show that the HTML document, modeled with a Hidden Markov Model, can be accurately segmented into logical zones.
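As a rough illustration of the idea of segmenting a document with a Hidden Markov Model, the sketch below labels a sequence of text blocks with logical zones using Viterbi decoding, one common way to find the most likely state sequence. It is not the authors' model: the zone names (`title`, `abstract`, `body`), the block-length observations (`short`, `medium`, `long`), and every probability are hypothetical values chosen only to make the example runnable. Note that the observations here are derived from text length rather than from HTML tags, in the spirit of the abstract.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for the observations."""
    if not observations:
        return []

    def log(p):
        # Work in log space; treat impossible events as -infinity.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path ending in state s at step t
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = []  # back[t][s]: best predecessor of state s at step t + 1

    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[-1][r] + log(trans_p[r][s]))
            scores[s] = best[-1][prev] + log(trans_p[prev][s]) + log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: zones are visited left to right.
STATES = ["title", "abstract", "body"]
START = {"title": 0.9, "abstract": 0.05, "body": 0.05}
TRANS = {
    "title": {"title": 0.1, "abstract": 0.8, "body": 0.1},
    "abstract": {"title": 0.0, "abstract": 0.3, "body": 0.7},
    "body": {"title": 0.0, "abstract": 0.0, "body": 1.0},
}
EMIT = {
    "title": {"short": 0.8, "medium": 0.15, "long": 0.05},
    "abstract": {"short": 0.1, "medium": 0.7, "long": 0.2},
    "body": {"short": 0.1, "medium": 0.3, "long": 0.6},
}

# One observation per text block, in document order.
blocks = ["short", "medium", "long", "long"]
labels = viterbi(blocks, STATES, START, TRANS, EMIT)
print(labels)
```

Consecutive blocks that receive the same label can then be merged into one logical zone. A real system would need richer observations and parameters estimated from labeled articles.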


