An information extraction engine for web discussion forums
Source: International World Wide Web Conference archive
Special interest tracks and posters of the 14th international conference on World Wide Web
Chiba, Japan
POSTER SESSION: Posters
Pages: 978-979
Year of Publication: 2005
ISBN: 1-59593-051-5
Authors
Hanny Yulius Limanto  Nanyang Technological University, Nanyang Avenue, Singapore
Nguyen Ngoc Giang  Nanyang Technological University, Nanyang Avenue, Singapore
Vo Tan Trung  Nanyang Technological University, Nanyang Avenue, Singapore
Jun Zhang  Nanyang Technological University, Nanyang Avenue, Singapore
Qi He  Nanyang Technological University, Nanyang Avenue, Singapore
Nguyen Quang Huy  Nanyang Technological University, Nanyang Avenue, Singapore
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1062745.1062827

ABSTRACT

In this poster, we present an information extraction engine for web-based forums. The engine analyzes HTML files crawled from web forums, deduces the wrapper (template) of the pages, and extracts information about posts, such as author, title, content, and number of replies and views. Extraction is an important module for a forum search engine because it helps the engine understand the content of a forum HTML page and facilitates ranking during retrieval. We discuss the system architecture of the extraction engine in the context of a forum search engine and present its components. We also briefly introduce the extraction process and discuss some implementation issues.
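
To make the idea of applying a wrapper concrete, the following is a minimal sketch and not the authors' implementation. It assumes the wrapper has already been deduced and can be expressed as a mapping from post fields to HTML class names. The names `WRAPPER`, `POST_CLASS`, `PostExtractor`, and `extract_posts`, as well as the sample markup, are hypothetical and chosen only for illustration; the abstract does not describe how the real engine represents or deduces wrappers.

```python
from html.parser import HTMLParser

# Hypothetical wrapper: maps each post field to the class attribute that
# marks it in one particular forum template. A real engine would deduce
# this mapping from crawled pages; here it is written by hand.
WRAPPER = {"author": "post-author", "title": "post-title", "content": "post-body"}
POST_CLASS = "post"  # assumed class of the element that wraps a single post

# Elements without a closing tag; skipped so the open-element stack stays balanced.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PostExtractor(HTMLParser):
    """Collects the text of wrapper-marked elements, grouped by post."""

    def __init__(self, wrapper, post_class):
        super().__init__()
        self.class_to_field = {cls: field for field, cls in wrapper.items()}
        self.post_class = post_class
        self.posts = []
        self.stack = []  # one entry per open element: its field name or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.post_class in classes:
            self.posts.append({})  # a new post starts here
        field = next(
            (self.class_to_field[c] for c in classes if c in self.class_to_field),
            None,
        )
        self.stack.append(field)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # The innermost enclosing element that the wrapper maps to a field.
        field = next((f for f in reversed(self.stack) if f), None)
        if text and field and self.posts:
            post = self.posts[-1]
            post[field] = (post.get(field, "") + " " + text).strip()


def extract_posts(html, wrapper=WRAPPER, post_class=POST_CLASS):
    parser = PostExtractor(wrapper, post_class)
    parser.feed(html)
    parser.close()
    return parser.posts


# Made-up forum page fragment that follows the assumed template.
SAMPLE = """
<div class="post">
  <span class="post-author">alice</span>
  <h2 class="post-title">Crawler question</h2>
  <div class="post-body">How do I parse<br>forum pages?</div>
</div>
<div class="post">
  <span class="post-author">bob</span>
  <h2 class="post-title">Re: Crawler question</h2>
  <div class="post-body">Try a template-based wrapper.</div>
</div>
"""

if __name__ == "__main__":
    for post in extract_posts(SAMPLE):
        print(post)
```

Keeping the wrapper as plain data separates template deduction from extraction: supporting another forum layout would, under these assumptions, only require a different mapping rather than new parsing code.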