Task-oriented world wide web retrieval by document type classification
Source: Conference on Information and Knowledge Management
Proceedings of the eighth international conference on Information and knowledge management
Kansas City, Missouri, United States
Pages: 109 - 113  
Year of Publication: 1999
ISBN:1-58113-146-1
Authors
Katsushi Matsuda  Human Media Res. Labs., NEC 8916-47, Takayama-cho, Ikoma, Nara, 630-0101 Japan
Toshikazu Fukushima  Human Media Res. Labs., NEC 8916-47, Takayama-cho, Ikoma, Nara, 630-0101 Japan
Sponsors
SIGART: ACM Special Interest Group on Artificial Intelligence
SIGIR: ACM Special Interest Group on Information Retrieval
SIGMIS: ACM Special Interest Group on Management Information Systems
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 35,   Citation Count: 3
DOI: http://doi.acm.org/10.1145/319950.319964

ABSTRACT

This paper proposes a novel approach to accurately searching Web pages for relevant information in problem solving by specifying a Web document category instead of the user's task. Accessing information from World Wide Web pages as an approach to problem solving has become commonplace. However, such a search is difficult with current search services, since these services only provide keyword-based search methods that are equivalent to narrowing down the target references according to domains. Problem solving, however, usually involves both a domain and a task. Accordingly, our approach is based on problem solving tasks. To specify a user's problem solving task, we introduce the concept of document types that directly relate to the problem solving tasks; with this approach, users can easily designate problem solving tasks. We implemented the PageTypeSearch system based on our approach. The classifier of PageTypeSearch classifies Web pages into the document types by comparing the pages with typical structural characteristics of the types. In experiments, we compare PageTypeSearch, which uses document type indices, with a conventional keyword-based search system. The average precision of the document type-based search is 88.9%, while the average precision of the keyword-based search is 31.2%. Moreover, the number of irrelevant references gathered by our system is about one-thirteenth that of traditional keyword-based search systems. Our approach has practical advantages for problem solving by introducing the viewpoint of tasks to achieve higher performance.
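
The abstract does not describe how PageTypeSearch is implemented, so the following is only a minimal sketch of the general idea: classify each page into a document type by matching structural cues, then narrow keyword hits to the requested type. The type names, the regular-expression cues, the `min_score` threshold, and all function names are hypothetical and are not taken from the paper.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical document types and structural cues (assumptions for
# illustration; the abstract does not list the real types or features).
TYPE_CUES: Dict[str, List[str]] = {
    "catalog": [r"<table", r"price", r"\$\s?\d"],
    "faq": [r"\bQ[:.]", r"\bA[:.]", r"frequently asked"],
    "link_list": [r"<a\s", r"<li"],
}


def classify(html: str, min_score: float = 0.5) -> Optional[str]:
    """Return the type whose cues match best, or None below min_score."""
    best_type, best_score = None, 0.0
    for doc_type, cues in TYPE_CUES.items():
        hits = sum(1 for cue in cues if re.search(cue, html, re.IGNORECASE))
        score = hits / len(cues)  # fraction of this type's cues found
        if score > best_score:
            best_type, best_score = doc_type, score
    return best_type if best_score >= min_score else None


def search(pages: Dict[str, str], keyword: str, doc_type: str) -> List[str]:
    """Keyword match (the domain) narrowed to one document type (the task)."""
    return [
        url
        for url, html in pages.items()
        if keyword.lower() in html.lower() and classify(html) == doc_type
    ]


def precision(retrieved: List[str], relevant: Set[str]) -> float:
    """Fraction of retrieved pages that are relevant; 0.0 if none retrieved."""
    if not retrieved:
        return 0.0
    return sum(1 for url in retrieved if url in relevant) / len(retrieved)
```

In this sketch a plain keyword query returns every page that mentions the term, while the type-restricted query keeps only pages whose structure matches the requested type, which is the mechanism the abstract credits for the higher precision.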



