Query type classification for web document retrieval
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Toronto, Canada
SESSION: Web
Pages: 64 - 71  
Year of Publication: 2003
ISBN:1-58113-646-3
Authors
In-Ho Kang  KAIST
GilChang Kim  KAIST
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/860435.860449

ABSTRACT

The heterogeneity of the Web exacerbates IR problems, and short user queries make them worse. The content of web documents alone is not enough to find good answer documents. Link information and URL information compensate for the insufficiencies of content information. However, a static combination of multiple sources of evidence may lower retrieval performance; different strategies are needed to find target documents depending on the query type. User queries can be classified into three categories: the topic relevance task, the homepage finding task, and the service finding task. In this paper, we propose a user query classification scheme. The scheme uses four features for classification: the difference of distribution, mutual information, the usage rate as anchor texts, and POS (part-of-speech) information. After classifying a user query, we apply different algorithms and information to obtain better results. For the topic relevance task, we emphasize content information; for the homepage finding task, we emphasize link information and URL information. We obtained the best performance when the proposed classification method was used with the OKAPI scoring algorithm.
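
The classify-then-combine idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the classifier here uses only one of the four features (the usage rate of query terms as anchor text), and the names `classify_query`, `combined_score`, `Evidence`, and `WEIGHTS`, the threshold, and all weight values are hypothetical choices made for illustration. The service finding task is left out because the abstract does not describe a strategy for it.

```python
from dataclasses import dataclass

# Hypothetical per-query-type weights (not from the paper): topic relevance
# leans on content evidence, homepage finding leans on link and URL evidence.
WEIGHTS = {
    "topic": {"content": 0.8, "link": 0.1, "url": 0.1},
    "homepage": {"content": 0.4, "link": 0.3, "url": 0.3},
}


@dataclass
class Evidence:
    """Per-document scores from three evidence sources, assumed to lie in [0, 1]."""
    content: float
    link: float
    url: float


def classify_query(query, anchor_usage_rate, threshold=0.5):
    """Toy classifier using a single feature.

    anchor_usage_rate maps a term to the (assumed precomputed) fraction of its
    occurrences that appear in anchor text. A high average rate over the query
    terms is treated as a sign of a homepage finding query.
    """
    terms = query.lower().split()
    if not terms:
        return "topic"
    average = sum(anchor_usage_rate.get(t, 0.0) for t in terms) / len(terms)
    return "homepage" if average >= threshold else "topic"


def combined_score(evidence, query_type):
    """Linear combination of evidence with weights chosen by query type."""
    w = WEIGHTS[query_type]
    return (w["content"] * evidence.content
            + w["link"] * evidence.link
            + w["url"] * evidence.url)


if __name__ == "__main__":
    # Made-up anchor usage rates for demonstration only.
    rates = {"kaist": 0.9, "homepage": 0.8, "retrieval": 0.1, "models": 0.05}
    doc = Evidence(content=0.2, link=0.9, url=0.9)
    for q in ("KAIST homepage", "retrieval models"):
        qtype = classify_query(q, rates)
        print(q, "->", qtype, combined_score(doc, qtype))
```

The point of the sketch is the control flow: the same document receives a different final score depending on the predicted query type, which is what distinguishes this approach from a static combination of evidence.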


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
[5] W. B. Croft. Combining approaches to information retrieval. In Advances in Information Retrieval: Recent Research from the Center for Intelligent Information Retrieval, pages 1--36. Kluwer Academic Publishers, 2000.
[6] CSIRO. Web research collections - TREC Web Track. www.ted.cmis.csiro.au/TRECWeb/, 2001.
[7] E. Fox and J. Shaw. Combination of multiple searches. In Text REtrieval Conference (TREC-1), pages 243--252, 1993.
[8] D. Hawking and N. Craswell. Overview of the TREC-2001 Web Track. In Text REtrieval Conference (TREC-10), pages 61--67, 2001.
[9] E. Jaynes. Information theory and statistical mechanics. Physical Review, 106(4):620--630, 1957.
[12] P. Ogilvie and J. Callan. Experiments using the Lemur toolkit. In Text REtrieval Conference (TREC-10), pages 103--108, 2001.
[13] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998.
[14] J. M. Ponte. Language models for relevance feedback. In W. B. Croft, editor, Advances in Information Retrieval: Recent Research from the Center for Intelligent Information Retrieval, pages 73--95. Kluwer Academic Publishers, 2000.
[15] S. E. Robertson, S. Walker, S. Jones, M. Hancock-Beaulieu, and M. Gatford. Okapi at TREC-3. In Text REtrieval Conference (TREC-3), pages 109--126, 1994.
[16] T. Westerveld, W. Kraaij, and D. Hiemstra. Retrieving web pages using content, links, URLs and anchors. In Text REtrieval Conference (TREC-10), pages 663--672, 2001.
[17] K. Yang. Combining text and link-based retrieval methods for web IR. In Text REtrieval Conference (TREC-10), pages 609--618, 2001.

CITED BY  36
