ABSTRACT
The heterogeneity of the Web exacerbates IR problems, and short user queries make them worse. The contents of web documents alone are not enough to find good answer documents. Link information and URL information compensate for the insufficiencies of content information. However, a static combination of multiple sources of evidence may lower retrieval performance. We need different strategies for finding target documents depending on the query type. User queries can be classified into three categories: the topic relevance task, the homepage finding task, and the service finding task. In this paper, we propose a user query classification scheme. The scheme uses the difference of distribution, mutual information, the usage rate as anchor texts, and POS information for the classification. After classifying a user query, we apply different algorithms and information to obtain better results. For the topic relevance task, we emphasize the content information; for the homepage finding task, we emphasize the link information and the URL information. We obtained the best performance when our proposed classification method was used with the OKAPI scoring algorithm.
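The idea of applying different evidence weights per query type can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract states only which evidence is emphasized for each task, so the weight values, the `Evidence` type, and the `combined_score` and `rank` helpers below are all hypothetical, and the scores are assumed to be already normalized to [0, 1]. The service finding task is left out because the abstract does not say which evidence it emphasizes.

```python
from dataclasses import dataclass

# Hypothetical weights chosen only for illustration. The abstract says that
# content is emphasized for topic relevance queries and that link and URL
# information are emphasized for homepage finding queries; it gives no values.
WEIGHTS = {
    "topic": {"content": 0.8, "link": 0.1, "url": 0.1},
    "homepage": {"content": 0.3, "link": 0.4, "url": 0.3},
}


@dataclass
class Evidence:
    """Per-document evidence scores, assumed normalized to [0, 1]."""
    doc_id: str
    content: float  # content-based score (for example from a scoring function such as OKAPI)
    link: float     # link-based score
    url: float      # URL-based score


def combined_score(ev: Evidence, query_type: str) -> float:
    """Linearly combine the evidence using the weights for this query type."""
    w = WEIGHTS[query_type]
    return w["content"] * ev.content + w["link"] * ev.link + w["url"] * ev.url


def rank(evidence: list[Evidence], query_type: str) -> list[str]:
    """Return document ids ordered by combined score, best first."""
    ordered = sorted(
        evidence,
        key=lambda ev: combined_score(ev, query_type),
        reverse=True,
    )
    return [ev.doc_id for ev in ordered]


docs = [
    Evidence("article", content=0.9, link=0.2, url=0.1),
    Evidence("site_root", content=0.5, link=0.9, url=0.9),
]

print(rank(docs, "topic"))
print(rank(docs, "homepage"))
```

With these made-up numbers, the same two documents swap places depending on the query type, which is the effect a static combination cannot produce.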
CITED BY 36
Dou Shen, Rong Pan, Jian-Tao Sun, Jeffrey Junfeng Pan, Kangheng Wu, Jie Yin, Qiang Yang, Q2C@UST: our winning solution to query classification in KDDCUP 2005, ACM SIGKDD Explorations Newsletter, v.7 n.2, p.100-110, December 2005
Steven M. Beitzel, Eric C. Jensen, Ophir Frieder, David Grossman, David D. Lewis, Abdur Chowdhury, Aleksandr Kolcz, Automatic web query classification using labeled and unlabeled training data, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
Dou Shen, Jian-Tao Sun, Qiang Yang, Zheng Chen, Building bridges for web query classification, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
Dou Shen, Rong Pan, Jian-Tao Sun, Jeffrey Junfeng Pan, Kangheng Wu, Jie Yin, Qiang Yang, Query enrichment for web-query classification, ACM Transactions on Information Systems (TOIS), v.24 n.3, p.320-352, July 2006
Yunyao Li, Rajasekar Krishnamurthy, Shivakumar Vaithyanathan, H. V. Jagadish, Getting work done on the web: supporting transactional queries, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
Honghua (Kathy) Dai, Lingzhi Zhao, Zaiqing Nie, Ji-Rong Wen, Lee Wang, Ying Li, Detecting online commercial intention (OCI), Proceedings of the 15th international conference on World Wide Web, May 23-26, 2006, Edinburgh, Scotland
Yumao Lu, Fuchun Peng, Xin Li, Nawaaz Ahmed, Coupling feature selection and machine learning methods for navigational query identification, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
Huaiyu Zhu, Sriram Raghavan, Shivakumar Vaithyanathan, Alexander Löser, Navigating the intranet with high precision, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
Bin Cui, Ling Liu, Calton Pu, Jialie Shen, Kian-Lee Tan, QueST: querying music databases by acoustic and textual features, Proceedings of the 15th international conference on Multimedia, September 25-29, 2007, Augsburg, Germany
Xiubo Geng, Tie-Yan Liu, Tao Qin, Andrew Arnold, Hang Li, Heung-Yeung Shum, Query dependent ranking using K-nearest neighbor, Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, July 20-24, 2008, Singapore, Singapore
Bernard J. Jansen, Danielle L. Booth, Amanda Spink, Determining the informational, navigational, and transactional intent of Web queries, Information Processing and Management: an International Journal, v.44 n.3, p.1251-1266, May 2008
Dou Shen, Min Qin, Weizhu Chen, Qiang Yang, Zheng Chen, Mining web query hierarchies from clickthrough data, Proceedings of the 22nd national conference on Artificial intelligence, p.341-346, July 22-26, 2007, Vancouver, British Columbia, Canada
Ruihua Song, Zhenxiao Luo, Jian-Yun Nie, Yong Yu, Hsiao-Wuen Hon, Identification of ambiguous queries in web search, Information Processing and Management: an International Journal, v.45 n.2, p.216-229, March 2009
Huanhuan Cao, Derek Hao Hu, Dou Shen, Daxin Jiang, Jian-Tao Sun, Enhong Chen, Qiang Yang, Context-aware query classification, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA