ABSTRACT
Information retrieval is, in general, an iterative search process in which the user often interacts with a retrieval system several times for a single information need. Rather than passively responding to user queries, the retrieval system can actively probe the user with questions to clarify the information need. A basic question is thus how a retrieval system should choose the questions it poses so that it obtains the maximum benefit from the user's feedback. In this paper, we study how a retrieval system can perform active feedback, i.e., how to choose documents for relevance feedback so that the system can learn the most from the feedback information. We present a general framework for this active feedback problem and derive several practical algorithms as special cases. Empirical evaluation of these algorithms shows that traditional relevance feedback (presenting the top K documents) consistently performs worse than presenting documents with more diversity. With a diversity-based selection algorithm, we obtain fewer relevant documents; however, these documents provide more learning benefit.
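To make the contrast between the two presentation strategies concrete, the following minimal sketch compares top-K selection with one simple way of adding diversity: skipping a fixed number of documents between picks. The gap-based heuristic, the function names, and the toy ranking are assumptions made for illustration only; the abstract does not specify which diversity-based algorithms the paper derives.

```python
from typing import List, Sequence


def top_k(ranked_docs: Sequence[str], k: int) -> List[str]:
    """Traditional relevance feedback: present the K highest-ranked documents."""
    return list(ranked_docs[:k])


def gapped_top_k(ranked_docs: Sequence[str], k: int, gap: int) -> List[str]:
    """Hypothetical diversity heuristic: skip `gap` documents between picks.

    Adjacent documents in a ranking tend to be similar, so spacing out the
    picks is one cheap way to present a more varied set for judgment.
    """
    if gap < 0:
        raise ValueError("gap must be non-negative")
    return list(ranked_docs[::gap + 1][:k])


# Toy ranking (assumed data): d1 is ranked highest.
ranking = [f"d{i}" for i in range(1, 13)]

print(top_k(ranking, 3))            # ['d1', 'd2', 'd3']
print(gapped_top_k(ranking, 3, 2))  # ['d1', 'd4', 'd7']
```

With a gap of zero the second function reduces to plain top-K, so the gap acts as a single knob trading expected relevance of the presented documents against their diversity.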