ABSTRACT
Combining the outputs of multiple retrieval sources over the same document collection is important for a number of retrieval tasks, such as multimedia retrieval, web retrieval and meta-search. To merge retrieval sources adaptively according to query topics, we propose a series of new approaches called probabilistic latent query analysis (pLQA), which associate different combination weights with latent classes underlying the query space. Compared with previous query-independent and query-class-based combination methods, the proposed approaches have three advantages: they discover latent query classes automatically without prior human knowledge, they can assign one query to a mixture of query classes, and they determine the number of query classes under a model selection principle. Experimental results on two retrieval tasks, multimedia retrieval and meta-search, demonstrate that the proposed methods can uncover sensible latent classes from training data and can achieve considerable performance gains.
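To make the idea of class-dependent combination weights concrete, the following is a minimal sketch of how a query that belongs to a mixture of latent classes might combine per-source scores for one document. It is an illustration only, not the formulation from the paper: the logistic link, the function names `sigmoid` and `combine_scores`, and all parameter names are assumptions introduced here, and the abstract does not say how the weights or class memberships are estimated.

```python
import math


def sigmoid(x):
    """Logistic function; using it as the link is an assumption of this sketch."""
    return 1.0 / (1.0 + math.exp(-x))


def combine_scores(source_scores, class_weights, class_membership):
    """Combine per-source scores for one (query, document) pair.

    source_scores:    N scores, one per retrieval source (hypothetical input).
    class_weights:    K weight vectors of length N, one per latent query class.
    class_membership: K probabilities that the query belongs to each class.
    """
    if len(class_weights) != len(class_membership):
        raise ValueError("need one weight vector per latent class")
    combined = 0.0
    for p_class, weights in zip(class_membership, class_weights):
        if len(weights) != len(source_scores):
            raise ValueError("need one weight per retrieval source")
        # Class-specific linear combination of the source scores.
        linear = sum(w * s for w, s in zip(weights, source_scores))
        # Weight the class-specific relevance estimate by the membership.
        combined += p_class * sigmoid(linear)
    return combined


if __name__ == "__main__":
    # Two sources, two latent classes; the query is a 70/30 mixture of them.
    scores = [0.8, 0.1]
    weights = [[2.0, 0.0], [0.0, 2.0]]
    print(combine_scores(scores, weights, [0.7, 0.3]))
```

In this sketch a query-independent combination is the special case with a single class, and a hard query-class method is the case where one membership probability is 1 and the rest are 0.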