A general optimization framework for smoothing language models on graph structures
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Singapore, Singapore
SESSION: Learning models for IR
Pages 611-618
Year of Publication: 2008
ISBN: 978-1-60558-164-4
Authors
Qiaozhu Mei, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Duo Zhang, University of Illinois at Urbana-Champaign, Urbana, IL, USA
ChengXiang Zhai, University of Illinois at Urbana-Champaign, Urbana, IL, USA
ABSTRACT
Recent work on language models for information retrieval has shown that smoothing language models is crucial for achieving good retrieval performance. Many effective smoothing methods have been proposed, most of which implement heuristics that exploit corpus structure. In this paper, we propose a general and unified optimization framework for smoothing language models on graph structures. This framework not only provides a unified formulation of the existing smoothing heuristics, but also serves as a road map for systematically exploring smoothing methods for language models. We follow this road map and derive several instantiations of the framework, some of which lead to novel smoothing methods. Empirical results show that all of these instantiations are effective, and some outperform state-of-the-art smoothing methods.
CITED BY 4
Donald Metzler, Jasmine Novak, Hang Cui, Srihari Reddy, Building enriched document representations using aggregated anchor text, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA