ABSTRACT
This paper proposes a novel application of a statistical language model to opinionated document retrieval targeting weblogs (blogs). In particular, we explore the use of the trigger model, originally developed for incorporating distant word dependencies, to model characteristics of personal opinions that standard n-grams cannot properly capture. Our primary assumption is that a subjective opinion has two constituents: the subject of the opinion (the object that the opinion is about) and a subjective expression. The former is regarded as a triggering word and the latter as a triggered word. We automatically identify these subjective trigger patterns in a corpus of product customer reviews and use them to build a language model. Experimental results on the TREC Blog Track test collections show that, when used for reranking initial search results, our proposed model significantly improves opinionated document retrieval, by over 20% in mean average precision (MAP). In addition, we report an experiment on dynamic adaptation of the model to a given query, which is found to be effective for most of the difficult queries categorized under politics and organizations.
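As a rough illustration of the reranking idea summarized in the abstract, the following minimal Python sketch counts (triggering word, triggered word) pairs in a toy review corpus and uses them to rescore initial search results. It is not the model evaluated in the paper: the sentence-level co-occurrence counting, the `log(1 + count)` scoring, the linear interpolation weight `lam`, and all names (`count_trigger_pairs`, `opinion_score`, `rerank`, the word lists, the toy data) are assumptions made purely for illustration.

```python
import math
from collections import Counter


def count_trigger_pairs(sentences, subjects, expressions):
    """Count sentence-level co-occurrences of (subject word, subjective expression).

    Assumption: a pair counts once per sentence in which both words appear.
    """
    counts = Counter()
    for tokens in sentences:
        present = set(tokens)
        for s in present & subjects:
            for e in present & expressions:
                counts[(s, e)] += 1
    return counts


def opinion_score(sentences, counts):
    """Average per-sentence sum of log(1 + count) over known trigger pairs.

    Hypothetical scoring function, not the paper's language model.
    """
    if not sentences:
        return 0.0
    total = 0.0
    for tokens in sentences:
        present = set(tokens)
        for s in present:
            for e in present:
                total += math.log1p(counts.get((s, e), 0))
    return total / len(sentences)


def rerank(results, counts, lam=0.5):
    """Interpolate the initial retrieval score with the opinion score.

    `results` is a list of (doc_id, initial_score, tokenized_sentences).
    """
    rescored = [
        (doc_id, (1 - lam) * score + lam * opinion_score(sents, counts))
        for doc_id, score, sents in results
    ]
    return sorted(rescored, key=lambda r: r[1], reverse=True)


# Toy review corpus and hypothetical word lists.
reviews = [
    ["camera", "is", "great"],
    ["battery", "is", "terrible"],
    ["camera", "works"],
]
counts = count_trigger_pairs(reviews, {"camera", "battery"}, {"great", "terrible"})

# Toy initial search results: d1 is factual, d2 is opinionated.
results = [
    ("d1", 1.0, [["the", "camera", "was", "released"]]),
    ("d2", 0.9, [["this", "camera", "is", "great"]]),
]
ranked = rerank(results, counts)
```

In this toy setting the opinionated document `d2` moves ahead of `d1` even though its initial score is lower, which is the qualitative effect the reranking step is meant to achieve.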