Mining product reputations on the Web
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Edmonton, Alberta, Canada
SESSION: Industry track papers
Pages: 341 - 349
Year of Publication: 2002
ISBN: 1-58113-567-X
Authors
Satoshi Morinaga, NEC Corporation, 4-1-1, Miyazaki, Miyamae, Kawasaki, Kanagawa, 216-8555, JAPAN
Kenji Yamanishi, NEC Corporation, 4-1-1, Miyazaki, Miyamae, Kawasaki, Kanagawa, 216-8555, JAPAN
Kenji Tateishi, NEC Corporation, 8916-47, Takayama-cho, Ikoma, Nara, 630-0101, JAPAN
Toshikazu Fukushima, NEC Corporation, 8916-47, Takayama-cho, Ikoma, Nara, 630-0101, JAPAN
ABSTRACT
Knowing the reputations of your own and/or competitors' products is important for marketing and customer relationship management. It is, however, very costly to collect and analyze survey data manually. This paper presents a new framework for mining product reputations on the Internet. It automatically collects people's opinions about target products from Web pages, and it uses text mining techniques to obtain the reputations of those products.

On the basis of human-test samples, we generate in advance syntactic and linguistic rules to determine whether any given statement is an opinion or not, as well as whether any such opinion is positive or negative in nature. We first collect statements regarding target products using a general search engine, and then, using the rules, extract opinions from among them and attach three labels to each opinion: the positive/negative determination, the product name itself, and a numerical value expressing the degree of system confidence that the statement is, in fact, an opinion. The labeled opinions are then input into an opinion database.

The mining of reputations, i.e., the finding of statistically meaningful information included in the database, is then conducted. We specify target categories using label values (such as positive opinions of product A) and perform four types of text mining: extraction of 1) characteristic words, 2) co-occurrence words, and 3) typical sentences for individual target categories, and 4) correspondence analysis among multiple target categories.

Actual marketing data is used to demonstrate the validity and effectiveness of the framework, which offers a drastic reduction in the overall cost of reputation analysis over that of conventional survey approaches and supports the discovery of knowledge from the pool of opinions on the Web.
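As a rough illustration of the labeled-opinion records and the first mining task (characteristic words for a target category such as "positive opinions of product A"), a minimal Python sketch follows. Everything in it is an assumption made for illustration: the `Opinion` record and its field names, the `tokenize` helper, and the sample sentences are hypothetical, and the smoothed log-ratio score is a simple stand-in, since the abstract does not say how the paper scores words.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Opinion:
    """One labeled opinion; field names are hypothetical, not from the paper."""
    text: str
    product: str       # product name label
    polarity: str      # "positive" or "negative" label
    confidence: float  # system confidence that the text is an opinion


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def characteristic_words(opinions, product, polarity, top_n=3):
    """Rank words by how much more often they occur in the target category.

    The score is a smoothed log-ratio of per-opinion word frequencies
    (an illustrative stand-in, not the paper's scoring method).
    """
    target, rest = Counter(), Counter()
    n_target = 0
    for op in opinions:
        words = set(tokenize(op.text))  # count each word once per opinion
        if op.product == product and op.polarity == polarity:
            target.update(words)
            n_target += 1
        else:
            rest.update(words)
    n_rest = len(opinions) - n_target

    scores = {}
    for word, count in target.items():
        p_target = (count + 0.5) / (n_target + 1.0)
        p_rest = (rest[word] + 0.5) / (n_rest + 1.0)
        scores[word] = math.log(p_target / p_rest)

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical sample data, standing in for the opinion database.
opinions = [
    Opinion("The battery life is great", "A", "positive", 0.9),
    Opinion("Great battery and light body", "A", "positive", 0.8),
    Opinion("The screen is dim", "A", "negative", 0.7),
    Opinion("The keyboard is heavy", "B", "negative", 0.85),
]

for word, score in characteristic_words(opinions, "A", "positive", top_n=2):
    print(f"{word}\t{score:.3f}")
```

Selecting a category by label values keeps the mining step independent of how the opinions were extracted; the `confidence` label is carried along here but unused, and a fuller sketch might filter or weight opinions by it.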