Exploiting web search to generate synonyms for entities
Source
International World Wide Web Conference
Proceedings of the 18th international conference on World wide web
Madrid, Spain
SESSION: Data mining/session: web mining
Pages 151-160
Year of Publication: 2009
ISBN: 978-1-60558-487-4
ABSTRACT
Tasks that recognize named entities such as products, people names, or locations in documents have recently received significant attention in the literature. Many solutions to these tasks assume the existence of reference entity tables. An important challenge in the entity extraction task is ascertaining whether a candidate string approximately matches a named entity in a given reference table. Prior approaches have relied on string-based similarity functions, which compare only a candidate string and the entity it matches. In this paper, we exploit web search engines to define new similarity functions. We then develop efficient techniques to facilitate approximate matching in the context of our proposed similarity functions. In an extensive experimental evaluation, we demonstrate the accuracy and efficiency of our techniques.
CITED BY
Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti, Arnd Christian Konig, Dong Xin, Exploiting web search engines to search structured databases, Proceedings of the 18th international conference on World wide web, April 20-24, 2009, Madrid, Spain