Measuring the similarity between implicit semantic relations using web search engines
Source: Web Search and Web Data Mining
Proceedings of the Second ACM International Conference on Web Search and Data Mining
Barcelona, Spain
SESSION: Web mining I
Pages 104-113  
Year of Publication: 2009
ISBN: 978-1-60558-390-7
Authors
Danushka Bollegala  The University of Tokyo, Tokyo, Japan
Yutaka Matsuo  The University of Tokyo, Tokyo, Japan
Mitsuru Ishizuka  The University of Tokyo, Tokyo, Japan
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
Google
SIGIR: ACM Special Interest Group on Information Retrieval
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Yahoo! Research
Microsoft
Nokia
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1498759.1498815

ABSTRACT

Measuring the similarity between implicit semantic relations is an important task in information retrieval and natural language processing. For example, consider the situation where you know an entity-pair (e.g., Google, YouTube) between which a particular relation holds (e.g., acquisition), and you want to retrieve other entity-pairs for which the same relation holds (e.g., Yahoo, Inktomi). Existing keyword-based search engines cannot be applied directly in this case, because the goal of keyword-based search is to retrieve documents that are relevant to the words used in the query, not necessarily to the relations implied by a pair of words. Accurate measurement of relational similarity is an important step in numerous natural language processing tasks, such as identifying word analogies and classifying noun-modifier pairs. We propose a method that uses Web search engines to efficiently compute the relational similarity between two pairs of words. Our method consists of three components: representing the various semantic relations that exist between a pair of words using automatically extracted lexical patterns, clustering the extracted lexical patterns to identify the different semantic relations implied by them, and measuring the similarity between different semantic relations using an inter-cluster correlation matrix. We propose a pattern extraction algorithm to extract a large number of lexical patterns that express numerous semantic relations. We then present an efficient clustering algorithm to cluster the extracted lexical patterns. Finally, we measure the relational similarity between word-pairs using inter-cluster correlation. We evaluate the proposed method in a relation classification task. Experimental results on a dataset covering multiple relation types show a statistically significant improvement over the current state-of-the-art relational similarity measures.
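
The three components named in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract gives no implementation details, so every function name, the `max_gap` and `theta` parameters, and the hard-coded snippets (standing in for search-engine results) are assumptions made for illustration. In particular, the final step uses plain cosine similarity over cluster counts as a simplified stand-in for the inter-cluster correlation matrix mentioned in the abstract.

```python
import math
import re
from collections import Counter, defaultdict


def extract_patterns(a, b, snippets, max_gap=4):
    """Count lexical patterns linking words a and b (replaced by X and Y)."""
    a, b = a.lower(), b.lower()
    patterns = Counter()
    for snippet in snippets:
        tokens = ["X" if t == a else "Y" if t == b else t
                  for t in re.findall(r"\w+", snippet.lower())]
        for i, tok in enumerate(tokens):
            if tok not in ("X", "Y"):
                continue
            other = "Y" if tok == "X" else "X"
            # Allow at most max_gap tokens between the two placeholders.
            window = tokens[i + 1:i + 2 + max_gap]
            if other in window:
                j = i + 1 + window.index(other)
                patterns[" ".join(tokens[i:j + 1])] += 1
    return patterns


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0) for k, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def cluster_patterns(pair_patterns, theta=0.5):
    """Greedy one-pass clustering of patterns by the word pairs they occur with."""
    vectors = defaultdict(Counter)  # pattern -> {word pair: frequency}
    for pair, pats in pair_patterns.items():
        for pattern, freq in pats.items():
            vectors[pattern][pair] += freq
    clusters = []  # each entry: (centroid Counter, set of member patterns)
    order = sorted(vectors, key=lambda p: (-sum(vectors[p].values()), p))
    for pattern in order:
        best, best_sim = None, theta
        for cluster in clusters:
            sim = cosine(vectors[pattern], cluster[0])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append((Counter(vectors[pattern]), {pattern}))
        else:
            best[0].update(vectors[pattern])
            best[1].add(pattern)
    return [members for _, members in clusters]


def relational_similarity(pair1, pair2, pair_patterns, clusters):
    """Cosine similarity of per-cluster pattern counts (simplified final step)."""
    def features(pair):
        pats = pair_patterns[pair]
        return {i: sum(pats.get(p, 0) for p in members)
                for i, members in enumerate(clusters)}
    return cosine(features(pair1), features(pair2))


# Hypothetical snippets standing in for text returned by a Web search engine.
snippets = {
    ("google", "youtube"): ["Google acquired YouTube", "Google bought YouTube"],
    ("yahoo", "inktomi"): ["Yahoo acquired Inktomi", "Yahoo bought Inktomi"],
    ("tokyo", "japan"): ["Tokyo is the capital of Japan"],
    ("paris", "france"): ["Paris is the capital of France"],
}
pair_patterns = {pair: extract_patterns(pair[0], pair[1], texts)
                 for pair, texts in snippets.items()}
clusters = cluster_patterns(pair_patterns)
same = relational_similarity(("google", "youtube"), ("yahoo", "inktomi"),
                             pair_patterns, clusters)
diff = relational_similarity(("google", "youtube"), ("tokyo", "japan"),
                             pair_patterns, clusters)
print(clusters, same, diff)
```

In this toy setup, patterns that occur with the same word pairs fall into one cluster, so two pairs expressing the acquisition relation score higher than a pair expressing acquisition compared with a pair expressing a capital-of relation.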

