ABSTRACT
Measuring the similarity between semantic relations that hold among entities is a necessary step in various Web-related tasks such as relation extraction, information retrieval, and analogy detection. For example, consider a person who knows a pair of entities (e.g., Google, YouTube) between which a particular relation holds (e.g., acquisition) and who wants to retrieve other pairs with similar relations (e.g., Microsoft, Powerset). Existing keyword-based search engines cannot be applied directly in this case, because the goal of keyword-based search is to retrieve documents that are relevant to the words used in a query, not necessarily to the relations implied by a pair of words. We propose a relational similarity measure that uses a Web search engine to compute the similarity between the semantic relations implied by two pairs of words. Our method has three components: representing the various semantic relations that exist between a pair of words using automatically extracted lexical patterns, clustering the extracted lexical patterns to identify the different patterns that express a particular semantic relation, and measuring the similarity between semantic relations using a metric learning approach. We evaluate the proposed method on two tasks: classifying semantic relations between named entities, and solving word-analogy questions. The proposed method outperforms all baselines in a relation classification task with a statistically significant average precision score of 0.74. Moreover, it reduces the time taken by Latent Relational Analysis to process 374 word-analogy questions from 9 days to less than 6 hours, with an SAT score of 51%.
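The first component, representing a word pair by the lexical patterns that connect its two words, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `extract_patterns` and `cosine` are hypothetical, the snippets are made-up toy strings rather than search engine results, and plain cosine similarity over raw pattern counts stands in for the pattern clustering and learned metric described in the abstract.

```python
import math
import re
from collections import Counter


def extract_patterns(snippets, a, b):
    """Count the word sequences that appear between a and b in each snippet.

    Hypothetical helper: the two words are replaced by the placeholders X and Y,
    so patterns from different word pairs become comparable.
    """
    counts = Counter()
    # Non-greedy match of the text between the two entity mentions.
    regex = re.compile(
        re.escape(a) + r"\s+(.+?)\s+" + re.escape(b), re.IGNORECASE
    )
    for text in snippets:
        for match in regex.finditer(text):
            middle = " ".join(match.group(1).lower().split())
            counts["X " + middle + " Y"] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse pattern-count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy snippets (assumed for illustration only).
acquisition_1 = extract_patterns(
    ["Google acquired YouTube", "Google has bought YouTube"],
    "Google", "YouTube",
)
acquisition_2 = extract_patterns(
    ["Microsoft acquired Powerset"], "Microsoft", "Powerset"
)
is_a = extract_patterns(["An ostrich is a large bird"], "ostrich", "bird")

print(cosine(acquisition_1, acquisition_2))  # shared pattern "X acquired Y"
print(cosine(acquisition_1, is_a))           # no shared pattern
```

In this simplified form, "X acquired Y" and "X has bought Y" are treated as unrelated patterns. That limitation is what the clustering and metric learning components of the method are meant to address.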