Knowledge discovery of semantic relationships between words using nonparametric bayesian graph model
Source
International Conference on Knowledge Discovery and Data Mining
Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Las Vegas, Nevada, USA
SESSION: Research papers
Pages 587-595  
Year of Publication: 2008
ISBN:978-1-60558-193-4
Authors
Issei Sato, The University of Tokyo, Tokyo, Japan
Minoru Yoshida, The University of Tokyo, Tokyo, Japan
Hiroshi Nakagawa, The University of Tokyo, Tokyo, Japan
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1401890.1401962

ABSTRACT

We developed a model based on nonparametric Bayesian modeling for the automatic discovery of semantic relationships between words taken from a corpus. It is aimed at discovering semantic knowledge about words in particular domains, which has become increasingly important with the growing use of text mining, information retrieval, and speech recognition. We take the subject-predicate structure as the syntactic structure, with the noun as the subject and the verb as the predicate, and regard the collection of these structures as a graph. The generation of this graph can be modeled using the hierarchical Dirichlet process and the Pitman-Yor process. The probabilistic generative model we developed describes this graph, which consists of subject-predicate structures extracted from a corpus. Evaluation of this model by measuring the performance of graph clustering based on WordNet similarities demonstrated that it outperforms the baseline models.
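
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model: it only shows (a) one plausible way to store subject-predicate pairs as a noun-verb graph and (b) the standard Chinese-restaurant-style seating scheme associated with the Pitman-Yor process. The function names, the toy pairs, and the parameter values are all hypothetical, and the sampler assumes a strictly positive concentration parameter for simplicity.

```python
import random
from collections import defaultdict


def build_noun_verb_graph(pairs):
    """Store (noun, verb) pairs as a weighted bipartite graph.

    Returns a dict mapping each noun to a dict of verb -> co-occurrence count.
    The representation is an assumption made for illustration only.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for noun, verb in pairs:
        graph[noun][verb] += 1
    return {noun: dict(verbs) for noun, verbs in graph.items()}


def pitman_yor_crp(n_customers, discount, concentration, rng):
    """Sample table sizes from a Pitman-Yor Chinese restaurant process.

    With K occupied tables, a new customer joins table k with weight
    (size_k - discount) and opens a new table with weight
    (concentration + discount * K).
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= 0:
        # Simplifying assumption of this sketch, stricter than the
        # general constraint concentration > -discount.
        raise ValueError("concentration must be positive in this sketch")

    tables = []  # tables[k] is the number of customers at table k
    for _ in range(n_customers):
        weights = [size - discount for size in tables]
        weights.append(concentration + discount * len(tables))
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(tables):
            tables.append(1)  # open a new table
        else:
            tables[choice] += 1
    return tables


if __name__ == "__main__":
    # Hypothetical toy pairs standing in for structures extracted from a corpus.
    toy_pairs = [("dog", "bark"), ("dog", "run"), ("dog", "bark"), ("cat", "run")]
    print(build_noun_verb_graph(toy_pairs))

    # Hypothetical parameter values; a larger discount tends to give more small tables.
    sizes = pitman_yor_crp(1000, discount=0.5, concentration=1.0, rng=random.Random(0))
    print(len(sizes), sorted(sizes, reverse=True)[:5])
```

Setting the discount to 0 reduces the seating rule to the ordinary Dirichlet-process Chinese restaurant process, which is the building block of the hierarchical Dirichlet process mentioned in the abstract.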


