Knowledge discovery of semantic relationships between words using nonparametric Bayesian graph model
Source
International Conference on Knowledge Discovery and Data Mining
Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Las Vegas, Nevada, USA
SESSION: Research papers
Pages 587-595
Year of Publication: 2008
ISBN: 978-1-60558-193-4
ABSTRACT
We developed a nonparametric Bayesian model for automatically discovering semantic relationships between words taken from a corpus. It is aimed at discovering semantic knowledge about words in particular domains, which has become increasingly important with the growing use of text mining, information retrieval, and speech recognition. We take the subject-predicate structure as the syntactic structure, with the noun as the subject and the verb as the predicate, and regard it as a graph. The generation of this graph can be modeled using the hierarchical Dirichlet process and the Pitman-Yor process. The probabilistic generative model we developed for this graph is built from subject-predicate structures extracted from a corpus. An evaluation that measured graph-clustering performance based on WordNet similarities showed that the model outperforms the baseline models.
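As a rough illustration of the Pitman-Yor process named in the abstract, the sketch below simulates its Chinese restaurant representation, in which each new item joins an existing cluster or opens a new one. This is not the model from the paper: the function name `pitman_yor_crp` and the parameters `discount` and `concentration` are hypothetical names chosen for this example, and the sketch shows only the generic prior over partitions, not the graph model or its inference.

```python
import random


def pitman_yor_crp(n_customers, discount, concentration, seed=0):
    """Simulate a Pitman-Yor Chinese restaurant process (illustrative only).

    Returns (assignments, table_sizes): the table index chosen by each
    customer, and the final number of customers at each table.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")

    rng = random.Random(seed)
    table_sizes = []
    assignments = []
    for _ in range(n_customers):
        if not table_sizes:
            # The first customer always opens the first table.
            choice = 0
        else:
            # Existing table k is chosen with weight (size_k - discount);
            # a new table with weight (concentration + discount * K).
            weights = [size - discount for size in table_sizes]
            weights.append(concentration + discount * len(table_sizes))
            choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(table_sizes):
            table_sizes.append(1)
        else:
            table_sizes[choice] += 1
        assignments.append(choice)
    return assignments, table_sizes


if __name__ == "__main__":
    assignments, sizes = pitman_yor_crp(1000, discount=0.5, concentration=1.0)
    print("number of clusters:", len(sizes))
    print("largest clusters:", sorted(sizes, reverse=True)[:5])
```

Setting `discount` to 0 reduces this to the ordinary Dirichlet process seating rule. A larger discount tends to produce more small clusters, which is the heavy-tailed behavior that usually motivates using the Pitman-Yor process for word data.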