Distributional clustering of English words
Source: Annual Meeting of the ACL
Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics
Columbus, Ohio
Pages: 183 - 190  
Year of Publication: 1993
Authors
Fernando Pereira  AT&T Bell Laboratories, Murray Hill, NJ
Naftali Tishby  Hebrew University, Jerusalem, Israel
Lillian Lee  Cornell University, Ithaca, NY
Publisher
Association for Computational Linguistics  Morristown, NJ, USA
DOI Bookmark: 10.3115/981574.981598

ABSTRACT

We describe and evaluate experimentally a method for clustering words according to their distribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the similarity measure for clustering. Clusters are represented by average context distributions derived from the given words according to their probabilities of cluster membership. In many cases, the clusters can be thought of as encoding coarse sense distinctions. Deterministic annealing is used to find lowest-distortion sets of clusters: as the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word co-occurrence, and the models are evaluated with respect to held-out test data.
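
The abstract gives no formulas, so the following is only a minimal sketch of the kind of procedure it describes, not the authors' implementation. It assumes that a word's membership in a cluster is proportional to exp(-beta * D(word || centroid)), where D is relative entropy and beta plays the role of the annealing parameter, and that all words carry equal weight when centroids are averaged. The function names, the toy distributions, and the value of beta are all hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D(p || q) in nats; eps guards against log of zero."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def soft_memberships(word_dists, centroids, beta):
    """Assumed form: p(cluster | word) proportional to exp(-beta * D(word || centroid))."""
    memberships = []
    for p in word_dists:
        dists = [kl_divergence(p, c) for c in centroids]
        dmin = min(dists)  # subtract the minimum for numerical stability
        weights = [math.exp(-beta * (d - dmin)) for d in dists]
        total = sum(weights)
        memberships.append([w / total for w in weights])
    return memberships

def update_centroids(word_dists, memberships, centroids):
    """Each centroid becomes the membership-weighted average of the word distributions."""
    dim = len(word_dists[0])
    new_centroids = []
    for k, old in enumerate(centroids):
        mass = sum(m[k] for m in memberships)
        if mass == 0.0:
            new_centroids.append(list(old))  # no word assigned: keep the old centroid
            continue
        new_centroids.append([
            sum(m[k] * p[j] for m, p in zip(memberships, word_dists)) / mass
            for j in range(dim)
        ])
    return new_centroids

def cluster(word_dists, init_centroids, beta, n_iter=50):
    """Alternate soft assignment and centroid averaging at a fixed beta."""
    centroids = [list(c) for c in init_centroids]
    memberships = soft_memberships(word_dists, centroids, beta)
    for _ in range(n_iter):
        centroids = update_centroids(word_dists, memberships, centroids)
        memberships = soft_memberships(word_dists, centroids, beta)
    return centroids, memberships

# Hypothetical toy data: four "words", each a distribution over three contexts.
toy_words = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
]
toy_init = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]

centroids, memberships = cluster(toy_words, toy_init, beta=5.0)
for word, m in zip(toy_words, memberships):
    print(word, [round(x, 3) for x in m])
```

Under these assumptions, a beta of zero makes every membership uniform, so all centroids coincide; larger values sharpen the assignments. Rerunning the loop while gradually raising beta is one way to mimic the splitting behaviour the abstract attributes to deterministic annealing.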


