Applying co-training methods to statistical parsing
Source: Second Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies 2001
Pittsburgh, Pennsylvania
Pages: 1 - 8  
Year of Publication: 2001
Author
Anoop Sarkar  University of Pennsylvania, Philadelphia, PA
Publisher
Association for Computational Linguistics  Morristown, NJ, USA
Bibliometrics
Downloads (6 Weeks): 1,   Downloads (12 Months): 15,   Citation Count: 24
DOI Bookmark: 10.3115/1073336.1073359


ABSTRACT

We propose a novel Co-Training method for statistical parsing. The algorithm takes as input a small corpus (9695 sentences) annotated with parse trees, a dictionary of possible lexicalized structures for each word in the training set and a large pool of unlabeled text. The algorithm iteratively labels the entire data set with parse trees. Using empirical results based on parsing the Wall Street Journal corpus we show that training a statistical parser on the combined labeled and unlabeled data strongly out-performs training only on the labeled data.
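
The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of a generic co-training loop, not the paper's parser. It assumes each example has two feature "views", uses a deliberately naive count-based classifier per view, and lets either view label an unlabeled example once it is confident enough. The names `ViewClassifier`, `co_train`, `rounds`, and `threshold`, and the toy data, are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


class ViewClassifier:
    """Naive count-based classifier that sees only one feature (one 'view')."""

    def __init__(self, view):
        self.view = view  # index of the feature this classifier may look at
        self.counts = defaultdict(Counter)

    def fit(self, examples):
        # Re-estimate label counts per feature value from scratch.
        self.counts = defaultdict(Counter)
        for features, label in examples:
            self.counts[features[self.view]][label] += 1

    def predict(self, features):
        """Return (label, confidence); (None, 0.0) if the value is unseen."""
        seen = self.counts.get(features[self.view])
        if not seen:
            return None, 0.0
        label, n = seen.most_common(1)[0]
        return label, n / sum(seen.values())


def co_train(labeled, unlabeled, rounds=10, threshold=0.9):
    """Grow the labeled set by letting each view label data for the other."""
    labeled = list(labeled)
    pool = list(unlabeled)
    models = [ViewClassifier(0), ViewClassifier(1)]
    for _ in range(rounds):
        for model in models:
            model.fit(labeled)
        newly_labeled, remaining = [], []
        for x in pool:
            # Take the more confident of the two views' guesses.
            label, conf = max((m.predict(x) for m in models), key=lambda g: g[1])
            if label is not None and conf >= threshold:
                newly_labeled.append((x, label))
            else:
                remaining.append(x)
        if not newly_labeled:
            break  # no view is confident about anything left in the pool
        labeled.extend(newly_labeled)
        pool = remaining
    for model in models:
        model.fit(labeled)  # final fit on seed plus self-labeled data
    return models, labeled, pool


# Toy data (assumed): two seed examples and five unlabeled ones.
seed = [(("a1", "b1"), "X"), (("a2", "b2"), "Y")]
unlabeled = [("a1", "b3"), ("a3", "b3"), ("a2", "b4"), ("a4", "b4"), ("a5", "b5")]

models, labeled, pool = co_train(seed, unlabeled)
print(dict(labeled))
print(pool)
```

In this toy run, the first view labels `("a1", "b3")` from the seed data, which in turn lets the second view label `("a3", "b3")` in the next round; an example that shares no feature value with anything labeled stays in the pool.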


