Semi-supervised conditional random fields for improved sequence segmentation and labeling
Source: Annual Meeting of the ACL
Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics
Sydney, Australia
Pages: 209 - 216  
Year of Publication: 2006
Authors
Feng Jiao, University of Waterloo
Shaojun Wang, University of Alberta
Chi-Hoon Lee, University of Alberta
Russell Greiner, University of Alberta
Dale Schuurmans, University of Alberta
Publisher
Association for Computational Linguistics  Morristown, NJ, USA
DOI Bookmark: 10.3115/1220175.1220202

ABSTRACT

We present a new semi-supervised training procedure for conditional random fields (CRFs) that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Our approach is based on extending the minimum entropy regularization framework to the structured prediction case, yielding a training objective that combines unlabeled conditional entropy with labeled conditional likelihood. Although the training objective is no longer concave, it can still be used to improve an initial model (e.g. obtained from supervised training) by iterative ascent. We apply our new training algorithm to the problem of identifying gene and protein mentions in biological texts, and show that incorporating unlabeled data improves the performance of the supervised CRF in this case.
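The kind of objective the abstract describes, labeled conditional likelihood combined with unlabeled conditional entropy, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy linear-chain model with `emit` and `trans` weight tables, the trade-off weight `gamma`, the brute-force enumeration of label sequences (practical only for very short sequences), and the omission of any weight penalty. It evaluates the objective only; the iterative ascent step is not shown.

```python
import itertools
import math

def score(x, y, emit, trans):
    """Unnormalized log-score of label sequence y for observation sequence x."""
    s = sum(emit[label][obs] for obs, label in zip(x, y))
    s += sum(trans[a][b] for a, b in zip(y, y[1:]))
    return s

def log_probs(x, emit, trans):
    """Map every label sequence y to log p(y | x) by brute-force enumeration."""
    n_labels = len(emit)
    seqs = list(itertools.product(range(n_labels), repeat=len(x)))
    scores = [score(x, y, emit, trans) for y in seqs]
    m = max(scores)  # subtract the max for a stable log-sum-exp
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return {y: s - log_z for y, s in zip(seqs, scores)}

def conditional_entropy(x, emit, trans):
    """Entropy of p(. | x) over whole label sequences."""
    lp = log_probs(x, emit, trans)
    return -sum(math.exp(v) * v for v in lp.values())

def objective(labeled, unlabeled, emit, trans, gamma):
    """Labeled log-likelihood minus gamma times unlabeled conditional entropy."""
    ll = sum(log_probs(x, emit, trans)[tuple(y)] for x, y in labeled)
    ent = sum(conditional_entropy(x, emit, trans) for x in unlabeled)
    return ll - gamma * ent

# Toy data: 2 labels, 2 observation symbols (hypothetical values).
zero = [[0.0, 0.0], [0.0, 0.0]]    # uninformative weights
sharp = [[5.0, 0.0], [0.0, 5.0]]   # emit[label][obs]: label tends to match obs
labeled = [([0, 1], [0, 1])]       # (observations, gold labels)
unlabeled = [[0, 1, 0], [1, 1]]    # observations only

uniform_entropy = conditional_entropy([0, 1, 0], zero, zero)
obj_zero = objective(labeled, unlabeled, zero, zero, gamma=0.5)
obj_sharp = objective(labeled, unlabeled, sharp, zero, gamma=0.5)
```

In this sketch the confident weights score higher than the uninformative ones because they both fit the labeled sequence and yield low-entropy predictions on the unlabeled sequences. A realistic implementation would replace the enumeration with dynamic programming over the chain.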


