Sparse higher order conditional random fields for improved sequence labeling
Source: ACM International Conference Proceeding Series, Vol. 382
Proceedings of the 26th Annual International Conference on Machine Learning
Montreal, Quebec, Canada
Pages 849-856
Year of Publication: 2009
ISBN: 978-1-60558-516-1
Authors
Xian Qian  Fudan University, Shanghai, P.R.China
Xiaoqian Jiang  Carnegie Mellon University, Pittsburgh, PA
Qi Zhang  Fudan University, Shanghai, P.R.China
Xuanjing Huang  Fudan University, Shanghai, P.R.China
Lide Wu  Fudan University, Shanghai, P.R.China
Sponsors
MITACS
NSF
Microsoft Research
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 6, Downloads (12 Months): 27, Citation Count: 0
DOI: http://doi.acm.org/10.1145/1553374.1553483

ABSTRACT

In real sequence labeling tasks, the statistics of many higher order features are unreliable because the training data are sparse, and very few of these features are useful. We describe Sparse Higher Order Conditional Random Fields (SHO-CRFs), which handle local features and sparse higher order features together using a novel, tractable, exact inference algorithm. Our main insight is that states and transitions with the same potential functions can be grouped together, so that inference is performed on the grouped states and transitions. Though the complexity is not polynomial, SHO-CRFs are still efficient in practice because of the feature sparseness. Experimental results on optical character recognition and Chinese organization name recognition show that, with the same higher order feature set, SHO-CRFs significantly outperform previous approaches.
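
The grouping idea in the abstract can be illustrated on a much simpler case than the paper's. The sketch below is not the SHO-CRF algorithm: it assumes a first-order chain in which every transition has potential 1 except a few "special" label pairs, and it treats all the default transitions as a single group during the forward pass. The function names, the `special` dictionary, and the toy numbers are hypothetical and chosen only for illustration.

```python
from itertools import product


def brute_force_z(local, special, n_labels):
    """Partition function by enumerating every label sequence (exponential)."""
    total = 0.0
    for seq in product(range(n_labels), repeat=len(local)):
        score = 1.0
        for t, y in enumerate(seq):
            score *= local[t][y]
            if t > 0:
                # Transitions not listed in `special` share the default potential 1.
                score *= special.get((seq[t - 1], y), 1.0)
        total += score
    return total


def grouped_z(local, special, n_labels):
    """Forward pass that handles all default transitions as one group."""
    # Index the few special transitions by their destination label.
    incoming = {}
    for (prev, cur), weight in special.items():
        incoming.setdefault(cur, []).append((prev, weight))

    alpha = list(local[0])
    for t in range(1, len(local)):
        # Every default transition has potential 1, so their combined
        # contribution is the same sum for every destination label.
        shared = sum(alpha)
        new_alpha = []
        for y in range(n_labels):
            # Correct the shared sum only for the special transitions into y.
            correction = sum(alpha[p] * (w - 1.0) for p, w in incoming.get(y, []))
            new_alpha.append(local[t][y] * (shared + correction))
        alpha = new_alpha
    return sum(alpha)


# Toy example (assumed values): 3 labels, 3 positions, 2 special transitions.
local = [[1.0, 2.0, 0.5], [0.5, 1.0, 3.0], [2.0, 1.0, 1.0]]
special = {(0, 1): 4.0, (2, 2): 0.25}
z_brute = brute_force_z(local, special, 3)
z_grouped = grouped_z(local, special, 3)
```

Under these assumptions each forward step costs time proportional to the number of labels plus the number of special transitions, rather than the square of the number of labels, which is the kind of saving that sparseness makes possible. Both functions should return the same value on the toy input.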


