Sparse higher order conditional random fields for improved sequence labeling
Source
ACM International Conference Proceeding Series; Vol. 382
Proceedings of the 26th Annual International Conference on Machine Learning
Montreal, Quebec, Canada
Pages 849-856
Year of Publication: 2009
ISBN: 978-1-60558-516-1
Authors
Xian Qian, Fudan University, Shanghai, P.R. China
Xiaoqian Jiang, Carnegie Mellon University, Pittsburgh, PA
Qi Zhang, Fudan University, Shanghai, P.R. China
Xuanjing Huang, Fudan University, Shanghai, P.R. China
Lide Wu, Fudan University, Shanghai, P.R. China
ABSTRACT
In real sequence labeling tasks, the statistics of many higher order features are insufficient because of training data sparseness, and very few of these features are useful. We describe Sparse Higher Order Conditional Random Fields (SHO-CRFs), which handle local features and sparse higher order features together using a novel, tractable exact inference algorithm. Our main insight is that states and transitions with the same potential functions can be grouped together, so that inference is performed on the grouped states and transitions. Though the complexity is not polynomial, SHO-CRFs are still efficient in practice because of the feature sparseness. Experimental results on optical character recognition and Chinese organization name recognition show that, with the same higher order feature set, SHO-CRFs significantly outperform previous approaches.