ABSTRACT
Sequence segmentation is a flexible and highly accurate mechanism for modeling several applications. Inference on segmentation models involves dynamic programming computations that in the worst case can be cubic in the length of a sequence. In contrast, typical sequence labeling models require linear time. We remove this limitation of segmentation models vis-a-vis sequential models by designing a succinct representation of potentials common across overlapping segments. We exploit such potentials to design efficient inference algorithms that are both analytically shown to have a lower complexity and empirically found to be comparable to sequential models for typical extraction tasks.
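To make the complexity gap concrete, the following is a minimal sketch of naive exact inference over segmentations. It is not the algorithm proposed in this paper, only the baseline dynamic program the abstract alludes to. The names `segment_viterbi` and `toy_score`, the labels `ENT` and `O`, and the toy sentence are all illustrative assumptions. The dynamic program enumerates O(n^2) candidate segments, and when each segment potential is computed from scratch in time proportional to the segment length (as `toy_score` does), the total work is cubic in n. A sequence labeling model would instead score one position at a time.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A segment is a half-open span [start, end) with a label.
Segment = Tuple[int, int, str]
ScoreFn = Callable[[int, int, str, Optional[str]], float]


def segment_viterbi(n: int, labels: Sequence[str], score: ScoreFn) -> Tuple[float, List[Segment]]:
    """Return the highest-scoring segmentation of positions 0..n-1.

    `score(start, end, label, prev_label)` is a hypothetical segment
    potential; `prev_label` is None for the first segment.
    """
    # best[i][y]: best score over segmentations of the first i tokens
    # whose last segment carries label y.
    best: List[Dict[Optional[str], float]] = [dict() for _ in range(n + 1)]
    back: List[Dict[Optional[str], Optional[Tuple[int, Optional[str]]]]] = [
        dict() for _ in range(n + 1)
    ]
    best[0][None] = 0.0  # empty prefix, no previous label

    for end in range(1, n + 1):              # n choices of segment end
        for label in labels:
            best_val, best_ptr = float("-inf"), None
            for start in range(end):         # up to n choices of segment start
                for prev_label, prev_val in best[start].items():
                    cand = prev_val + score(start, end, label, prev_label)
                    if cand > best_val:
                        best_val, best_ptr = cand, (start, prev_label)
            best[end][label] = best_val
            back[end][label] = best_ptr

    if n == 0:
        return 0.0, []

    # Follow back-pointers from the best final label.
    label = max(best[n], key=lambda y: best[n][y])
    total = best[n][label]
    segments: List[Segment] = []
    end = n
    while end > 0:
        start, prev_label = back[end][label]
        segments.append((start, end, label))
        end, label = start, prev_label
    segments.reverse()
    return total, segments


# Toy data and scoring function (assumptions for illustration only).
tokens = ["john", "smith", "visited", "new", "york"]
entity_words = {"john", "smith", "new", "york"}


def toy_score(start: int, end: int, label: str, prev_label: Optional[str]) -> float:
    span = tokens[start:end]  # inspecting the whole span costs O(end - start)
    if label == "ENT":
        if all(w in entity_words for w in span):
            return 2.0 * len(span) - 0.5  # per-segment cost favors longer entities
        return -5.0
    # Label "O": only single non-entity tokens are acceptable.
    if len(span) == 1 and span[0] not in entity_words:
        return 0.0
    return -5.0


total, segments = segment_viterbi(len(tokens), ["ENT", "O"], toy_score)
```

In this sketch the potential of `tokens[0:2]` and that of `tokens[0:3]` are each computed independently, even though the two spans overlap. That redundancy across overlapping segments is what a shared, succinct representation of potentials would aim to avoid.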