ABSTRACT
Discriminative sequential learning models such as Conditional Random Fields (CRFs) have achieved significant success in areas such as natural language processing and information extraction. Their key advantage is the ability to capture various nonindependent and overlapping features of inputs. However, several unexpected pitfalls degrade the model's performance; these mainly come from a high imbalance among classes, irregular phenomena, and potential ambiguity in the training data. This article presents a data-driven approach that handles such difficult data instances by discovering and emphasizing important conjunctions or associations of statistics hidden in the training data, and then incorporating the discovered associations into these models. Experimental results on phrase chunking and named entity recognition using CRFs show a significant improvement in accuracy. Beyond the technical perspective, our approach also highlights a potential connection between association mining and statistical learning by offering an alternative strategy to enhance learning performance with interesting and useful patterns discovered from large datasets.