ABSTRACT
We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to better understand and explain why phrase-based models outperform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy word-level alignment models do not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.