Statistical phrase-based translation
Source North American Chapter Of The Association For Computational Linguistics archive
Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1 table of contents
Edmonton, Canada
Pages: 48 - 54  
Year of Publication: 2003
Authors
Philipp Koehn  University of Southern California
Franz Josef Och  University of Southern California
Daniel Marcu  University of Southern California
Publisher
Association for Computational Linguistics  Morristown, NJ, USA
DOI Bookmark: 10.3115/1073445.1073462

ABSTRACT

We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to better understand and explain why phrase-based models outperform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy word-level alignment models do not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.
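To make "heuristic learning of phrase translations from word-based alignments" concrete, the following is a minimal sketch of one common consistency criterion: a source span and a target span form a phrase pair when no alignment link leaves the pair. The abstract does not spell out the extraction procedure, so the function name `extract_phrase_pairs`, its parameters, and the toy alignment are illustrative assumptions, and the sketch omits refinements such as extending spans over unaligned boundary words. The default `max_len=3` only mirrors the three-word phrase length mentioned in the abstract.

```python
from typing import Set, Tuple

Span = Tuple[int, int]  # inclusive (start, end) word indices


def extract_phrase_pairs(
    src_len: int,
    alignment: Set[Tuple[int, int]],
    max_len: int = 3,
) -> Set[Tuple[Span, Span]]:
    """Collect (source span, target span) pairs consistent with a word alignment.

    `alignment` holds (source_index, target_index) links. This is a
    simplified, hypothetical sketch, not the paper's implementation.
    """
    pairs: Set[Tuple[Span, Span]] = set()
    for s_start in range(src_len):
        for s_end in range(s_start, min(s_start + max_len, src_len)):
            # Target positions linked to any word in the source span.
            tgt_points = [t for (s, t) in alignment if s_start <= s <= s_end]
            if not tgt_points:
                continue  # skip spans with no aligned word
            t_start, t_end = min(tgt_points), max(tgt_points)
            if t_end - t_start + 1 > max_len:
                continue  # target side exceeds the length limit
            # Consistency check: no target word inside the target span
            # may be linked to a source word outside the source span.
            violated = any(
                t_start <= t <= t_end and not (s_start <= s <= s_end)
                for (s, t) in alignment
            )
            if not violated:
                pairs.add(((s_start, s_end), (t_start, t_end)))
    return pairs


if __name__ == "__main__":
    # Toy alignment (assumed for illustration): source word 0 links to
    # target word 0, and source words 1 and 2 link crosswise to 2 and 1.
    toy_alignment = {(0, 0), (1, 2), (2, 1)}
    for pair in sorted(extract_phrase_pairs(3, toy_alignment)):
        print(pair)
```

In a full system, the extracted pairs would be counted over a parallel corpus to estimate phrase translation probabilities; that step, and the lexical weighting mentioned in the abstract, are not shown here.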


