ABSTRACT
Statistical phrase-based machine translation models rely crucially on word alignments. The search for word alignments assumes a model of word locality between source and target languages, an assumption that is violated for language pairs with starkly different word order, such as English-Hindi. In this article, we present models that decouple the steps of lexical selection and lexical reordering, with the aim of minimizing the role of word alignment in machine translation. Indian languages are morphologically rich and have relatively free word order, in which the grammatical role of content words is largely determined by their case markers and not just by their positions in the sentence. Hence, lexical selection plays a far greater role than lexical reordering. For lexical selection, we investigate models that take the entire source sentence into account and evaluate their performance for English-Hindi translation in a tourism domain.