Efficient and High-Quality Neural Machine Translation with OpenNMT

This paper describes the OpenNMT submissions to the WNGT 2020 efficiency shared task. We explore training and acceleration of Transformer models of various sizes, trained in a teacher-student setup. We also present a custom, optimized C++ inference engine that enables fast CPU and GPU decoding with few dependencies. By combining additional optimizations and parallelization techniques, we create small, efficient, and high-quality neural machine translation models.

Guillaume Klein, Dakun Zhang, Clément Chouteau, Josep Crego, Jean Senellart


Published in: "Proceedings of the Fourth Workshop on Neural Generation and Translation", pages 211-217, Association for Computational Linguistics, July 2020

