Robust Translation of French Live Speech Transcripts

Despite a narrowed performance gap with direct approaches, cascade solutions involving automatic speech recognition (ASR) and machine translation (MT) are still widely employed in speech translation (ST).

Direct approaches employing a single model to translate the input speech signal suffer from the critical bottleneck of data scarcity.

In addition, multiple industry applications display speech transcripts alongside translations, making cascade approaches more realistic and practical. In the context of cascaded simultaneous ST, we propose several solutions to adapt a neural MT network to take as input the transcripts output by an ASR system.

Adaptation is achieved by enriching speech transcripts and MT data sets so that they more closely resemble each other, thereby improving the system's robustness to error propagation and making the output more legible for humans.

We address aspects such as sentence boundaries, capitalisation, punctuation, hesitations, repetitions, and homophones, while taking into account the low-latency requirement of simultaneous ST systems.
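To make the idea of bringing MT training data closer to ASR output more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that clean source sentences are lowercased, stripped of punctuation, and randomly augmented with hesitations and word repetitions, while the target side is left untouched. The function name `asr_like`, the filler list, and the probabilities are all hypothetical choices made for illustration.

```python
import random
import re

# Illustrative French fillers; this list is an assumption, not taken from the paper.
HESITATIONS = ["euh", "hum", "ben"]


def asr_like(sentence, rng, p_hesitation=0.1, p_repeat=0.05):
    """Return a noisy, transcript-like variant of a clean source sentence.

    The probabilities are hypothetical defaults chosen for illustration.
    """
    # Lowercase and replace punctuation with spaces, keeping apostrophes
    # and hyphens so that forms such as "l'été" or "allez-vous" survive.
    text = re.sub(r"[^\w\s'-]", " ", sentence.lower())
    # Discard tokens that contain no letter or digit (e.g. a lone hyphen).
    words = [w for w in text.split() if re.search(r"\w", w)]

    noisy = []
    for word in words:
        if rng.random() < p_hesitation:
            noisy.append(rng.choice(HESITATIONS))  # insert a filler before the word
        noisy.append(word)
        if rng.random() < p_repeat:
            noisy.append(word)  # simulate a spoken repetition
    return " ".join(noisy)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so that runs are reproducible
    print(asr_like("Bonjour, comment allez-vous ?", rng))
```

In a training setup along these lines, each noisy source sentence would be paired with its original, well-formed target translation, so that the MT model learns to produce clean output from transcript-like input. Homophone errors and sentence-boundary changes are not covered by this sketch.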

Elise Bertin-Lemée, Guillaume Klein, Josep Crego, Jean Senellart
