The 6 Greatest NMT Breakthroughs

Over the past few years, Neural Machine Translation (NMT) has worked through many of the core constraints that held it back from mainstream research and adoption across the translation space. In 2016, Google’s Neural Machine Translation system (GNMT) promised to bypass issues with computational requirements. By using low-precision arithmetic during inference and by splitting words into common sub-word units, GNMT worked to increase throughput and to improve accuracy on rare words. Today, GNMT is a core component of Google Translate.

NMT has rocketed to the top of translation technology. Salesforce recently launched a new project on GitHub that uses XML tags to increase the accuracy and throughput of NMT solutions. Microsoft launched Custom Translator Version 2, which uses NMT to improve translation capabilities. And execs at both Amazon and Google recently spoke with VentureBeat about the emerging trends in translation, most of which center around neural networks.

We’ve come a long way in a short amount of time. SYSTRAN Group President Jean Senellart recently spoke at SlatorCon 2019, where he discussed some of the up-and-coming R&D happening in the NMT space. Jean has been involved in NMT for many years, dating back to the open-source project OpenNMT. During this time, he has worked on various NMT projects and watched the entire industry go from a future tech held back by computational needs to a technology with widespread adoption in high tech.

So, what’s on the horizon for NMT? What R&D projects are coming? And what does NMT look like in the current tech space?

Domain Specialization 

Perhaps the single biggest drawback of MT in general is its limited understanding of jargon and unique terminology right out of the box. With domain specializations built by SYSTRAN, though, users can adapt translations to reflect the industry-specific meanings behind commonly used words. For example, the word “pin” carries a different meaning in the financial, fashion, and medical worlds. With a “financial” specialized domain, the engine will automatically choose the meaning that matches the rest of the content. Domain specialization is the backbone of efficient and accurate NMT without burdensome training periods.

MT and Translation Memory (TM)

Senellart touched on the MT and TM combo, where MT leverages TM in the same way a translator would during their work. He mentioned one paper to be published this summer that implements in-domain, on-the-fly translation by using TM. Senellart said they had done something similar, called micro-adaptation, at SYSTRAN.
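To make the idea concrete, here is a minimal sketch of the retrieval step: finding the closest TM entries for a new sentence, the way a translator would look up fuzzy matches. It is not SYSTRAN’s micro-adaptation, nor the paper Senellart mentioned. The tiny TM, the function names, the character-level similarity from Python’s standard `difflib`, and the 0.6 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: (source segment, human translation) pairs.
TRANSLATION_MEMORY = [
    ("Enter your PIN to confirm the payment.",
     "Saisissez votre code PIN pour confirmer le paiement."),
    ("Pin the fabric before sewing.",
     "Épinglez le tissu avant de coudre."),
    ("The surgeon inserted a pin into the bone.",
     "Le chirurgien a inséré une broche dans l'os."),
]


def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio between 0 and 1 (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def retrieve_fuzzy_matches(source, memory, threshold=0.6, top_k=2):
    """Return up to top_k (score, tm_source, tm_target) tuples, best first.

    Only entries whose similarity to `source` reaches `threshold` are kept.
    """
    scored = [(similarity(source, src), src, tgt) for src, tgt in memory]
    matches = [m for m in scored if m[0] >= threshold]
    matches.sort(key=lambda m: m[0], reverse=True)
    return matches[:top_k]


if __name__ == "__main__":
    query = "Enter your PIN to confirm the transfer."
    for score, src, tgt in retrieve_fuzzy_matches(query, TRANSLATION_MEMORY):
        print(f"{score:.2f}  {src}  ->  {tgt}")
```

The retrieved pairs are what an engine could then adapt to on the fly; how they are fed to the model is left out here on purpose, since the source does not describe it.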

MT and Post-editing 

Given that NMT has proven a bit more useful than statistical MT, post-editing will continue to be a rising trend. While noting the advent of neural post-editing with no human intervention involved, Senellart explained a new post-editing approach that uses post-edited TM to dynamically retrain the NMT model with updated data. In this case, the post-editor’s TM data (already corrected and “annotated”) feeds back into retraining the NMT model.

“Where is the human in the loop?” Senellart asked the SlatorCon audience. The human in this loop is both the post-editor and data annotator.
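The data-collection side of that loop can be sketched in a few lines. This is only an illustration under stated assumptions: the `PostEditBuffer` class, its field names, and the batch size are hypothetical, and the retraining step itself is left out because the source does not say how it is done.

```python
from dataclasses import dataclass, field


@dataclass
class PostEditBuffer:
    """Collects post-edited segments until there are enough to retrain on."""
    batch_size: int = 3
    pending: list = field(default_factory=list)

    def add(self, source: str, mt_output: str, post_edited: str):
        """Store one post-edited segment.

        Returns a list of training examples once `batch_size` segments have
        accumulated (and empties the buffer); otherwise returns None.
        """
        self.pending.append({
            "source": source,
            "target": post_edited,                    # the human-corrected translation
            "was_edited": mt_output != post_edited,   # did the post-editor change anything?
        })
        if len(self.pending) >= self.batch_size:
            batch, self.pending = self.pending, []
            return batch
        return None


if __name__ == "__main__":
    buffer = PostEditBuffer(batch_size=2)
    buffer.add("Hello", "Bonjour", "Bonjour")
    batch = buffer.add("Thanks", "Merci bien", "Merci")
    print(batch)  # a batch that a (hypothetical) retraining job would consume
```

Every segment the post-editor touches becomes a labeled example, which is why the same person acts as both editor and annotator.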

Low-resource Languages

There are hundreds of languages used online for which training data for MT models are sparse, and high-quality corpora are sparser still. Most Asian languages, for instance, are low-resource. Senellart noted an impressive, fully unsupervised NMT model researched by Facebook that managed to increase output quality nearly three times in just 18 months of development.

“This field is one of the most promising,” Senellart said, adding that Facebook is now working on self-supervised models as well.

Beyond the Sentence

Another challenge facing researchers is adding external context to NMT output, translating beyond the sentence. “Today, every NMT system in the world is sentence-based. We know that’s not enough. If you are to translate a document, you need to know what the document is talking about,” Senellart said. “You need to make connections between sentences. If there are pronouns, you need the correct anaphorization between sentences, for instance.”

He mentioned two approaches to beyond-sentence NMT currently gaining steam: one where the NMT model refers to a previously translated sentence for context; another, from Unbabel, which uses essential keywords found in the entire source document to inform the translation output.
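The first approach can be pictured as a simple preprocessing step in which each input carries its predecessor along as context. The sketch below assumes the context is the previous source sentence and that a `<sep>` marker separates it from the sentence to translate. Both are illustrative choices, not a description of any particular system.

```python
def with_previous_context(sentences, sep="<sep>"):
    """Prefix each sentence with the one before it, so a model can see context.

    The first sentence has no predecessor and is returned unchanged.
    """
    inputs = []
    previous = ""
    for sentence in sentences:
        if previous:
            inputs.append(f"{previous} {sep} {sentence}")
        else:
            inputs.append(sentence)
        previous = sentence
    return inputs


if __name__ == "__main__":
    document = ["She bought a car.", "It is red."]
    for line in with_previous_context(document):
        print(line)
```

With the first sentence attached, the pronoun “It” in the second input can be tied back to “a car”, which is exactly the kind of cross-sentence link Senellart described.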

Multilingual Translation

Senellart also explained how multilingual translation systems, including those capable of zero-shot translation, can translate between 100 languages through a single model.

“One single model to translate all languages simultaneously,” is how Senellart explained multilingual translation in a nutshell. “Training multilingual models is very exciting because it’s close to unsupervised learning in that each language helps others. When you are translating English to French, then you’re helping out Spanish, for example, because there are many similarities. The model is discovering what is similar between languages and learning more general translation rules.”
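One common way to get a single model to handle many language pairs is to mark each source sentence with a token naming the target language. The sketch below shows only that data-preparation step. The `<2xx>` token format, the function names, and the toy corpora are assumptions for illustration, not a description of SYSTRAN’s engines.

```python
def add_target_token(source_sentence: str, target_lang: str) -> str:
    """Prefix the source sentence with a token naming the desired target language."""
    return f"<2{target_lang}> {source_sentence}"


def build_examples(corpora):
    """Merge several parallel corpora into one training set for a single model.

    `corpora` maps (source_lang, target_lang) to a list of (source, target) pairs.
    """
    examples = []
    for (_source_lang, target_lang), pairs in corpora.items():
        for source, target in pairs:
            examples.append((add_target_token(source, target_lang), target))
    return examples


if __name__ == "__main__":
    corpora = {
        ("en", "fr"): [("Good morning", "Bonjour")],
        ("en", "es"): [("Good morning", "Buenos días")],
    }
    for example in build_examples(corpora):
        print(example)

    # A zero-shot request uses the same format for a pair never seen in training.
    print(add_target_token("Bonjour", "es"))
```

Because all pairs share one model and one input format, a request such as French to Spanish can be expressed even though no French to Spanish examples were in the training set.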

Final Thoughts

From lexical micro-adaptation (where on-the-fly micro-sets of training instances are retrieved and paired with human translations) for MT/TM facilitation to beyond-sentence and self-supervised models, NMT is on the verge of multiple massive breakthroughs. At SYSTRAN, we’re continually pushing the boundaries of NMT to create more fluid, accurate, and trustworthy neural translation models. Are you looking to use NMT to expedite customer service, maximize eDiscovery and digital forensic throughput, or increase governance and compliance? From Industry 4.0 to government-driven solutions, SYSTRAN can help you leverage NMT to bring tangible value to your organization. Contact us to learn more.