Open Source, Multilingual AI and Artificial Neural Networks: The New Holy Grail for the GAFA

Since 2016, there has been a sharp increase in open source machine translation projects based on neural networks, known as Neural Machine Translation (NMT), led by companies such as Google, Facebook and SYSTRAN. Why have machine translation and NMT-related innovations become the new Holy Grail for tech companies? And does the future of these companies depend on machine translation?

Never before has a technological field undergone so much disruption in such a short time. Invented in the 1960s, machine translation relied on grammatical and syntactic rules until 2007. Statistical modelling (known as statistical machine translation, or SMT), which matured largely thanks to the abundance of data, then took over. Although statistical translation was introduced by IBM in the 1990s, it took 15 years for the technology to reach mass adoption. Neural Machine Translation, on the other hand, took only two years to be widely adopted by the industry after academia introduced it in 2014, which shows how quickly innovation in this field is accelerating. Machine translation is currently experiencing a technological golden age.

From Big Data to Good Data

Not only have these successive waves of technology differed in their pace of development and adoption, but their key strengths or “core values” have also changed. In rule-based translation, the value lay in code and accumulated linguistic resources. For statistical models, the amount of data was paramount: the more data you had, the better the quality of your translation and the higher your BLEU score (Bilingual Evaluation Understudy, the most widely used metric for machine translation quality). Now, the move to machine translation based on neural networks and Deep Learning is well underway and has brought about major changes. The engines are trained to learn language as a child does, progressing step by step. The challenge is not only to process exponentially growing volumes of data (Big Data) but, more importantly, to feed the engines the highest-quality data possible. Hence the interest in “Good Data.”
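
For readers curious about how BLEU works, here is a minimal Python sketch. It assumes a single reference translation, whitespace tokenization and no smoothing. The function names (ngrams, bleu) are illustrative and do not come from any SYSTRAN tool. Real evaluations usually rely on an established implementation and score a whole test set rather than one sentence.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against one reference (illustrative only)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped counts: a candidate n-gram is credited at most as many
        # times as it appears in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # this sketch applies no smoothing
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the 1-gram to max_n-gram precisions.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(bleu("the cat sat on the mat", reference))  # identical sentences
print(bleu("the cat sat on a mat", reference))    # one word differs
```

An exact match scores highest, and the score falls as fewer word sequences of the candidate are found in the reference.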

SYSTRAN Demos Two New Integrations for Relativity at Ing3nious’ SoCal E-Discovery & Information Governance Retreat

This article was originally published on PR Newswire.

SYSTRAN, a global leader in language translation technology, demonstrated two new integrations, aDiscovery and Anonymizer, for SYSTRAN’s offering for Relativity at Ing3nious’ SoCal E-Discovery & Information Governance Retreat, November 13-14.
The aDiscovery feature aids audio discovery by transcribing audio files, detecting the source language and then translating the content. Anonymizer applies rigorous anonymization techniques to the full text and metadata of electronic documents within Relativity.

Anonymization can be used to mask identifying details in documents, such as names, addresses, identification numbers, places and amounts. The anonymized documents nevertheless retain sufficient information for most relevancy reviews. Users can also “pseudonymize” selected names, replacing pre-identified names with chosen pseudonyms on a mass basis, which provides another option for privacy protection.

Gartner predicts that by 2020, 80 percent of litigation will involve multiple languages. The new features are meant to assist with multi-language or cross-border e-discovery, giving legal teams a cost-effective way to translate files efficiently.

“These new integrations are going to change the way modern day legal teams handle multi-language files during e-discovery by maximizing productivity through automatic translation and Natural Language Processing (NLP),” says Ken Behan, SYSTRAN Vice President of Sales & Marketing. “These methods are far more efficient than manual translation and will save firms time and money during the e-discovery process.”

In addition to the new integrations, SYSTRAN’s offering for Relativity automatically detects languages of files, translates documents that have multiple languages, and bulk translates using the Mass Action feature in Relativity. Organizations using Relativity are also able to support their billing process by accurately reflecting the workload completed.

To learn more about SYSTRAN’s offering for Relativity, visit http://www.systransoft.com/translation-products/integrations/cmless-for-relativity.

About SYSTRAN

For over 48 years, SYSTRAN has been transforming the way global organizations such as Apple, Adobe, Daimler, HSBC, and Symantec meet the challenges of communicating globally through advanced machine translation technology. With the ability to facilitate communication in over 130 languages and 20 vertical domains, SYSTRAN enables instantaneous and automatic multilingual translation of texts, emails, chat, web pages, mobile apps, documents, user-generated content and more.

For more on SYSTRAN, visit www.systrangroup.com

Related Links

SYSTRAN for Relativity – More Information

Join SYSTRAN at the 1st Hackathon on NLP at Google’s Paris Office, on a Euro 2016 Theme!

During the joint conference JEP-TALN-RECITAL 2016, the first edition of a hackathon dedicated to NLP (Natural Language Processing) will take place at Google’s Paris office from July 2nd to July 4th. It is co-organised by the “Text, Computing and Multilingualism” research centre.

The aim is to bring the community together around data and software tools to exchange ideas, model, prototype, code, implement, develop, test, assess… and much more!

The proposed tasks cover event detection and the implementation of a dialogue management system. The chosen theme is Euro 2016, which provides a practical use case and data (tweets and structured data), and could also make real-time experiments possible.
SYSTRAN, the leading provider of language translation technologies, is organizing the event detection session and will award a special prize to the winner.

Detailed program and registration here (in French).

Hackatal July 2-4, Paris