Our webinar “Get More From SPNS9” on May 15th, 2020 was a huge success. The webinar demonstrated six exciting new upgrades in SYSTRAN Pure Neural Server 9.6 that further extend its technological capabilities. Thank you to those who joined us.
In this post, we have compiled the highlights from the presentation and answers to the questions we received afterward.
The minds behind SYSTRAN sit down for an interview regarding the complexities and the capacities of specialized neural machine translation engines.
Participants: Peter Zoldan, Senior Data Engineer – Software Engineer, Linguistic Program; Svetlana Zyrianova, Linguistic Program; Petra Bayrami, Jr. Software Engineer – Linguistic Program; Natalia Segal, R&D Engineer.
How much data is required to create a specialized engine?
The more bilingual data, the better the quality. For broad domains such as news, millions of bilingual sentences are required. However, if the domain is narrow, such as technical support documents for certain products, then even a small set of 50,000 sentences noticeably improves the quality.
The amount of data required depends on how broad or narrow the domain is that you are specializing the engine for.
Language is messy. Ask any person who has ever had to learn a second language and they will tell you that the most difficult aspect isn’t learning all the rules, but understanding the exceptions to the rules — the real-world application of the language.