Trends in Neural Machine Translation

by Isabella Massardo | Sep 3, 2020

Last month, OpenAI launched the closed beta version of GPT-3 (Generative Pre-trained Transformer 3) to show the potential of the model. As the number of people with access to the program starts to grow, a select group of investors, experts, and journalists has shared the results of their experiments on social media.

The principles that guide GPT-3 are simple, at least conceptually: a machine-learning algorithm analyzes the statistical patterns in a trillion words drawn from digitized books and web discussions. The result is fluent text, even if in the long run the software shows its logical limits when subjected to complex reasoning, as is often the case with this kind of software. And although some experts tested GPT-3’s ability to translate and obtained impressive results with very little input, we’re still a far cry from the universal translator described in Murray Leinster’s First Contact or other fantastic devices found in more popular sci-fi books.

So, it might be useful to be reminded of the current state of technology in the real world and, at the same time, have an overview of where things are going. For this reason, Wordbee organized a panel with four experts to discuss what we can expect from neural machine translation in the near future.

Machine learning and neural machine translation

Machine learning (ML) is a branch of computer science that can be considered a subfield of artificial intelligence. Its characteristics and applications are not always easy to define simply, since the field of application is wide and ML relies on many different methods, techniques, and tools.

But the question that interests us is more specific: how is machine learning applied in computational linguistics and natural language processing?

One might say that there is no big difference between machine learning and neural machine translation (NMT). Problems like developing a machine learning model, adapting an existing one, deploying it, and making sure it delivers high-quality results can also be found in the field of machine translation. On the other hand, machine translation handles unstructured data, so we need specific models that can help find the structure (patterns) in a dataset.

For many years, language service providers have tried to find the ideal use case for machine translation and make it work for customers and for themselves. Until more or less five years ago, the main discussion was centered around the productivity of machine translation and the usefulness of post-editing. After many benchmarks, academic papers, and conferences on these topics, in 2020 the discussion has finally moved forward.

Our panelists agreed on the estimate that 80% of the training data used for generic NMT is useful. As Maxim Khalilov, Head of R&D at Glovo, suggests, this means that we are on the cusp of a new era, in which machine learning plays a new and important role in distinguishing between good and bad translations.

Quality Estimation: A game-changer?

A new industry paradigm might emerge, with QA, QC, and QE as essential elements. By the way, if these acronyms are making your head spin, we’ve got you covered with this previous article.

When it comes to the topic of quality and machine translation in 2020, what can we expect for the next few years?

As a machine-learning technology, a quality estimation (QE) algorithm automatically assigns a quality indicator to a machine-translation output without having access to a human-generated reference translation. The technology itself has been around for a while, but only a few companies have the financial and human resources necessary to experiment with QE in a production environment. Yuka Nakasone, Intento Inc.’s Globalization and Localization Director, states that in 2020 the technology for QE of machine translation systems will be productized at scale, and we will probably see the rise of hybrid MT-QE systems.
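To make the idea of a reference-free quality indicator concrete, here is a minimal sketch. It is a hand-written toy heuristic, not a trained QE model of the kind the panel describes: the function name `toy_qe_score` and its two features (length ratio and share of tokens copied from the source) are assumptions made purely for illustration.

```python
def toy_qe_score(source: str, translation: str) -> float:
    """Return a rough quality indicator in [0, 1].

    Only the source and the MT output are used; no human reference
    translation is needed. Real QE systems learn this mapping from data.
    """
    src_tokens = source.lower().split()
    mt_tokens = translation.lower().split()
    if not src_tokens or not mt_tokens:
        return 0.0

    # Feature 1: length ratio. Outputs far shorter or longer than the
    # source are treated as suspicious.
    shorter = min(len(src_tokens), len(mt_tokens))
    longer = max(len(src_tokens), len(mt_tokens))
    length_ratio = shorter / longer

    # Feature 2: share of output tokens copied verbatim from the source,
    # a crude signal for text that was left untranslated.
    src_vocab = set(src_tokens)
    copied = sum(1 for tok in mt_tokens if tok in src_vocab) / len(mt_tokens)

    return round(length_ratio * (1.0 - copied), 3)


if __name__ == "__main__":
    print(toy_qe_score("Das Haus ist klein", "The house is small"))
    print(toy_qe_score("Das Haus ist klein", "Das Haus ist klein"))
```

A production system would replace the two hand-picked features with a model trained on human quality judgments, but the interface stays the same: source and MT output in, score out.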

This development could prove particularly interesting for machine translation providers. When deploying an MT system, the main factors to be reckoned with are the usual ones: time, cost, and quality. QE technology can allow tech providers to play with quality boundaries while trying to strike the right balance between cost and time.
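One way to picture "playing with quality boundaries" is threshold-based routing: a QE score decides how much human effort a segment receives. The sketch below is only an illustration; the function name `route_segment`, the three route labels, and the default thresholds are hypothetical and would need tuning for any real workflow.

```python
def route_segment(qe_score: float,
                  publish_at: float = 0.9,
                  post_edit_at: float = 0.5) -> str:
    """Map a QE score in [0, 1] to a workflow step.

    The thresholds are illustrative assumptions. Raising them buys
    quality at the price of more human time and cost; lowering them
    does the opposite.
    """
    if qe_score >= publish_at:
        return "publish"            # use the raw MT output as is
    if qe_score >= post_edit_at:
        return "post-edit"          # send to a human post-editor
    return "human translation"      # MT output judged too poor to reuse


if __name__ == "__main__":
    for score in (0.95, 0.7, 0.2):
        print(score, "->", route_segment(score))
```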

According to Paula Reichenberg, CEO at Hieronymus, two other interesting uses of QE technology could be a) assessing the quality of the data used to train an NMT engine and b) identifying the best NMT engine for translating a specific document. This would be particularly interesting in complex and highly specialized fields like law and pharmaceuticals. Google and Microsoft are already using this QE technology: the innovation will be making QE available to the public.
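Use b) can be sketched as follows: translate the document with each candidate engine, score every segment with a QE function, and keep the engine with the best average. The helper name `pick_engine_for_document`, the engine names, and the stand-in scorer are all hypothetical; any QE function that takes a source and an MT output and returns a number could be plugged in.

```python
from statistics import mean
from typing import Callable, Dict, List


def pick_engine_for_document(
    sources: List[str],
    outputs_by_engine: Dict[str, List[str]],
    scorer: Callable[[str, str], float],
) -> str:
    """Return the name of the engine with the highest mean QE score.

    `outputs_by_engine` maps an engine name to its translations of
    `sources`, in the same order as the source segments.
    """
    averages = {
        engine: mean(scorer(src, mt) for src, mt in zip(sources, outputs))
        for engine, outputs in outputs_by_engine.items()
    }
    return max(averages, key=averages.get)


if __name__ == "__main__":
    # Stand-in scorer (an assumption): penalize segments left untranslated.
    def dummy_scorer(src: str, mt: str) -> float:
        return 0.0 if src == mt else 1.0

    document = ["Guten Morgen", "Vielen Dank"]
    outputs = {
        "engine_a": ["Guten Morgen", "Thank you very much"],
        "engine_b": ["Good morning", "Thank you very much"],
    }
    print(pick_engine_for_document(document, outputs, dummy_scorer))
```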

Tighter integration and adaptive systems

Samuel Läubli, CTO at TextShuttle, underlines another interesting development: the interplay between various tools, especially CAT tools, and NMT, in combination with translation memories and term bases. The current level of integration, which allows translators to post-edit the suggestions of the NMT system to which the CAT tool is connected through an API, will become even tighter.

Just as with statistical machine translation (SMT) in 2015, there is now talk of adaptive NMT systems. Thanks to adaptive technology, an NMT system can “learn” on the fly and improve during post-editing. To this end, translation memories are essential: they need to be relevant, precise, and of good quality. The same goes for term bases, although terminology integration will probably remain a pain point for morphologically rich languages.

Context-aware MT

Traditionally, MT systems translated phrase by phrase, and the translation of isolated units brought about some obvious limitations. The effort now is to develop document-level machine translation systems, so that, in order to translate a sentence, the MT engine will look at the preceding and following sentences. Google has recorded some progress in this field.
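The input side of such a system can be illustrated with a short sketch that gathers the neighboring sentences for each sentence to be translated. How an engine actually consumes that context is model-specific and not shown here; the function name `with_context` and the `window` parameter are assumptions for illustration.

```python
from typing import Dict, List, Union


def with_context(sentences: List[str], index: int,
                 window: int = 1) -> Dict[str, Union[str, List[str]]]:
    """Collect up to `window` sentences before and after sentences[index]."""
    start = max(0, index - window)
    end = min(len(sentences), index + window + 1)
    return {
        "previous": sentences[start:index],       # context before
        "current": sentences[index],              # sentence to translate
        "following": sentences[index + 1:end],    # context after
    }


if __name__ == "__main__":
    doc = ["She picked up the bat.", "It was made of wood.", "Then she swung."]
    print(with_context(doc, 1))
```

In the example document, the neighboring sentences are what tell a reader that the "bat" is a piece of sports equipment, which is exactly the kind of information an engine working on isolated units cannot see.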

Other potential trends are emerging as well: How do you choose an NMT engine in terms of verticals and language pairs? Do you need various NMT engines to handle multilingual content? Is hyper-specialization of NMT engines for specific segments a possibility? And most importantly, how do you choose which trends to follow? It is, of course, important to stay up to date with technological developments, but each new “thing” needs to be evaluated based on the problems that your own company needs to solve, the scalability of a solution, the availability of open-source code, and much more.

Wordbee integrates with a variety of MT engines and is ready to assist you in integrating technological solutions into your translation workflow. Contact us for a free consultation.