New research at Facebook makes it easier to translate between languages that lack many translation examples, for example, from Urdu to English.
Neural Machine Translation
Neural Machine Translation (NMT) is the field concerned with using AI to translate between languages, such as English and French. In 2015, researchers at the Montreal Institute for Learning Algorithms developed new AI techniques which allowed machine-generated translations to finally work. Almost overnight, systems like Google Translate became dramatically better.
While that leap was significant, it still required having sentence pairs in both languages, for example, "I like to eat" (English) and "me gusta comer" (Spanish). For language pairs like Urdu and English, where few of these pairs exist, translation systems failed miserably. Since then, researchers have been building systems that can translate without sentence pairings, i.e., Unsupervised Neural Machine Translation (UNMT).
In the past year, researchers at Facebook, NYU, the University of the Basque Country, and Sorbonne Universités have made dramatic advances that finally enable systems to translate without knowing that "house" means "casa" in Spanish.
Just a few days ago, Facebook AI Research (FAIR) published a paper showing a dramatic improvement that enables translation between languages like Urdu and English. "To give some idea of the level of advancement, an improvement of 1 BLEU point (a common metric for judging the accuracy of MT) is considered a remarkable achievement in this field; our methods showed an improvement of more than 10 BLEU points."
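For intuition about the metric itself: BLEU scores a candidate translation by how many of its n-grams appear in a reference translation. A simplified sentence-level sketch (real BLEU implementations add smoothing and support multiple references) looks like this:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram
    precisions, times a brevity penalty for short candidates."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        if overlap == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(ref) / len(cand)))
    return bp * math.exp(sum(log_precisions) / max_n)

print(100 * bleu("the cat sat on the mat", "the cat sat on the mat"))  # 100.0
print(100 * bleu("the cat sat on a mat", "the cat sat on the mat"))    # partial credit
```

A perfect match scores 100, and each wrong word drags down every n-gram precision it touches, which is why gaining even 1 point on a benchmark is hard.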
Why this matters
Labeled data is often the largest bottleneck in AI systems: obtaining it usually means paying humans to do manual translations, which can be prohibitively time-consuming and expensive. The advances this recent paper highlights provide new ways of training systems without needing to generate this labeled data. Analogous examples include determining whether there's a cat in a photo without any photos labeled "cat", or question-answering systems where the system is never told the correct answer.
From a social sciences perspective, it could allow us to translate documents written in lost languages, or enable new devices that translate between rare language pairs, for example Swahili and Belarusian, in real time.
We could also imagine abstracting this idea to "translate" between arbitrary domains: for example, from neural activity in the brain to video on a screen, or from the performance of one stock given a news event to the projected performance of another stock given a similar event.
How it works
Here I explain how the system works without getting into the nitty-gritty details of the math and AI principles.
Facebook's system combines three core components developed in previous research:
- Byte-pair encodings: Instead of giving the system whole words, they give it words in parts. For example, the word "hello" might be given as four pieces: "he", "l", "l", "o". This means the system can learn about the piece "he" without ever having seen the word "he" on its own, and can represent rare or unseen words by combining pieces it already knows.
- Language model: They train a neural network for each language to generate sentences that "sound good" in that language. For example, such a network might correct the sentence "how is you" to "how are you".
- Back-translation: This is a trick where another neural network learns to translate in the reverse direction. For example, to translate from Spanish to English, we would also teach a network to translate from English to Spanish and use it to generate synthetic Spanish-English sentence pairs, thereby increasing the amount of training data.
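As a toy illustration of the first idea (not FAIR's actual implementation), byte-pair encoding starts from individual characters and repeatedly merges the most frequent adjacent pair of symbols across a corpus:

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merge rules: repeatedly merge the most frequent
    adjacent symbol pair across the corpus."""
    # Each word starts as a sequence of single characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the chosen merge to every word in the vocabulary.
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges, vocab

words = ["hello", "help", "held", "hero"] * 5
merges, vocab = learn_bpe(words, 2)
print(merges)  # [('h', 'e'), ('he', 'l')]
```

On this tiny corpus the first merges produce "he" and "hel", the shared pieces of "hello", "help", and "held", which is exactly the sharing that lets the system handle words it has never seen whole.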
The rest of the system combines the above techniques through two approaches: a neural network-based system (NMT) and a phrase-based system (PBSMT). While either approach improves translation quality on its own, using both together produces the impressive new results.
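The back-translation trick above can be sketched as a simple data-generation loop. Here `reverse_model` is a toy stand-in (a word-by-word dictionary lookup), not a real trained model:

```python
def back_translate(english_sentences, reverse_model):
    """Generate synthetic (source, target) pairs by translating
    monolingual target-language text backward into the source language."""
    pairs = []
    for en in english_sentences:
        synthetic_es = reverse_model(en)  # backward model: English -> Spanish
        # Train the forward Spanish -> English model on (synthetic_es, en):
        # the target side is real, fluent text; only the source is synthetic.
        pairs.append((synthetic_es, en))
    return pairs

# Stand-in "model": a toy word-by-word dictionary lookup.
toy_dict = {"i": "yo", "like": "gusta", "to": "a", "eat": "comer"}
reverse_model = lambda s: " ".join(toy_dict.get(w, w) for w in s.lower().split())

pairs = back_translate(["I like to eat"], reverse_model)
print(pairs)  # [('yo gusta a comer', 'I like to eat')]
```

The key design point is that the synthetic noise lands only on the input side, so the forward model still learns to produce fluent output.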
Learning to close a cup
The version of PBSMT used in this paper is one developed previously at FAIR. This system learns a probability distribution over the phrases in each language, then teaches another system to rotate the data points of the second distribution so that they line up with the first.
Example: Imagine having two images, one where a cup and a lid are next to each other and one where the lid is on the cup. The system would learn how to move the pixels around the image without a lid to generate the image with a lid.
Nice parts of this research
FAIR researchers did an amazing job at making this work accessible.
- This nice post has a slightly more technical description of the research.
- Facebook also open-sourced the code, allowing anyone to build these systems.
- Finally, the authors did a great job with an ablation study, which measures the effect of removing each component of the system on the final results. This step is often overlooked in research papers, but it gives researchers great insight into which parts of a new system are actually responsible for the improvements.
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).
Lample, Guillaume, Myle Ott, Alexis Conneau, Ludovic Denoyer, and Marc'Aurelio Ranzato. "Phrase-Based & Neural Unsupervised Machine Translation." arXiv preprint arXiv:1804.07755 (2018).
Sennrich, Rico, Barry Haddow, and Alexandra Birch. "Neural machine translation of rare words with subword units." arXiv preprint arXiv:1508.07909 (2015).
Sennrich, Rico, Barry Haddow, and Alexandra Birch. "Improving neural machine translation models with monolingual data." arXiv preprint arXiv:1511.06709 (2015).
Conneau, Alexis, et al. "Word translation without parallel data." arXiv preprint arXiv:1710.04087 (2017).