In the late 1980s, Canadian master’s student Yoshua Bengio became captivated by an unfashionable idea. A handful of artificial intelligence researchers was trying to craft software that loosely mimicked how networks of neurons process data in the brain, despite scant evidence it would work. “I fell in love with the idea that we could both understand the principles of how the brain works and also construct AI,” says Bengio, now a professor at the University of Montreal.
More than 20 years later, the tech industry fell in love with that idea, too. Neural networks are behind the recent bloom of progress in AI that has enabled projects such as self-driving cars and phone bots practically indistinguishable from people.
Wednesday, Bengio, 55, and two other protagonists of that revolution won the highest honor in computer science, the ACM Turing Award, known as the Nobel Prize of computing. The other winners are Google researcher Geoff Hinton, 71, and NYU professor and Facebook’s chief AI scientist Yann LeCun, 58, who wrote some of the papers that seduced Bengio into working on neural networks.
The trio’s journey is a parable of scientific grit and a case study in the economic value of new forms of computing. Through decades of careful research out of the limelight, they transformed an old-fashioned, marginalized idea into the hottest in computer science. The technology they championed is central to every large tech company’s strategy for the future. It’s how software in testing at Google reads medical scans, Tesla’s Autopilot reads road markings, and Facebook automatically removes some hate speech.
Asked what winning the Turing Award means, Hinton expresses mock surprise. “I guess neural networks are now respectable computer science,” he says. The joke is that in computer science, there isn’t anything more respectable than a Turing Award. It has been awarded annually since 1966, and is named after Alan Turing, the British mathematician who laid early foundations for computing and AI in the 1930s, 40s, and 50s.
Pedro Domingos, a professor at the University of Washington who leads machine learning research at hedge fund DE Shaw, says it’s beyond time that deep learning was recognized. “This was long overdue,” he says. Domingos’ 2015 book The Master Algorithm surveyed five “tribes” taking different approaches to AI, including the “connectionists” working on neural networks.
Awarding the Turing to that tribe acknowledges a shift in how computer scientists solve problems, he says. “This is not just a Turing Award for these particular people. It’s recognition that machine learning has become a central field in computer science,” says Domingos.
The discipline has a long tradition of valuing mathematically proven solutions for problems. Machine learning algorithms get things done in a messier way, following statistical trails in data to find methods that work well in practice even if it’s not clear exactly how. “Computer science is a form of engineering and what matters is whether you get results,” Domingos says.
Neural networks are one of the oldest approaches to artificial intelligence, becoming established at the field’s beginnings in the late 1950s. Researchers adapted simple models of brain cells created by neuroscientists into mathematical networks that could learn to sort data into categories by filtering it through a series of “neurons.”
Early successes included the room-filling Perceptron, which could learn to distinguish shapes on a screen. But it was unclear how to train large networks with many layers of neurons, to allow the technique to go beyond toy tasks.
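For readers who want to see the idea in miniature, here is a rough sketch of the kind of learning rule a single artificial "neuron" can use to sort inputs into two categories. It is an illustration only, not the original Perceptron hardware or software: the function names, the toy data (the logical OR of two inputs), and the settings are all assumptions chosen for the example.

```python
def predict(weights, bias, inputs):
    """Fire (return 1) if the weighted sum of the inputs exceeds zero."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Adjust weights with the classic perceptron rule: nudge them only after a mistake."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in samples:
            error = label - predict(weights, bias, inputs)  # -1, 0, or +1
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical toy task: learn the logical OR of two binary inputs.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_perceptron(or_samples)
predictions = [predict(weights, bias, inputs) for inputs, _ in or_samples]
print(predictions)
```

A single neuron like this can only separate categories with a straight line, which is one way to see why researchers wanted networks with many layers.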
Hinton showed the solution to training so-called deep networks. He coauthored a seminal 1986 paper on a learning algorithm called backpropagation. That algorithm, known as backprop, is at the heart of deep learning today, but back then the technology wouldn’t quite come together. “There was a blackout period between the mid-90s and the mid-2000s where essentially nobody but a few crazy people like us were working on neural nets,” says LeCun.
LeCun’s contributions included convolutional networks, or convnets, neural network designs he invented that are well suited to images; he proved the concept by creating check-reading software for ATMs at Bell Labs. Bengio pioneered methods to apply deep learning to sequences, such as speech, and to understanding text. But the wider world only caught on to deep learning early in this decade, after researchers figured out how to harness the power of graphics processors, or GPUs.
One crucial moment took place in 2012, when Hinton, then at the University of Toronto, and two grad students surprisingly won an annual contest for software that identifies objects in photos. Their triumph left the field’s favored methods in the dust, correctly sorting more than 100,000 photos into 1,000 categories within five guesses with 85 percent accuracy, more than 10 percentage points better than the runner-up. Google acquired a startup founded by the trio early in 2013 and Hinton has worked for the company ever since. Facebook hired LeCun later that year.
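Scoring "within five guesses" is commonly called top-5 accuracy: an answer counts as correct if the true category is among the five the software rates most likely. The sketch below shows how such a score could be computed. The function name and the tiny made-up scores are assumptions for illustration, not the contest's actual evaluation code or data.

```python
def top_k_accuracy(scores_per_image, true_labels, k=5):
    """Fraction of images whose true category is among the k highest-scored guesses."""
    hits = 0
    for scores, truth in zip(scores_per_image, true_labels):
        # Rank category indices from most to least confident.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical example: three images, three categories, scored with k=2.
toy_scores = [
    [0.1, 0.6, 0.3],  # true category 2 is the second guess: counts as a hit
    [0.7, 0.2, 0.1],  # true category 0 is the first guess: hit
    [0.2, 0.3, 0.5],  # true category 0 is the third guess: miss
]
toy_labels = [2, 0, 0]
print(top_k_accuracy(toy_scores, toy_labels, k=2))
```

Allowing several guesses is more forgiving than demanding one exact answer, which matters when a photo could plausibly fit more than one of 1,000 categories.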
“You can look back on what happened and think science worked the way it's meant to work,” Hinton says. That is, “until we could produce results that were clearly better than the current state of the art, people were very skeptical.”
Hinton says he and his collaborators stuck with their unfashionable ideas for so long because they are mavericks at heart. All three are now part of the academic and tech industry mainstream. Hinton and LeCun are vice presidents at two of the world’s most influential companies. Bengio has not joined a tech giant, but is an adviser to Microsoft and has worked with startups adapting deep learning to tasks such as drug discovery and helping victims of sexual harassment.
The three have gone in different directions, but remain collaborators and friends. Asked whether they will deliver the traditional Turing Award lecture together, Hinton raises chuckles by suggesting Bengio and LeCun go first so he can give his own lecture about what they got wrong. Does that joke reflect the trio’s typical working dynamic? Hinton says “no” at the same time LeCun good-naturedly says “yes.”
Despite deep learning’s many practical successes, there’s still much it can’t do. Neural networks are brain-inspired but not much like the brain. The intelligence deep learning gives computers can be exceptional at narrowly defined tasks—play this particular game, recognize these particular sounds—but isn’t adaptable and versatile like human intelligence.
Hinton and LeCun say they would like to end the dependence of today’s systems on explicit and extensive training by people. Deep learning projects depend on an abundant supply of data labeled to explain the task at hand—a major limitation in areas such as medicine. Bengio highlights how despite successes such as better translation tools, the technology is far from able to properly understand language.
None of the trio claim to know how to solve those remaining challenges. They advise anyone hoping to make the next Turing-winning breakthrough in AI to emulate their own willingness to ignore mainstream ideas. “They should not follow the trend—which right now is deep learning,” Bengio says.