On November 21, I read an interview with Yoshua Bengio in Technology Review that to a suprising degree downplayed recent successes in deep learning, emphasizing instead some other important problems in AI might require important extensions to what deep learning is currently able to do. In particular, Bengio told Technology Review that,
I think we need to consider the hard challenges of AI and not be satisfied with short-term, incremental advances. I’m not saying I want to forget deep learning. On the contrary, I want to build on it. But we need to be able to extend it to do things like reasoning, learning causality, and exploring the world in order to learn and acquire information.
I agreed with virtually every word and thought it was terrific that Bengio said so publicly. I was also struck by what seemed to be (a) an important change in view, or at least framing, relative to how advocates of deep learning framed things a few years ago (see below), (b) movement towards a direction for which I had long advocated, and (c) noteworthy coming from Bengio, who is, after all, one of the major pioneers in deep learning
So I tweeted it, expecting a few retweets and nothing more. Instead I accidentally launched a Twitterstorm, at times illuminating, at times maddening, with some of the biggest folks in the field, including Bengio’s fellow deep learning pioneer Yann LeCun and one of AI’s deepest thinkers, Judea Pearl.
Here’s the tweet, perhaps forgotten in the storm that followed:
For the record and for comparison, here’s what I had said almost exactly six years earlier, on November 25, 2012, eerily similar,
Deep learning is important work, with immediate practical applications.
Realistically, deep learning is only part of the larger challenge of building intelligent machines. Such techniques lack ways of representing causal relationships (such as between diseases and their symptoms), and are likely to face challenges in acquiring abstract ideas like “sibling” or “identical to.” They have no obvious ways of performing logical inferences, and they are also still a long way from integrating abstract knowledge, such as information about what objects are, what they are for, and how they are typically used. The most powerful A.I. systems … use techniques like deep learning as just one element in a very complicated ensemble of techniques, ranging from the statistical technique of Bayesian inference to deductive reasoning.
I stand by that — which as far as I know (and I could be wrong) is the first place where anybody said that deep learning per se wouldn’t be a panacea, and would instead need to work in a larger context to solve a certain class of problems. Bengio was pretty much saying the same thing.
Some people liked the tweet, some people didn’t. Yann LeCun’s response was deeply negative. In a series of tweets he claimed (falsely) that I hate deep learning, and that because I was not personally an algorithm developer, I had no right to speak critically; for good measure, he said that if I had finally seen the light of deep learning, it was only in the last few days, in the space of our Twitter discussion (also false).
By reflecting on what was and wasn’t said (and what does and doesn’t actually check out) in that debate, and where deep learning continues to struggle, I believe that we can learn a lot.