How to identify and fix speculative execution bugs like Spectre and Meltdown has been a burning topic among microprocessor buffs this year. At Hot Chips, one of the industry’s premier academic conferences on microprocessors, experts agreed that the ultimate fix may require, yes, a lot more talk.
At a panel Monday at the Cupertino, California, event, Professor Mark Hill of the University of Wisconsin, Madison, was asked to think about the implications of speculative execution side-channel attacks on modern microprocessors like those made by ARM, Intel, and others. His solutions included specialized cores, flushing caches on context switches, and business ideas like charging more for exclusive virtual machines.
But the real answer, he and several other panelists said, is more collaboration between hardware and software designers—and maybe a complete redesign of today’s microprocessors.
How the entire chip industry was blindsided
Meltdown and Spectre were revealed unexpectedly in late 2017, shortly before the vulnerabilities were due to be formally and quietly disclosed during CES in January 2018. Originally discovered by Google’s “zero-day” investigative team, Google Project Zero, the attacks take advantage of a property of modern microprocessors, speculative execution, where the processor essentially “guesses” which instruction branch to take and executes it ahead of time. (Paul Turner, an engineer and lead on Google’s kernel team who was on the panel, said that Project Zero didn’t give the others at Google a heads-up; they found out just like everyone else.)
For 20 years, microprocessor designers assumed that the results of a bad “guess” were simply discarded without any security risk. They were wrong, as the side-channel attacks proved.
In practical terms, it means one browser tab could view the contents of another, or one virtual machine could peer into another. That prompted CPU vendors like Intel, along with Microsoft, to issue software “mitigations,” or patches. Installing them is the most effective way to protect your PC from Spectre, Meltdown, or any of the followup attacks, like Foreshadow.
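The leak that a discarded “guess” leaves behind can be illustrated with a toy model. The sketch below is a simulation under stated assumptions, not an exploit: the `ToyCPU` class, the `MEMORY` layout, and the secret value are all hypothetical, and a real attack infers the cached line by timing memory loads rather than by inspecting a set.

```python
# Toy model only: real attacks time actual cache hits and misses in hardware.
SECRET = 42                            # hypothetical value that must stay private
PUBLIC_SIZE = 16                       # indices 0..15 are legitimately readable
MEMORY = list(range(PUBLIC_SIZE)) + [SECRET]   # index 16 is out of bounds


class ToyCPU:
    def __init__(self):
        # Which probe "cache lines" are currently cached.
        self.cached_lines = set()

    def victim_read(self, index):
        """Bounds-checked read in which the CPU guesses the check will pass."""
        # Speculation: the body runs before the bounds check resolves.
        speculative_value = MEMORY[index]
        # Side effect: a probe line selected by the value is pulled into the cache.
        self.cached_lines.add(speculative_value)
        # The check resolves. A bad guess is rolled back architecturally,
        # but the cache line loaded above stays cached.
        if index < PUBLIC_SIZE:
            return speculative_value
        return None


def attacker_recover(cpu):
    """Report which probe lines are cached (done by timing loads in hardware)."""
    return sorted(cpu.cached_lines)


cpu = ToyCPU()
print(cpu.victim_read(16))      # None: the program never "sees" the secret
print(attacker_recover(cpu))    # [42]: the cache state reveals it anyway
```

The point of the model is that the architectural result and the cache state diverge: the read returns nothing, yet the footprint of the discarded work is still observable.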
Fortunately, teasing that information out takes time—in some cases, a lot of it. NetSpectre, which can exploit one of the Spectre vulnerabilities remotely, can be used to break in via the cloud or a remote machine. On one hand, the resulting data leak can be as slow as 1 bit per minute, according to panelist John Hennessy, the famous microprocessor designer and now chairman of Alphabet. On the other, the average time between when a server is remotely penetrated and when that intrusion is discovered is 100 days, he added—giving a vulnerability like Spectre lots of time to work.
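Hennessy’s two figures can be multiplied out to see why a slow leak still matters. This is a back-of-the-envelope sketch that assumes a constant 1 bit per minute sustained for the full 100 days; the 256-bit key size is an illustrative assumption, not a figure from the panel.

```python
BITS_PER_MINUTE = 1    # slow end of the leak rate cited by Hennessy
DWELL_DAYS = 100       # average time before an intrusion is discovered
KEY_BITS = 256         # illustrative key size (assumption, not from the panel)


def leaked_bits(bits_per_minute, days):
    """Total bits leaked at a constant rate over the given number of days."""
    return bits_per_minute * 60 * 24 * days


total_bits = leaked_bits(BITS_PER_MINUTE, DWELL_DAYS)
print(total_bits)               # 144000 bits
print(total_bits // 8)          # 18000 bytes
print(total_bits // KEY_BITS)   # 562 keys' worth of bits
```

Even at the slowest rate, an undetected intrusion of average length would have time to extract roughly 18 KB, far more than a handful of cryptographic keys require.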
Intel’s next-generation processors probably won’t totally fix the first Spectre variant, Hennessy said, even though Intel’s planned hardware mitigations will start appearing this fall with Cascade Lake, a new Xeon processor.
Patch, or do-over?
ARM, Intel, AMD, and others in the industry can fix the problem through mitigations in the short term, Hill added. But more fundamental changes may need to be made to eliminate the problem altogether, he said.
“The long-run question is how do we define this right so that we potentially eliminate the problem,” Hill said. “Or are we forced to make it like a crime thing that we’re always mitigating.”
Speculative execution was one of the ways that the microprocessor, and by extension, the PC industry, achieved record sales, noted panelist Jon Masters, a computer architect at Red Hat. But speculation was treated as a “magic black box,” he said, without proper questioning by users or customers. That genie’s out of the bottle, too. Removing speculation and the processor caches that it leverages would lower performance twenty-fold, Hill said.
Hill’s suggested solutions included isolating the branch prediction element, adding randomization, and implementing better hardware protections. Adding slower, safer execution modes by turning off speculation could be one solution; another would be to split an execution engine between “fast cores” and “safe cores.” He also suggested business solutions including charging more for virtual machines: instead of sharing hardware resources among more than one VM, a cloud provider could provide exclusive access. Finally, Hill noted that Spectre-style attacks could also lead to a resurgence of accelerators: fixed-purpose logic that is optimized for a single task, and doesn’t rely on speculation.
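Two of Hill’s ideas, a safe mode that turns off speculation and flushing caches on context switches, can be compared in a toy model. This is only a sketch: `ToyCore`, its `speculate` and `flush_on_switch` knobs, and the memory contents are hypothetical names invented for illustration, and real hardware exposes nothing this simple.

```python
class ToyCore:
    """Toy core; `speculate` and `flush_on_switch` are hypothetical knobs."""

    def __init__(self, speculate=True, flush_on_switch=False):
        self.speculate = speculate
        self.flush_on_switch = flush_on_switch
        self.cache = set()

    def load(self, memory, index, limit):
        in_bounds = index < limit
        # A speculating ("fast") core touches the cache before the bounds
        # check resolves; a non-speculating ("safe") core waits for the check.
        if self.speculate or in_bounds:
            self.cache.add(memory[index])
        return memory[index] if in_bounds else None

    def context_switch(self):
        # Optional mitigation: discard all cache state between contexts.
        if self.flush_on_switch:
            self.cache.clear()


def leaked_lines(core):
    """Run a victim, switch context, and report what an attacker could observe."""
    memory = [0, 1, 2, 3, 99]        # index 4 holds a made-up secret
    core.load(memory, 4, limit=4)    # victim runs with an out-of-bounds index
    core.context_switch()            # attacker is scheduled next
    return sorted(core.cache)


print(leaked_lines(ToyCore()))                      # [99]: fast core leaks
print(leaked_lines(ToyCore(speculate=False)))       # []: safe core, no footprint
print(leaked_lines(ToyCore(flush_on_switch=True)))  # []: footprint flushed
```

Both mitigations close the leak in the model, and both cost performance in practice: the safe core gives up the speedup from speculation, and the flushing core starts every context with a cold cache.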
The fundamental solution to the problem, though, would be a ground-up reworking of the architectural definition, Hill said. A computer architecture is the way in which a processor executes the software instruction set, with arithmetic units, floating-point units, and more—and today’s chips were designed to conform to the needs of the original model. But if the basic architectural model is fundamentally flawed, he said, it may be time for a new one. In other words, Spectre and Meltdown aren’t bugs—just flaws in the design of all modern chips—and a new model may be needed.
What the panel ultimately decided upon, though, was the simple truth that hardware needs to be designed with software in mind, and vice versa—and both sides need to become more versed in security.
“What often happens is that hardware designers go and build some great hardware, and then we stop talking about it, or software folks say, ah, that’s hardware—I don’t care about it. We have to stop doing that,” Masters said.
Updated with additional detail at 9:39 AM.