Michael W. Lucas: Who is Colin Percival, and why should we listen to him?
Colin Percival: To the first question: I'm a visiting researcher at Simon Fraser University in Canada, as well as being a deputy security officer with the FreeBSD project. I started my B.S. in mathematics at age 13 (concurrent with high school) and graduated with a first-class honors degree in 2001; to the extent possible within the confines of an undergraduate degree program, I had an emphasis on number theory and computer algorithms. I then went to Oxford University, where I recently defended my doctoral thesis in computer science.
To the second question: you should listen to me because I have written a 12-page academic paper presenting and discussing a serious security vulnerability, and nobody has been able to refute my results. I believe that my work stands on its own; it doesn't need my name attached to give it credibility.
MWL: I saw your hyperthreading presentation at BSDCan, where you demonstrated weaknesses with the combination of cryptographic software and hyperthreading. Can you explain your work in a way that those folks who aren't cryptographers, and who find that 12-page paper intimidating, can understand it?
CP: Well, this is a bit of a strained analogy, but here goes. Imagine that you work in computer technical support. You're really good at your job, but you can never answer any questions on your own--instead, you have a collection of really good computer manuals spread out on your desk, and every time someone asks you a question, you have to refer to the appropriate manual to work out what the answer is.
Now, because you're so good at your job, you don't just help one customer at once; instead, you have two phones, and you try to help two customers at once, alternating back and forth between them. If the two people you are helping are asking about completely different problems, then you can have both manuals open to the right pages, and this works very well; but if you've got two people who need answers from the same manual, then you have to spend time flipping the pages back and forth, which means that it takes far longer for you to answer each question.
Since the two people you are helping can measure how long it takes you to answer their questions, they can determine if you have to flip pages--in other words, they can determine if the other customer is asking you questions for which you need to refer to the same manual in order to answer.
In my paper, "you" are the CPU, the two people you are helping are two computer programs, and the reference manuals you are using are the main memory of the computer. By measuring how long it takes for the CPU to perform some calculations, a program can work out which parts of the computer's main memory are being used by the other program. Once it knows this ... well, it's hard to explain further without explaining technical details about how encryption works, so let it suffice to say that knowing which memory locations are being accessed while some data is being encrypted or decrypted can very often tell you what the data, or the secret key being used, is.
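The manual-flipping analogy can be made concrete with a toy simulation. The sketch below is not the code from the paper: it assumes a made-up eight-slot shared cache, made-up latencies (1 for a hit, 10 for a miss), and a hypothetical victim whose memory accesses depend directly on its secret bits. All names (ToyCache, victim, spy_recover) are invented for illustration.

```python
# Toy model only: a tiny direct-mapped "cache" shared by two programs.
# ToyCache, victim, and spy_recover are hypothetical names, not real APIs.

NUM_SETS = 8  # assumed cache size; enough for four secret bits


class ToyCache:
    def __init__(self):
        # Which program's data currently occupies each cache set.
        self.owner = [None] * NUM_SETS

    def access(self, program, set_index):
        """Return a simulated latency: 1 for a hit, 10 for a miss."""
        hit = self.owner[set_index] == program
        self.owner[set_index] = program  # the accessing program's data is now cached
        return 1 if hit else 10


def victim(cache, secret_bits):
    """Touch set 2*i + bit for the i-th secret bit (a made-up access pattern)."""
    for i, bit in enumerate(secret_bits):
        cache.access("victim", 2 * i + bit)


def spy_recover(secret_bits):
    cache = ToyCache()
    # Step 1: the spy fills every set with its own data.
    for s in range(NUM_SETS):
        cache.access("spy", s)
    # Step 2: the victim runs and evicts the spy from the sets it touches.
    victim(cache, secret_bits)
    # Step 3: the spy times its own accesses; slow ones mark the victim's sets.
    slow = [s for s in range(NUM_SETS) if cache.access("spy", s) > 1]
    # In this toy layout, an odd set index means the bit was 1.
    return [s % 2 for s in slow]


print(spy_recover([1, 0, 0, 1]))  # prints [1, 0, 0, 1]
```

The spy never reads the victim's data; it only observes which of its own accesses became slow. A real attack has to cope with noise and a far more complicated cache, which this model ignores.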
MWL: A real system has a lot of things going on at the same time, though. To stretch your analogy even further, you don't have two callers on the line--you have dozens, or even hundreds. You were able to recover hundreds of bits from a key in your environment, but it appears, to me at least, to be a bit different from recovering an actual key on an actual system in the real world; even the lowest-use system has a lot of processes scheduled. Isn't this vulnerability largely theoretical?
CP: You're stretching the analogy in the wrong direction. You're right that there may be dozens or even hundreds of programs running on a busy system--or, in our analogy, that many customers phoning for technical support--but what matters is how many programs are running at once. On a hyperthreaded CPU, there are only ever two programs being run at any given time; all the rest are waiting for their turn (again, much like technical support lines!).
All an attacker needs to do is get his code running on the CPU at the same time as the program he is trying to spy on--and he only needs to do this once. If he doesn't succeed on his first attempt--that is, if some other program is selected to run at the same time as his target--then he can just try again. A computer that is running at 90 percent of capacity is a very heavily loaded system, but even there the attacker would only need to make ten attempts on average before he succeeded; and each attempt takes a fraction of a second and is for all practical purposes undetectable.
In short, the vulnerability is entirely practical. In my paper I make some assumptions for the purposes of making it easier to explain and demonstrate the attack, but the difficulties an attacker would encounter in the real world are far from insurmountable.
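The "ten attempts on average" figure follows from simple arithmetic if one assumes, as a simplification, that each attempt independently lands beside the target with probability equal to the spare capacity of the machine. Under that assumption the number of attempts follows a geometric distribution, whose mean is one over the success probability. The function name below is hypothetical.

```python
def expected_attempts(load):
    """Mean number of tries until success, assuming each try independently
    succeeds with probability (1 - load). This is a simplified model."""
    p_success = 1.0 - load
    return 1.0 / p_success  # mean of a geometric distribution


print(round(expected_attempts(0.90), 6))  # prints 10.0
```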
MWL: If I recall correctly, at BSDCan you said that you expected someone with the proper background could write a functional exploit for this vulnerability in only a few days. It's been a few weeks now; have you seen such an exploit yet?
CP: No; but if someone was planning on exploiting this, I'm sure they wouldn't announce it to the world by publishing their code.
MWL: Well, when an exploit comes out I'm sure you'll hear about it, probably from the guy who has to clean up after it.
You had your own ideas about how people could use this. I'm sure that other people have given you even more ideas. What other interesting, surprising, or creative ideas have other people given you for exploiting hyperthreading?
CP: Well, I demonstrated an attack against RSA using hyperthreading; Osvik and Tromer have also performed an attack against AES. (I don't know all the details, since they haven't published their work yet.) Beyond these two cryptographic attacks, it becomes harder to define what constitutes an "exploit"--clearly stealing someone's keys is a problem, but what about watching the pattern of their keystrokes when they type in their password? What about determining which records are being accessed in a database? What about distinguishing between someone running vi and someone running emacs? The way software is designed at the moment more or less guarantees that information will be leaked on a hyperthreaded processor--the only question is how much people care.
The most interesting feedback I've received has not been about new ways of exploiting hyperthreading, but instead has been historical: it seems that several people independently investigated the possibility of information leakage via caches over 20 years ago, but were "discouraged" from further research by the NSA before they obtained any significant results.
MWL: If I'm understanding you correctly, you think the line between an exploit and an interesting-but-unimportant security bug is personal. Obviously we all care when the script kiddies have their point-and-click RSA-smasher, but there are many degrees of gray beneath that level. Someone may consider your HT vulnerability moot, but others will find it very serious indeed. This is something we don't hear enough about in security--it's all relative to your situation.
CP: I wouldn't say that the line between "dangerous" and "interesting but unimportant" is personal, since that implies that there is a subjective element to security analysis. Rather, I'd say that the line between "dangerous" and "interesting but unimportant" depends upon whether you are directly affected. You could think of it as being like an automobile accident--if you never leave home, you don't need to worry about being in an automobile accident, but if you do leave home, it's something that you need to take precautions against (e.g., learning to drive safely).
MWL: At BSDCan, you mentioned that you had learned a lot of new and interesting things about yourself in the interval between announcing the paper and actually releasing it. Someone had posted that you had a vendetta against Intel, and someone else said that you were manipulating the stock price for personal gain. How much of that reaction does a security researcher put up with when he announces his results?
CP: Well, I should start by clarifying here that the rumors were largely my fault, since they resulted from the unusual disclosure schedule for this issue. The time at which I released all the details of the attack was set by the conference schedule--I was giving the first talk of the conference, at 10 a.m. EDT on Friday the 13th--but I wanted to provide everybody with a chance to fix this on Friday regardless of which time zone they happened to be in, so at 8 p.m. EDT the preceding evening I announced that an unspecified problem existed and that I would be releasing the details the following morning.
This window of 14 hours resulted in some rather interesting rumors flying around, since the only evidence people had on which to judge my claims was my reputation together with the FreeBSD security advisory that was sent out. (No other advisories were released until SCO sent out their advisory on the 13th.) Some people claimed that this was all a hoax; others, pointing to the FreeBSD security advisory and the lack of Linux security advisories, said that "obviously, Linux is not affected"; but the most amusing claim I heard came from someone who pointed to my web page where I describe myself as being "unemployed [because] I wanted to spend my time making sure that this issue was properly fixed, rather than earning some money" and concluded that I was "a disgruntled unemployed programmer, trying to make some money by short-selling Intel shares." (Coincidentally, someone did sell 10 million Intel shares at 10:56 a.m. on May 12--but according to reports I've read, this was simply a trading error, where the intended sale had been of 10,000 shares.)
In general, security researchers don't have to put up with very much along these lines. In this particular case, I had to put up with quite a lot of rumors--but I didn't really mind, since they provided some valuable comic relief, and once I published the details of my attack, nearly everyone accepted that it was valid immediately.
MWL: I saw advisories from SCO and from FreeBSD, but I'm sure you contacted other vendors: Microsoft, Sun, Red Hat, SGI, and so on. How did people react; did they take you seriously or blow you off? How did they work with you? I'm especially curious about how Intel reacted to being told that there is a basic problem with their design.
CP: In general, I had no trouble convincing security people to take this problem seriously. As one person put it, "Anything which comes from the FreeBSD Security team immediately has an air of credibility to it," and while I was reporting this problem in my personal capacity as a security researcher, the fact that I am part of the FreeBSD security team certainly helped. Beyond that, the individual reactions were quite different; but rather than addressing each vendor's response individually and in detail, which would take many pages, I'll just give the highlights--in the form of awards for exceptional performances.
The prize for most professional response goes to SCO. I must admit to having been rather surprised by this, in light of the public disagreements between SCO and the free software community, but SCO's response to this issue was really quite superb. Out of all the members of the Linux vendor security list, SCO was the first to request further details after I posted to indicate that there was a problem; they were the first to respond back with detailed and intelligent questions; when I asked for vendor statements, they were the first (and only Linux) vendor to respond; and they published an advisory only a few hours after the embargo on the issue ended.
The prize for most corporate attitude goes to Intel. I had some trouble establishing contact with them in the first place--not that I can assign much blame for this, since Intel, unlike operating system vendors, has not had much experience in dealing with security flaws--but even once I found someone who was willing to talk to me, our conversations were rather less than useful: as a general rule, I would ask questions (e.g., Would it be possible for you to produce a microcode patch as follows ... or How about making the following changes in future processors ...), and the reply would invariably be "I'm sorry, but I'm not allowed to talk about that." Worse, once it became clear that my recommendation--and FreeBSD's response--was going to be to disable hyperthreading by default, Intel shifted completely into damage control mode, discarding all attempts at a reasoned security-centric response in favor of treating this simply as a public relations exercise.
The prize for most personally helpful goes to Mike O'Connor of SGI. As little communication as I had with Intel, I'm sure I would have had even less were it not for Mike's help: when I explained to him the difficulties I was having with Intel, he took advantage of the established channels that SGI had, by virtue of being a large customer, to remind Intel that it was important to talk to people who discover security vulnerabilities.
The prize for least communicative goes to Microsoft. I was very amused recently to read the following in a story on eweek.com:
"We respond immediately to the initial vulnerability report and provide the researcher with contact names, e-mail addresses and phone numbers. We make it clear we want to work closely with the researcher to pinpoint the problem and get it fixed. We commit to providing [researchers] with a progress report on the Microsoft investigation every time they ask for one," [MSRC program manager Stephen Toulouse] said.
My experience with Microsoft was quite the opposite. When I first reported this vulnerability to Microsoft, I was thanked, given a ticket number (5834), and told that it would be handled by "Christopher"--no last name, no phone number, and no direct email address. Later the issue was transferred to "Brian"--but again, no contact information was provided. Despite comments from multiple third parties that Microsoft was "very concerned" and had "several people" working on this issue, Microsoft did not "make it clear they wanted to work closely" with me--in fact, they ignored all my attempts at cooperation. Finally, when I sent emails to Microsoft asking for a progress report, I received no response. Even now, a month after I published the details of this vulnerability, I have received no communication from Microsoft to say if--let alone how--they intend to respond to this issue.
Finally, the head in the sand prize goes to Linus Torvalds. On Monday, May 16, three days after I published all the details of my attack, Linus wrote that he would "be really surprised if somebody is actually able to get a real-world attack on a real-world pgp key usage or similar out of it (and as to the covert channel, nobody cares). It's a fairly interesting approach, but it's certainly neither new nor HT-specific, or necessarily seem all that worrying in real life." I really don't know where to start with this, except perhaps to say that I'm very glad that Linus isn't responsible for keeping my computer secure.
MWL: SCO gave the best response? I'm sure a lot of people will find that surprising. I guess it just demonstrates that the people doing the work aren't the same people that are making policy, and that the vendors who aren't taking it seriously will find out just how real-world this can be.
CP: I think it's a bit more subtle than that. SCO is the heir to Unix, so they've had a lot of time to mature; and their customers are probably highly weighted toward the "upgrade once a decade," hyperconservative server end of the spectrum--which is exactly where this is the most dangerous. Companies tend to adopt the same attitudes as the people who buy their products, so I'm not at all surprised that a company that deals with very conservative server-buying customers had a far better response than a company which deals mostly with security-unaware desktop users.
MWL: To an outsider looking in, it seems that this took a long time to work out. A layman might think that you would just have an idea, hammer out some demo code, mail some vendors, and be done with it. I'm sure it's not that simple. What does a security researcher actually do on a day-to-day basis?
CP: Well, I'm not a very good example of a security researcher, in that respect. Most security researchers--or at least, most people who call themselves security researchers--spend most of their time combing through source code looking for bugs. This is certainly useful, but it doesn't require very much skill, and I suspect that this is a task that will be taken over by computer programs before long, since most security flaws fall into the "stupid mistake" category and are very easy to recognize if you look closely enough; I've been particularly impressed in this respect with results from software produced by Coverity.
As for what I think your real question was--why it took such a long time before I announced the problem--well, it all comes down to lots of details. To start with, when I first realized that this was likely to be a problem, I was in the middle of editing my D.Phil. thesis prior to sending it off to my examiners. While walking to and from college every day--I was living in a house about 2 kilometers outside of Oxford, and without an internet connection--I convinced myself that the problem was probably real, but it wasn't until over a month later, when I went back to Vancouver for Christmas, that I had time to sit down and write some code.
By the end of 2004, I had some working code and I had demonstrated that it could steal enough information to make breaking RSA easy, but this wasn't enough to write to people yet. Extraordinary claims require extraordinary evidence, and while I had the necessary evidence, it was scattered between dozens of files and scraps of paper, and nobody would have the patience to read and understand it all in its current form. Consequently, I started to write my paper, "Cache missing for fun and profit," in order to clearly explain why there was a covert channel between threads executing on the same processor core, how this channel could be exploited as a side channel, and how it was possible to defend against this attack.
I finished a first draft of this at the end of February--after being interrupted partway through by my thesis defense--and started writing to security people to inform them of this vulnerability. For the next two months, my role became less that of a researcher and more that of an educator: while my paper was largely self-explanatory, for nearly every point I made there was at least one person who needed me to provide additional explanation, and there were many things--potential fixes, for example, which I had decided wouldn't actually work--that I didn't mention in my paper but still had to explain to several people.
Toward the end of April, I went through my paper making substantial revisions, based on feedback from the various security teams with which I had been in contact, and then I prepared the patch for FreeBSD--only to end up feverishly rewriting the patch during the FreeBSD developer summit the day before my talk, after I realized that my original patch would have inadvertently ended up disabling dual-core systems.
Of course, while I was doing all of this, daily life continued as usual. As a FreeBSD deputy security officer, I was responsible for dealing with the more common "dumb bug" sort of security issues, so I wrote patches and advisories for half a dozen other security problems (including one I found myself) during the period that I was working on this.
I guess the most important point to realize here is that it's one thing to realize that something might be a problem, and another to write the code to exploit it; but it is quite different, and a lot more work, to liaise with over a dozen vendors to explain what the problem is and how it should be fixed.
MWL: It sounds as if it would have been easier to write an exploit for this than to bring this to vendors correctly as you did!
CP: Quite likely, yes. Of course, considering the wide range of operating systems affected by this, I wouldn't want to distribute exploit code even if I had written it, so the route I took was the only reasonable approach to ensuring that everybody could fix the problem (even if some of them seem to have not bothered).
MWL: You're also a part of the FreeBSD security team. I'm sure you get a fair number of emails from panicked users, false bug reports, "security holes" that are actually the result of incorrect sysadmin practice, and so on. What's it like being on the FreeBSD security team? Are there any choice tidbits you'd care to share from that experience?
CP: Many people have said that war is "months of boredom punctuated by a few minutes of hell"; being on the FreeBSD security team isn't quite that intense, but there is certainly a similarity. Most of the time, there aren't any major problems that need to be dealt with, but we never know when the next big attack is going to happen. Of course, that's only one side of the security work I do with FreeBSD; in addition to handling security issues as they are found, I've spent much of the past two years improving our infrastructure for distributing updates. It doesn't help very much to discover and produce a patch for a security flaw if none of your users actually apply the patch, for example, so I wrote a tool called FreeBSD Update that allows people to securely download and install security updates very easily.
As for choice tidbits: there are inevitably some amusing things that happen, but I'd rather not give specifics; people have confidence that they can write to the security team to tell us about potential problems without having their correspondence discussed outside of the security team, and I don't want to betray that trust.
MWL: Your updating tool sounds promising; would you care to give our readers a brief description and tell us what problems you're trying to solve?
CP: Sure. To start with, you have to understand that FreeBSD is an open source operating system; all of the source code used to build it is available, and anyone who knows what they are doing can take this source code, make changes to it, recompile it, and thereafter run their own personal version of FreeBSD.
This is all great, but it leads to a certain disconnect between the developers--for whom "remove | S_IRGRP | S_IROTH from line 105 of sys/dev/iir/iir_ctrl.c and recompile" makes sense--and normal users, who aren't interested in writing code but instead simply want a system that works. Historically, security issues have been handled in FreeBSD by distributing a source code patch along with a list of instructions for applying the patch and rebuilding; this had the effect that most users would decide that applying the security patches was far too much work, and would instead simply leave their systems insecure. A doctor once remarked to me that it's one thing to diagnose an illness and another to prescribe the correct treatment, but that the really hard part is making sure that a patient actually takes the medicine--the situation is exactly the same with security issues, in that the really hard part is to make sure that users actually keep their systems up to date.
This is where FreeBSD Update comes in. I take the source code patches that fix security problems, and recompile everything; then I check to see which files have changed, package the new files together, and put them online for people to download. Instead of needing to apply the source code patch and recompile everything--which can take well over an hour on a slow machine--users simply run the FreeBSD Update client, and watch as it downloads any necessary security updates for them--very much the same way as Windows Update works.
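The "check to see which files have changed" step can be pictured roughly as follows. This is only a sketch, not FreeBSD Update's actual implementation: it assumes each build is held in memory as a dictionary mapping a file path to its contents, and the function name and sample paths are hypothetical.

```python
import hashlib


def changed_files(old_build, new_build):
    """Return paths whose contents differ, or are new, between two builds.

    Each build is assumed to be a dict mapping a file path to its bytes.
    """
    def digest(data):
        return hashlib.sha256(data).hexdigest()

    return sorted(
        path
        for path, data in new_build.items()
        if path not in old_build or digest(old_build[path]) != digest(data)
    )


# Made-up sample data standing in for two compiled releases.
old = {"/bin/ls": b"ls v1", "/sbin/init": b"init v1"}
new = {"/bin/ls": b"ls v1", "/sbin/init": b"init v2"}
print(changed_files(old, new))  # prints ['/sbin/init']
```

Only the files in the returned list would need to be packaged and offered for download.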
Incidentally, there has been one very interesting spin-off benefit of my work on FreeBSD Update: In order to speed up the downloading of binary security updates, I wrote a simple "delta compression" program, which compares two files and outputs a small "binary patch," which encodes the difference between them. Since people normally have an old, insecure version of the files for which they are downloading security updates, it is usually possible to download this much smaller patch file. Much to my surprise, I discovered that the patches my utility was producing were several times smaller--and thus faster to download--than patches produced by any other available software. Since then, my binary delta compression utility, bsdiff, has been increasingly widely used, most notably in OS X (for distributing software updates) and in the next major release of Mozilla Firefox.
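The idea of a binary patch can be sketched in a few lines. The example below is a deliberately naive delta encoder built on Python's standard difflib module; it is not bsdiff's algorithm, which works differently and produces far smaller patches. The names make_patch and apply_patch and the sample byte strings are hypothetical.

```python
from difflib import SequenceMatcher


def make_patch(old, new):
    """Encode `new` as copies from `old` plus literal data (naive sketch)."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))      # reuse bytes the user already has
        elif tag in ("replace", "insert"):
            ops.append(("data", new[j1:j2]))  # ship only the bytes that changed
        # "delete" needs no entry: those old bytes are simply not copied
    return ops


def apply_patch(old, ops):
    """Rebuild the new file from the old file and the patch operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


old_file = b"FreeBSD 5.4-RELEASE kernel image"
new_file = b"FreeBSD 5.4-RELEASE-p1 kernel image"
patch = make_patch(old_file, new_file)
print(apply_patch(old_file, patch) == new_file)  # prints True
```

Because most of the new file is expressed as references into the old one, only the changed bytes travel over the network.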
MWL: You've provided ideas for people looking to make crypto useless on hyperthreaded machines, created software used by major companies to reduce the cost and complexity of software updates, and nailed down your Ph.D. What's next?
CP: As I think I mentioned before, my attack is just the latest in a long series of side-channel attacks on cryptographic implementations. The unfortunate fact is that the existing libraries of cryptographic functions were either written before side-channel attacks became well known or were written by people who were largely unaware of the problem. Each time that a new attack is published, cryptographic libraries are rewritten to protect against that latest attack, but there has been no attempt made so far to produce a library that is designed from the ground up to be immune to entire classes of side-channel attacks. As I mention in my paper, this can be done by adopting a more restricted model of computation, which prohibits data-dependent branches or memory accesses. Assuming that I can find some funding to support this, I'd like to write such a library.
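To give a flavor of what avoiding data-dependent memory accesses means, here is a minimal sketch of a table lookup that reads every entry and selects the wanted one with arithmetic masks instead of indexing by the secret. It illustrates only the structure of such code: Python makes no timing guarantees, so a real library would be written in a lower-level language. The function name is hypothetical, and the code assumes non-negative table values and indices below 2^63.

```python
def select_without_secret_indexing(table, secret_index):
    """Return table[secret_index] while touching every entry in order.

    Assumes non-negative values and indices below 2**63. Illustrative only;
    Python integers do not provide constant-time behavior.
    """
    result = 0
    for i, value in enumerate(table):
        diff = i ^ secret_index
        # nonzero is 1 when diff != 0 and 0 when diff == 0, with no branch.
        nonzero = ((diff | -diff) >> 63) & 1
        mask = nonzero - 1  # -1 (all ones) on a match, 0 otherwise
        result |= value & mask
    return result


print(select_without_secret_indexing([10, 20, 30, 40], 2))  # prints 30
```

The sequence of entries read is the same for every secret index, so an observer watching memory accesses learns nothing about which entry was wanted.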
In the longer term ... well, when Guy Consolmagno was appointed the Vatican astronomer, he was given three words of instruction: "Do good science." That's the sort of job I'd love to have: one with no strings attached, nobody telling me what research I should or should not be doing, and nobody telling me what I can or cannot publish--just absolute freedom to do research. Realistically, I suspect that the closest I'm ever likely to get to that ideal is if I get a tenured appointment at a university, so that's where I'm currently hoping to end up; but wherever I go, I intend to continue doing research. The problems I do research into are often inspired by practical issues, as with my delta compression work, so it is quite likely that I'll continue to produce useful software; but that is simply a side benefit. My real interest is the research.
MWL: Good luck, and thanks for your time.
Michael W. Lucas
Copyright © 2009 O'Reilly Media, Inc.