What Must We Trust?

16 February 2015

My Twitter feed has exploded with the release of the Kaspersky report on the "Equation Group", an entity behind a very advanced family of malware. (Naturally, everyone is blaming the NSA. I don’t know who wrote that code, so I’ll just say it was beings from the Andromeda galaxy.)

The Equation Group has used a variety of advanced techniques, including injecting malware into disk drive firmware, planting attack code on "photo" CDs sent to conference attendees, encrypting payloads using details specific to particular target machines as the keys (which in turn implies prior knowledge of these machines’ configurations), and more. There are all sorts of implications of this report, including the policy question of whether or not the Andromedans should have risked their commercial market by doing such things. For now, though, I want to discuss one particular, deep technical question: what should a conceptual security architecture look like?
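(To make the key-derivation point concrete, here is a minimal sketch, in Python with made-up machine attributes, of how a payload key might be derived from a target's configuration so that the payload only unlocks on the intended machine. It illustrates the general idea only, not the Equation Group's actual scheme.)

    import hashlib

    def derive_key(machine_facts, rounds=10000):
        # Derive a 32-byte key by iterated hashing of machine-specific strings
        # (hostname, volume serial, installed-software paths, and so on).
        digest = "\x00".join(machine_facts).encode("utf-8")
        for _ in range(rounds):
            digest = hashlib.sha256(digest).digest()
        return digest

    def payload_unlocks(machine_facts, validator):
        # The attacker ships only a hash of the key (the "validator"); the key
        # itself is recoverable only on a machine whose configuration matches,
        # which is why the trick implies prior knowledge of that configuration.
        return hashlib.sha256(derive_key(machine_facts)).digest() == validator

    # Attacker side: precompute the validator from the known target configuration.
    target_facts = ["HOSTNAME=target-ws-07", "VOLSERIAL=1A2B-3C4D"]   # hypothetical values
    validator = hashlib.sha256(derive_key(target_facts)).digest()

    # Victim side: only the matching machine reproduces the key.
    print(payload_unlocks(["HOSTNAME=target-ws-07", "VOLSERIAL=1A2B-3C4D"], validator))   # True
    print(payload_unlocks(["HOSTNAME=some-other-box", "VOLSERIAL=0000-0000"], validator)) # False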

For more than 50 years, all computer security has been based on the separation between the trusted portion and the untrusted portion of the system. Once it was "kernel" (or "supervisor") versus "user" mode, on a single computer. The Orange Book recognized that the concept had to be broader, since there were all sorts of files executed or relied on by privileged portions of the system. Its newer, larger category was dubbed the "Trusted Computing Base" (TCB). When networking came along, we adopted firewalls; the TCB still existed on single computers, but we trusted "inside" computers and networks more than external ones.

There was a danger sign there, though few people recognized it: our networked systems depended on other systems for critical files. In a workstation environment, for example, the file server was crucial, but as an entity it wasn’t seen as part of the TCB. It should have been. (I used to refer to our network of Sun workstations as a single multiprocessor with a long, thin, yellow backplane—and if you’re old enough to know what a backplane was back then, you’re old enough to know why I said "yellow"…) The 1988 Internet Worm spread with very little use of privileged code; it was primarily a user-level phenomenon. The concept of the TCB didn’t seem particularly relevant. (Should sendmail have been considered part of the TCB? It ran as root, so technically it was, but very little of it actually needed root privileges. That it had privileges was more a sign of poor modularization than of an inherent need for a mailer to be fully trusted.)
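(The sendmail observation generalizes: the only step in a mailer that genuinely needs root is binding the low-numbered SMTP port; everything after that can run as an unprivileged user. Here is a minimal, Unix-only sketch in Python—hypothetical, and not sendmail’s actual structure:)

    import os
    import pwd
    import socket

    def serve_smtp_unprivileged(user="nobody"):
        # Must be started as root, but only long enough to bind port 25.
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(("0.0.0.0", 25))      # the one step that genuinely needs root
        s.listen(5)

        pw = pwd.getpwnam(user)
        os.setgid(pw.pw_gid)         # drop group privileges first...
        os.setuid(pw.pw_uid)         # ...then user privileges, irrevocably
        # Everything from here on -- parsing, queueing, delivery logic -- runs
        # without root, so a bug in it is not a root compromise.
        return s

    if __name__ == "__main__":
        listener = serve_smtp_unprivileged()
        print("listening on port 25 as uid", os.getuid())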

The National Academies report Trust in Cyberspace recognized that the old TCB concept no longer made sense. (Disclaimer: I was on the committee.) Too many threats, such as Word macro viruses, lived purely at user level. Obviously, one could have arbitrarily classified word processors, spreadsheets, etc., as part of the TCB, but that would have been worse than useless; these things were too large and had no need for privileges.

In the 15+ years since then, no satisfactory replacement for the TCB model has been proposed. In retrospect, the concept was not very satisfactory even when the Orange Book was new. The compiler, for example, had to be trusted, even though it was too huge to be trustworthy. (The manual page for gcc is itself about 90,000 words, almost as long as a short novel—and that’s just the man page; the code base is far larger.) The limitations have become painfully clear in recent years, with attacks demonstrated against the embedded computers in batteries, webcams, USB devices, IPMI controllers, and now disk drives. We no longer have a simple concentric trust model of firewall, TCB, kernel, firmware, hardware. Do we have to trust something? What? Where do we get these trusted objects from? How do we assure ourselves that they haven’t been tampered with?
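(Why does the compiler have to be trusted? Because it sits between the source you can audit and the binary you actually run; Ken Thompson’s "Reflections on Trusting Trust" is the classic statement of the problem. Here is a toy sketch in Python—purely illustrative, nothing like a real compiler—of a subverted "compiler" pass that turns a correct login check into a backdoored one:)

    import re

    def toy_compile(source):
        # A deliberately subverted "compiler" pass: wherever it sees a password
        # comparison, it emits code that also accepts a hard-wired value.  The
        # mechanism is trivial; the point is the trust relationship: a correct
        # source program still yields a compromised binary.
        return re.sub(
            r'strcmp\(entered,\s*stored\)\s*==\s*0',
            '(strcmp(entered, stored) == 0 || strcmp(entered, "backdoor") == 0)',
            source,
        )

    login_source = 'if (strcmp(entered, stored) == 0) { grant_access(); }'
    print(toy_compile(login_source))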

I’m not looking for concrete answers right now. (Some of the work in secure multiparty computation suggests that we need not trust anything, if we’re willing to accept a very significant performance penalty.) Rather, I want to know how to think about the problem. Other than the now-conceptual term TCB, which has been redefined as "that stuff we have to trust, even if we don’t know what it is", we don’t even have the right words. Is there still such a thing? If so, how do we define it when we no longer recognize the perimeter of even a single computer? If not, what should replace it? We can’t make our systems Andromedan-proof if we don’t know what we need to protect against them.
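(For a flavor of the multiparty-computation remark, here is a minimal sketch, in Python with toy parameters, of additive secret sharing: the parties jointly compute the sum of their inputs, yet no single party ever sees anyone else’s raw value. Real protocols add multiplication, defenses against malicious parties, and many network rounds, which is where the performance penalty comes in.)

    import secrets

    MODULUS = 2**61 - 1   # a prime; all arithmetic is done modulo this value

    def share(value, n):
        # Split a private value into n additive shares; any n-1 of them are
        # uniformly random and reveal nothing about the value.
        shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
        shares.append((value - sum(shares)) % MODULUS)
        return shares

    def mpc_sum(private_inputs):
        # Each party shares its input among all the parties; each party adds
        # up the shares it holds; only those per-party subtotals are published.
        # No party ever sees another party's raw input.
        n = len(private_inputs)
        all_shares = [share(v, n) for v in private_inputs]
        subtotals = [sum(row[j] for row in all_shares) % MODULUS for j in range(n)]
        return sum(subtotals) % MODULUS

    print(mpc_sum([10, 20, 12]))   # 42, computed without revealing any single input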

https://www.cs.columbia.edu/~smb/blog/2015-02/2015-02-16.html