16 November 2017
There's been a bit of a furor recently over what Facebook calls its "Non-Consensual Intimate Image Pilot". Is it a good idea? Does it cause more harm than good?
There's no doubt whatsoever that "revenge porn"—intimate images uploaded against the will of the subject, by an unhappy former partner—is a serious problem. Uploading such images to Facebook is often worse than uploading them to the web, because they're more likely to be seen by friends and family of the victim, multiplying the embarrassment. I thus applaud Facebook for trying to do something about the problem. However, I have some concerns and questions about the design as described thus far. This is a pilot; I hope there will be more information and perhaps some changes going forward.
My evaluation criterion is very simple: will this scheme help more than harm? I'm not asking how effective the scheme is; any improvement is better than none. I'm not asking if Facebook is doing this because they really care, or because of external pressure, or because they fear people leaving their platforms if the problem isn't addressed. Those are internal questions; Facebook as a corporation is more competent to evaluate those issues than I am.
There are two obvious limitations that I'm very specifically not commenting on: first, that Facebook is only protecting images posted on one of their platforms, rather than scouring the web; second, that the victim has to have a copy of the images in question. Handling those two cases as well would be nice—but they're not doing it, and I will not comment here on why or why not, or on whether they should.
I should also note that I have a great deal of respect for Facebook's technical prowess. It is somewhere between quite possible and very probable that they've already considered and rejected some of my suggestions, simply because they don't work well enough. More transparency on these aspects would be welcome, if only to dispel people's doubts.
The process, as described, involves the following steps. My comments on each step are indented and in italics.
- A person concerned about some images fills out a form on an (Australian) government web site.
It is unclear to me why the original notification has to be to a government office. There may be legal reasons for this; some commentators have noted that if an underage person submits an explicit picture of themself, it could technically be treated as transmission of child porn.
- That person then sends themselves the image via Messenger.
Why submit the image via a message to oneself? An explicit button in their apps or website would seem simpler and less subject to error. (Have you ever sent an email or text message to the wrong person by mistake? I have.)
- The government office notifies Facebook of the submission.
- A Facebook employee reviews the image and "hashes" it.
The human review issue has drawn the most comments. Is such viewing itself exploitive?
Human intervention is, I fear, almost certainly necessary. The obvious reason is to prevent someone from submitting other images, e.g., images of their ex's current partner. But what are the criteria Facebook is using? Does the image have to be verifiably of the submitter? How will Facebook determine that? Face-matching, whether automated or human, is decidedly imperfect. Furthermore, there can certainly be sensitive, intimate images that do not contain someone's face—think of a distinctive tattoo in a private place.
It may be possible and desirable to split the vetting process into two steps: confirmation of identity and confirmation of subject matter. The trick is facial identification. Recognizing the presence of a face is off-the-shelf technology at this point; many cameras do it. (I have a camera that can be configured to take a picture only when the subject smiles!) So: isolate the faces, black out the rest of the image, and use that for identity verification. A second person could verify subject matter, but with the faces obscured.
Yes, this process is imperfect. Match failures or a request by the submitter could result in human review of the entire image.
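The blackout step I'm suggesting is simple to implement once a face detector has produced bounding boxes. Here's a minimal sketch, assuming the boxes come from some off-the-shelf detector (not shown) and treating the image as a plain 2D grid of pixel values; the function name and representation are my own illustration, not anything Facebook has described:

```python
# Sketch: given face bounding boxes from an off-the-shelf detector,
# produce a copy of the image with everything EXCEPT the faces blacked
# out, suitable for the identity-verification reviewer.

def mask_all_but_faces(image, face_boxes):
    """image: list of rows of pixel values;
    face_boxes: (top, left, bottom, right) tuples from a face detector."""
    height, width = len(image), len(image[0])
    masked = [[0] * width for _ in range(height)]  # start fully blacked out
    for top, left, bottom, right in face_boxes:
        for r in range(max(0, top), min(height, bottom)):
            for c in range(max(0, left), min(width, right)):
                masked[r][c] = image[r][c]  # copy only face pixels through
    return masked
```

The second reviewer, checking subject matter, would see the complementary image: the same loop, but with the face regions zeroed instead.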
On a separate note, some people may want the ability to specify the gender of the reviewer. (Yes, I know that gender is non-binary and otherwise complicated.)
- The hash, not the image, is stored. All images posted to any Facebook platform are matched against the database of hashed images; if there's a match, the image cannot be posted.
It is good that Facebook does not want to store images. They're excellent at security, but probably not perfect, and such a database would be a very attractive target for some people.
The "hash", by the way, is not a standard cryptographic hash, which would only work for exact matches. Facebook is using something resilient against cropping, rescaling, etc.
- The original complainant is notified of the completion of this via "the secure email they provided to the eSafety Commissioner's office" and advised to delete the submitted image from Messenger. After that, Facebook will delete it from its servers.
"Secure email"? What's that? For most people, there is no such thing. Furthermore, former parters often know or can guess an email account password. I wonder if the submission form urges people to create a new account for just this reason.
The part that concerns me the most is the image submission process. I'm extremely concerned about new phishing scams. How will people react to email messages touting the "new, one-step, image submission site", one that handles all social networks and not just Facebook? The two-step process here—a web site plus an unusual action on Facebook—would seem to exacerbate this risk; people could be lured to a fake website for either step. The experience with the US government-mandated portal for free annual credit reports doesn't reassure me; there are numerous scam versions of the real site. A single-button submission portal would, I suspect, be better. Does Facebook have evidence to the contrary? What do they plan to do about this problem?
There has been criticism of the need for an upload process. Some have suggested doing the hashing on the submitter's device. Facebook has responded that if the hashing algorithm were public, people would figure out ways around it. I'm not entirely convinced. For example, it's been a principle of cryptographic design since 1883 that "There must be no need to keep the system secret, and it must be able to fall into enemy hands without inconvenience."
However… It may very well be that Facebook's hash algorithm does not meet Kerckhoffs's principle, as it is known, but that they don't know how to do better. Fair enough—but at some point, it's not unlikely that the algorithm will leak, or that people will use trial-and-error to find something that will get through. However, under my evaluation criterion—is this initiative better than nothing?—Facebook has taken the right approach. If the algorithm leaks or if people work around it, we're no worse off than we are today. In the meantime, keeping it secret delays that, and if Facebook is indeed capable of protecting the images for the short time they're on their servers (and they probably are) there is no serious incremental risk.
Another suggestion is to delay the human verification step, to do it if and only if there's a hash match. While there's a certain attractiveness to the notion, I'm not convinced that it would work well. For one thing, it would require near-realtime review, to avoid delays in handling a hash match. I also wonder how many submitted images won't be matched—I suspect that most people will be very reluctant to share their own intimate images unless they're pretty sure that someone is going to abuse them by uploading such pictures. By definition, these are very personal, sensitive pictures, and people will not want to submit them to Facebook in the absence of some very real threat.
My overall verdict is guarded approval. Answers to a few questions would help:
- What is the point of the clumsy, multi-step submission process?
- What measures are taken to protect privacy against the human reviewers?
- What are the criteria used by the human reviewers? Does the submitted image have to be verifiably of the submitter? How is this verified if not enough of a face is showing?
- How does Facebook plan to prevent phishing attacks against this scheme?
But I'm glad that someone is finally trying to do something about this problem!
Update: I'm informed that the pilot is restricted to people over 18, thus obviating any concerns about transmission of child pornography.
27 October 2017
I'm currently reading Liza Mundy's Code Girls, a book about the role that American women played in World War II cryptanalysis. (By coincidence, it came out around the same time as The Woman Who Smashed Codes, a biography of Elizebeth Friedman, one of the greatest cryptanalysts in history.) Mundy notes that the attack on Japan's PURPLE machine was aided by a design feature: PURPLE encrypted 20 letters separately from 6 other letters. But why should the machine have been designed that way?
PURPLE, it turns out, was a descendant of RED, which had the same 20/6 split. In RED, though, the 6 letters were the vowels; the ciphertext thus preserved the consonant versus vowel difference from the plaintext. But why was that a desirable goal?
The answer was economy. Telegraph companies of the time charged by the word—but what is a "word"? Is ATOY a word? Two words? What about "GROUP LEADER"? In English, that's two words, but the German "GRUPPENFÜHRER" is one word. Could an English speaker write "GROUPLEADER" instead?
The exact rules were a subject of much debate and were codified into international regulations. One rule that was adopted was to permit artificial words if they were pronounceable, which in turn was instantiated as a minimum density of vowels. So, to save money, the Japanese cryptologists designed RED to keep the (high) vowel density of Japanese as rendered in Romaji.
These rules were hotly debated. One bitter opponent of any such rules was William Friedman, himself a great cryptanalyst (and the husband of Elizebeth) and the administrative head of the US Army group that eventually broke PURPLE.
So: if Friedman's 1927 advice had been followed, RED would not have treated vowels differently, PURPLE wouldn't have had the 20/6 split, and Friedman's group might have been denied its greatest triumph.
16 October 2017
I don't normally blog twice in one day (these days, I'm lucky to post twice in one month), but a nasty thought happened to occur to me, one that's worth sharing. (Thinking nasty thoughts is either an occupational hazard or an occupational fringe benefit for security people—your call…)
I, along with many others, noted that the KRACK flaw in WiFi encryption is a local matter only; the attacker has to be within about 100 meters of the target. That's not quite correct. The attacking computer has to be close; the attacker can be anywhere.
I'm here at home in a Manhattan apartment, typing on a computer connected by wired Ethernet. The computer is, of course, WiFi-capable; if I turn on WiFi, it sees 28 other WiFi networks, all but two of which use WPA2. (The other two are wide open guest networks…) Suppose someone hacked into my computer. They could activate my computer's WiFi interface and use KRACK to go after my neighbors' nets. Better yet, suppose I'm on a low-security wired net at work but am within range of a high-security wireless network.
I'm not certain how serious this is in practice; it depends on the proximity of vulnerable wired computers to interesting WiFi networks. Wired networks are no longer very common in people's houses and apartments, but of course they're the norm in enterprises. If you're a sysadmin for a corporation with that sort of setup, KRACK may be very serious indeed.
16 October 2017
If you work in computer security, your Twitter feed and/or Inbox has just exploded with stories about not just one but two new holes in cryptographic protocols. One affects WiFi; the other affects RSA key pair generation by certain chips. How serious are these? I'm not going to go through the technical details. For KRACK, Matthew Green did an excellent blog post; for the other, full details are not yet available. There are also good articles on each of them. What's more interesting are the implications.
As I've said before about crypto, don't panic. Encryption flaws are sexy and get academics very excited, but they're rarely particularly serious for most people. That's very true here. In fact, at a guess, the more widespread problem, the WiFi flaw, will have fewer serious consequences than the RSA problem.
The reason that crypto issues are not in general very serious is that someone who wishes to exploit them needs both the flaw and access—and access is rarely easy. For this new WiFi attack, remember that the range of WiFi is about 100 meters; this is not something that the attackers can do over the Internet. (Yes, with a good, directional antenna you can manage about a kilometer. That's still not much, and since the attack depends on sending a packet to the target machine you need very precise aim at someone's phone or computer.)
There's a really important public policy angle to this, though. We're hearing lots of calls for "exceptional access", a mechanism for lawful government access to encrypted content. I and my colleagues have long warned that this is dangerous because cryptographic protocols are very subtle. In retrospect, this new flaw is blindingly obvious—very bad things happen if you replay message 3 of a 4 message sequence—but it took 13 years for it to be noticed, in a protocol that is used by literally billions of devices. (Btw—by "blindingly obvious" I'm not insulting the discoverer, Mathy Vanhoef. He did wonderful work finding it when no one else had, by asking himself, "I wonder what happens if….".) Oh yes—the protocol was mathematically proven correct—but the proof didn't cover what the attack actually does.
Cryptographic protocols are hard.
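The danger of replaying that handshake message is that it reinstalls the key and resets the nonce, and a stream cipher must never reuse a (key, nonce) pair. A toy illustration of why, using SHA-256 as a stand-in keystream generator (this is not the real WPA2 construction, just the general principle):

```python
# Toy stream cipher: keystream = SHA-256(key || nonce || counter).
# If a key reinstallation resets the nonce, two messages get the SAME
# keystream, and XORing their ciphertexts cancels it out entirely.
import hashlib

def keystream(key, nonce, length):
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key, nonce, plaintext):
    ks = keystream(key, nonce, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, ks))

c1 = encrypt(b"key", b"nonce", b"attack at dawn")
c2 = encrypt(b"key", b"nonce", b"retreat now!!!")
# c1 XOR c2 == plaintext1 XOR plaintext2: the attacker learns the XOR of
# the two messages without ever knowing the key.
leak = bytes(a ^ b for a, b in zip(c1, c2))
```

Given the XOR of two plaintexts, ordinary frequency analysis and known-plaintext guessing usually recover both, which is why nonce reuse is treated as a total break.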
So who is affected by this, and what should you do?
The problem is on the client side; WiFi access points are not affected.

You need to install software updates on every one of your WiFi devices. Apparently, iOS and Windows are not as seriously affected, because they didn't completely follow the (flawed!) spec. Android phones are vulnerable, and are less likely to be updatable. Internet of Things devices are the most at risk, given their poor history of being updated.
Again, though, most consumers are not at risk. Businesses are, and ones with many devices, e.g., credit card readers, connected by WiFi have a lot of scrambling to do.
The other flaw appears to be more academically interesting and—for some of those affected—far more serious. Briefly, in the RSA encryption algorithm one has to generate a "public key"; this key is (in part) the product of two large, random primes. We normally write this as
n = pq

Normally, n is public; however, p and q must be kept secret.
The problem seems to be in the way p and q were generated. Normally, you generate large, random numbers and test for primality. It appears that the code library used with a particular chip had something wrong with the process for generating primes, resulting in an n that is easy to factor into its constituent p and q. Interestingly, it's possible to detect these weak values of n very cheaply and easily, without trying to factor them.
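For reference, here is a sketch of the normal process: pick random odd candidates of the right size and run a probabilistic primality test (Miller-Rabin, the standard choice) until one passes. The flawed library reportedly deviated from this, producing primes with special structure; the code below only shows what the conventional approach looks like.

```python
# Conventional RSA prime generation: random candidates + Miller-Rabin.
import random

def is_probable_prime(n, rounds=40):
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a witnesses that n is composite
    return True

def random_prime(bits):
    while True:
        # Force the top bit (right size) and the low bit (odd).
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate
```

Done this way, the primes carry no exploitable structure; the weakness in the affected chips came from shortcuts in exactly this step.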
So—who is affected by this bug? First, remember the access issue. An attacker needs access to your encrypted traffic or encrypted device. That's not easy. Furthermore, if you used 2048-bit keys—and that's been standard for a fair number of years—the attack isn't cheap. On a 1000-core Amazon cloud, it would take 17 days and cost more than US$40,000. Translation: it isn't an attack that can be done casually or against bulk traffic. It's a targeted attack that can be launched only by a well-resourced adversary, and only against a high-value target.
But there is one serious cause for concern. If you have email encrypted with one of these flawed keys, or if you have an electronic document signed with one, someone can attack it in the future—and that $40K cost and 17-day time will only drop.
Update: According to later information, both the access point and the clients must be patched. This is more serious, since many access points are abandonware.
5 October 2017
I wrote an essay on why we can't easily replace social security numbers. You can find it here.
20 September 2017
The other day, I noted that Equifax had been breached in March, and quoted the article as saying that the attackers had been "the same intruders" as in the May breach. In a newer news report, Equifax has denied that:
"The March event reported by Bloomberg is not related to the criminal hacking that was discovered on 29 July," Equifax's statement continues. "Mandiant has investigated both events and found no evidence that these two separate events or the attackers were related. The criminal hacking that was discovered on 29 July did not affect the customer databases hosted by the Equifax business unit that was the subject of the March event."So: I'll withdraw the speculation I posted about this incident confirming one of my hypotheses and wait for further, authoritative information. I repeat my call for public investigations of incidents of this scale.
Also worth noting: Brian Krebs was one of the very few to report the March incident.