There’s a bit of a debate going on about whether the Kaseya attack exploited a 0-day vulnerability. While that’s an interesting question when discussing, say, patch management strategies, I think it’s less important to understanding attackers’ thinking than target selection. In a nutshell, when it comes to target selection, the attackers have outmaneuvered defenders for almost 30 years.
In early 1994, CERT announced that attackers were planting network eavesdropping tools—"sniffers"—on SunOS hosts, and were using these to collect passwords. What was known in the security community but not mentioned in the CERT advisory was how strategically these sniffers were placed: on major Ethernet segments run by ISPs. Back then, in the era before switched Ethernet was the norm, an Ethernet network was a single domain; every host could see every packet. That was great for network monitoring—and great for surreptitious eavesdropping. All security people understood both halves of that equation, but the attackers realized that backbone links offered far more opportunities to collect useful passwords than ordinary sites’ networks.
Fast-forward a decade to 2003, when the Sobig.f virus made its appearance. In the words of a retrospective look, "[Tt]he whole Sobig family was incredibly significant because that was the point where spam and viruses converged." Again, that hacked computers could be used for profit wasn’t a new idea, but few defenders realized until too late that the transition had taken place in the real world.
A few years later, credit card payment processors were hit, most famously Heartland Payment Systems. This is an industry segment most people didn’t realize existed—weren’t charges and payments simply handled by the banks? But the attackers knew, and went after such companies.
The current cast of malware operators are again ahead of the game, by going after cyber-infrastructure companies such as SolarWinds and Kaseya. This gives them leverage: go after one company; penetrate thousands. The payoff might be intelligence, as in the SolarWinds hack, or it might be financial, as with Kaseya. And the ransomware perpetrators are apparently being very strategic:
Cyber-liability insurance comes with a catch, however: It may make you more vulnerable to a ransomware attack. When cyber-criminals target cyber-insurance companies, they then have access to a list of their insured clients, which cyber-criminals can then use to their advantage to demand a ransom payment that mirrors the limit of a company’s coverage.Attackers even exploit timing:
DarkSide, which emerged last August, epitomized this new breed. It chose targets based on a careful financial analysis or information gleaned from corporate emails. For instance, it attacked one of Tantleff’s clients during a week when the hackers knew the company would be vulnerable because it was transitioning its files to the cloud and didn’t have clean backups.
The challenge for defenders is to identify what high-value targets might be next. A lot of defense is about protecting what we think are high-value systems, but we’re not always assessing value in the same way as the attackers do. What’s next?
Anyone who works in privacy is familiar with the term "data shadow": the digital record created by our transactions, our travels, our online activities. But where did the phase come from? Who used it first?
A number of authors have attributed it to Alan Westin, whose seminal book Privacy and Freedom (largely a report on the work of the Committee on Science and Law of the Association of the Bar of the City of New York) set the stage for most modern discussions of privacy. I followed the lead of some other authors, e.g., Kevin Keenan’s 2005 book Invasion of Privacy: A Reference Handbook in saying that the phrase appeared in Privacy and Freedom. Simson Garfinkel’s 2000 book Database Nation is more circumspect, attributing it to Westin in the 1960s, but not citing a particular work. However, repeated searches of Westin’s works do not turn up that phrase.
Other authors point to different origins. Harvey Choldin says that the term is used by German sociologists, and cites an interview. Can we push it back further?
Google Scholar turned up a 1983 reference in Politics on a Microchip by Edwin Black. A Google Ngram search by Garfinkel shows nothing before 1976. But a more complete Google search turned up a 1974 usage by Kirstein Aner [sic] in New Scientist. They quote her as saying
Every single person will be closely followed all his life by his own data shadow where everything he has ever done, learnt, bought, achieved, or failed to achieve is expressed in binary numbers for eternity. His shadow will often be taken for himself and we will have to pay for all the inaccuracies as if they were his own; he will never know who is looking at is shadow or why.Kerstin Anér was a member of the Swedish legislature.
At that point, I asked a Swedish friend for help. He found some useful references to "dataskugga". Page 231 of Kretslopp Av Data says, via Google Translate:
By the term "data shadow" Kerstin Anér meant "the two-dimensional image of the individual that is evoked through one or more data registers" or "the series of crosses in different boxes that a living person is transformed into in the data registers and which tends by those in power to be treated as more real and more interesting than the living man himself.The footnote for the quote points to Anér’s 1975 book Datamakt (Computer Power) and says (again via Google Translate):
Anér referred to in the article [unsigned], “Data law must protect privacy. But we are still sold as issues ”, Dagens Nyheter, 13 July 1973; Ten years later, the term was still in use: Calle Hård, "The story of the data shadow 600412– 5890", Aftonbladet, 11 March 1982.
There was one more tantalizing reference. Google Books turned up a reference to "data shadow" in OECD Informatics with a possible date of 1971. When I finally obtained access to Volume 12, I found that it was from 1978—in an article by Anér.
That’s as far back as I’ve been able to trace it: Kerstin Anér, probably in 1972 or 1973. Does anyone have any earlier uses, in any language?
Acknowledgments: Patrik Fältström did the Swedish research. Tarah Wheeler and Matthew Nuding tracked down the OECD volume. Simson Garfinkel also did assorted web searches.
As most Americans who have received a Covid-19 vaccination—or who have tried to schedule one—know, attempting to make an appointment is painful. Here is a thread describing the ordeal in Maryland, but it’s broadly similar elsewhere. In New York City, where I live, vaccines are administered by the city, the state, several different pharmacy chains, countless local pharmacies, some major medical centers, and more, and you have to sign up with each of them separately to see if any appointments are available. Here, the situation was so crazy that unofficial meta-sites have sprung up. The New York City Council passed a law requiring the city to create an official central web site.
It didn’t have to be this way, if proper planning and development had started last year, at the national level, but of course it didn’t. The answer is simple, at least conceptually: standardized interfaces, sometimes called APIs (application program interfaces) or network protocols.
Every large medical provider has its own scheduling system. These exist, they’re complex, they may be linked to patient records, staff availability, etc. That includes pharmacy chains—I had to use such a system to schedule my last flu shot. It isn’t reasonable to ask such organizations to replace their existing scheduling systems with a new one. That, however isn’t necessary. It is technically possible to design an API that a central city or county scheduling web page could implement. Every vaccination provider in the jurisdiction could use their half of the API to interface between their own scheduling systems and the central site. The central site could be designed to handle load, using well-understood cloud computing mechanisms. Indepdendent software vendors could develop scheduling packages for non-chain pharmacies that don’t have their own. It all could have been very simple for people who want to be vaccinated.
Mind you, designing such an API is not a trivial task. Think of all of the constraints that have to be communicated: age, job, residence, place of work, what eligibility categories a provider will accept, and more. There need to be mechanisms for both individuals to schedule shots for themselves or their families, and for operators of city-run phone banks to make appointments for the many people who don’t have computer access of their own, or who can’t cope with a web site that may have been poorly designed. You need logins, "I forgot my password", wait lists, people who can show on short notice lists, cancellation, queries, and more. Nevertheless, this is something that the tech industry knows how to do. (I used to be part of the leadership of the Internet Engineering Task Force, which standardizes many of the network protocols we all use. That’s why I, on a Mac, can send email via a Linux web server to a Windows user, which they may view using a platform-independent web browser: there are standardized ways to do these things. Protocols can be simple or complex, but they’re routinely designed.)
And once you have an API, software developers have to implement it, for central sites, for small providers, and as interfaces to existing scheduling systems. That latter is itself likely to be a complex task, since you’re touching a large, probably messy codebase. Nevertheless, this is all routine in the software business.
All of this could have been done. It should have been done; the situation—high demand for a limited resource, with various priority levels—was predictable long ago. As soon as it was apparent that the world was confronting a deadly pandemic that could be dealt with by a vaccine—that SARS-CoV-2 had that spreading potential was understood by some in January 2020, and Moderna had its vaccine designed two days after they received the virus’ genome sequence—the Department of Health and Human Services should have convened an API design group. In parallel with that, developers could have started on their own software while waiting for the API specs to be finalized. I’ll take it a step further: we should still go ahead with this effort. We may need it for booster shots or to cope with new variants; we’ll certainly need it for the next pandemic.
And we may need something similar now for vaccination passports. There is not yet a consensus on their desirability; that said, many places already require them. Almost certainly, the little card you got from your provider is inadequate in the long run. You’ll have to request the standardized version, either on paper or for the many different apps you’ll likely need—and that many need a similar API.
In the wake of recent high-profile security incidents, I started wondering: what, generally speaking, should an organization’s security priorities be? That is, given a finite budget—and everyone’s budget is finite—what should you do first? More precisely, what security practices or features will give you the most protection per zorkmid? I suggested two of my own, and then asked my infosec-heavy Twitter feed for suggestions.
I do note that I’m not claiming that these are easy; indeed, many are quite hard. Nevertheless, they’re important.
I started with my own top choices.
- Install patches
- Few organizations are hit by zero-days. That doesn’t mean it can’t happen, and of course if you’re the target of a major intelligence agency the odds go up, but generally speaking, zero-days are far from the highest risk. Installing security patches promptly is a good defense.
- Use 2FA
- Good two-factor authentication—that often means
FIDO2—is basically a necessity for access from the outside.
Ordinary passwords can be phished, captured by keystroke
loggers, guessed, and more. While there are many forms of
2FA, FIDO2 ties the session to the far side’s identity,
providing very strong protection against assorted attacks.
This is harder to pull off internally—some form of single sign-on is a necessity; no one wants to type their password and insert their security key every time they need to open a file that lives on a server.
The next suggestion is one I should have thought of but didn’t; that said, I wholeheartedly agree with it.
- Know what computers you have, and what software they run. If you don’t know what you’ve got (and who owns it), you don’t know whom to alert when a security vulnerability pops up. Consider, for example, this new hole in VMware. Do you know how many VMware servers you have? If you ran the corporate security group and saw that alert, could you rapidly notify all of the responsible system administrators? Could you easily track which servers were upgraded, and when?
The next set of answers have to do with recovery: assume that you will suffer some penetration. Now what?
- Have good backups, and make sure that at least one
current-enough set is offline, as protection against
I would add: test recovery. I’ve seen far too many situations where backups were, for some reason, incorrect or unusable. If you don’t try them out, you have no reason to think that your backups are actually useful for anything.
- If you don’t have good logs, you won’t know what
happened. You may not even know if anything
What should you log? As much as you can—disk space is cheap. Keep logs on a locked-won log server, and not on a machine that might otherwise be targeted. Logs from network elements, such as switches and routers (hint: Netflow), are especially valuable; ditto DNS logs. These show who has connected where, invaluable if you’re trying to trace the path of an intruder.
- Your internal network should be segmented. The initial
point of entry of an attacker is probably not
their actual goal, which implies the need for lateral
movement. Internal segmentation is a way to hinder
and perhaps detect such movements.
Note that segmentation is not just firewalls, though those are probably necessary. You should also have separate authentication domains. Look at what happened to Maersk in the NotPetya attack:
Maersk’s 150 or so domain controllers were programmed to sync their data with one another, so that, in theory, any of them could function as a backup for all the others. But that decentralized backup strategy hadn’t accounted for one scenario: where every domain controller is wiped simultaneously. “If we can’t recover our domain controllers,” a Maersk IT staffer remembers thinking, “we can’t recover anything.”
After a frantic search that entailed calling hundreds of IT admins in data centers around the world, Maersk’s desperate administrators finally found one lone surviving domain controller in a remote office—in Ghana. At some point before NotPetya struck, a blackout had knocked the Ghanaian machine offline, and the computer remained disconnected from the network. It thus contained the singular known copy of the company’s domain controller data left untouched by the malware—all thanks to a power outage. “There were a lot of joyous whoops in the office when we found it,” a Maersk administrator says.
- Intrusion detection
- Monitor for attacks. There have been far too many cases—think Sony and Equifax—where the attackers were inside for quite a while, exfiltrated a lot of data, and weren’t detected. Why not? Note that your logs are a very useful source of information here—if you actually have the tools to look at them automatically and frequently.
- Harden machines for outside life
- That is, assume that some machines are going to be
seriously attacked as if it were on the open
Internet. I’m not saying you shouldn’t have
a firewall, but again, assume lateral movement
by an adversary who has already gotten past that
This isn’t easy, even by the standards I’ve set by warning that some of these things are hard. Do you encrypt all internal links? It’s a good idea, but imposes a lot of operational overhead, e.g., renewing certificates constantly.
- Red-team external software
- If you have the resources—and few small or medium organizations do—have a group that does nothing but bang on software before it’s deployed. Fuzz it, monitor it, and more, and see what breaks and what attempts are logged. Note that this is not the same as penetration-testing a company—that latter is testing how things are put together, rather than the individual pieces.
Some suggestions I can’t endorse wholeheartedly, even though in the abstract they’re good ideas.
- User education
- I’m all in favor of teaching people about the bad things that can happen, and in my experience you get a lot more cooperation from your user communityu when they know what the issues are. Don’t waste your time on rote rules ("pick strong passwords, with at least three upper-case letters, two heiroglyphics, at least one symbol from a non-human language, and a character from an ancient Etruscan novel"), or on meaningless, non-actionable tropes ("don’t click on suspicious links"). And don’t blame users for what are actually software failures—in an ideal world, there would be no risk in opening arbitrary attachments or visiting "dangerous" web sites.
- Proper, centralized access control
- If only—but the central security group has no idea who is really supposed to have permissions on every last server that’s part of a partnership with an external party.
Note how much of this requires good system administration and good management. If management won’t support necessary security measures, you’ll lose. And a proper sysadmin group, with enough budget and clout, is at the heart of everything.
I do want to thank everyone who contributed ideas, though of course many ideas where suggested by multiple people. The full Twitter thread is here.
Back when the world and I were young and I took my first international trip, I was told that I needed to obtain and carry an International Certificate of Vaccination (ICV).
With a bit of luck, we’ll have a vaccine soon for Covid-19. It strikes me as quite likely that soon after general availability, many countries will require proof of vaccination before you can enter, a requirement that might especially apply to Americans, given the disease rates here. But there’s a problem: how will border guards know if the certificate is genuine?
This is a very odd time in the U.S. Far too many people think that the disease is a hoax, and resist requirements to wear masks. More seriously, there are fraudulent "Freedom to Breathe Agency" cards that purport to exempt the bearer from mask-wearing requirements. It does not take any particular stretch of the imagination to suspect that we’ll see an outbreak of fake ICVs.
The obvious answer is some sort of unforgeable digital credential. Should this be a separate government-run database? More precisely, there would be many of them, run by governments around the world. Should WHO run it? How do these databases get populated? Are they linked to folks’ electronic health records? Employers and schools are likely to want similar verification—but how much of a person’s medical records should they have access to? (My university is going to require flu shots for all students.)
There are a lot of ways to get this wrong, including not doing anything to ensure that such certificates are genuine. But there are also risks if it’s done incorrectly. Are people working on this issue?
If you read this blog, you’ve probably heard by now about the massive Twitter hack. Briefly, many high-profile accounts were taken over and used to tweet scam requests to send Bitcoins to a particular wallet, with the promise of double your money back. Because some of the parties hit are sophisticated and security-aware, it seems unlikely that the attack was a straightforward one directly on these accounts. Speculation is that a Twitter administrative account was compromised, and that this was used to do the damage.
The notion is plausible. In fact, that’s exactly what happened in 2009. The result was a consent decree with the Federal Trade Commission. If that’s what has happened again, I’m sure that the FTC will investigate.
Again, though, at this point I do not know what happened. As I’ve written, it’s important that the community learn exactly what happened. Twitter is a sophisticated company; was the attack good enough to evade their defenses? Or did they simply drop their guard?
Jack Dorsey, the CEO of Twitter, tweeted
Twitter has become a crucial piece of the communications infrastructure; it’s even used for things like tornado alerts.
A few months ago, there was a lot of discussion that despite its claims, Zoom did not actually offer end-to-end encryption. They’re in the process of fixing that, which is good, but that raises a deeper question: why trust their code? (To get ahead of myself, this blog post is not about Zoom.)
In April, I wrote
As shown by Citizen Lab, Zoom’s code does not meet that definition:If Zoom has the key but doesn’t abuse it, there isn’t a problem, right?
By default, all participants’ audio and video in a Zoom meeting appears to be encrypted and decrypted with a single AES-128 key shared amongst the participants. The AES key appears to be generated and distributed to the meeting’s participants by Zoom servers.
Zoom has the key, and could in principle retain it and use it to decrypt conversations. They say they do not do so, which is good, but this clearly does not meet the definition [emphasis added]: no third party, even the party providing the communication service, has knowledge of the encryption keys.”
Let’s fast-forward to when they deploy true end-to-end encryption. Why do we trust their code not to leak the secret key? More precisely, what is the difference between the two scenarios? If they’re honest and competent, the central site won’t leak the key in today’s setup, nor will the end systems in tomorrow’s. If they’re not, either scenario is problematic. What is the difference? True end-to-end feels more secure, but why?
The answer, I think, is illustrated by the Lavabit saga:
The federal agents then claimed that their court order required me to surrender my company’s private encryption keys, and I balked. What they said they needed were customer passwords – which were sent securely – so that they could access the plain-text versions of messages from customers using my company’s encrypted storage feature.(Btw, Edward Snowden was the target of the investigation.) Lavabit was a service that was secure—until one day, it wasn’t. Its security properties had changed.
I call this the "trust binding" problem. That is, at a certain point, you decide whether to trust something. In the the two scenarios I described at the start, the trust decision has to be made every time you interact with the service. Maybe today, the provider is honest and competent; tomorrow, it might not be, whether due to negligence or compulsion by some government. By contrast, when the essential security properties are implemented by code that you download once, you only have to make your decision once—and if you were right to trust the provider, you will not suddenly be in trouble if they later turn incompetent or dishonest, or are compelled by a government to act against your interests.
Put another way, a static situation is easier to evaluate than a dynamic one. If a system was secure, it will remain secure, and you don’t have to revisit your analysis.
Of course, it cuts both ways: systems are often insecure or otherwise buggy as shipped, and it’s easier for the vendor to fix things in a dynamic environment. Furthermore, if you ever install patches for a static environment you have to make the trust decision again. It’s the same as with the dynamic options, albeit with far fewer decisions.
Which is better, then? If the vendor is trustworthy and you don’t face a serious enemy, dynamic environments are often better: bugs get fixed faster. That’s why Google pushes updates to Chromebooks and why Microsoft pushes updates to consumer versions of Windows 10. But if you’re unsure—well, static situations are easier to analyze. Just be sure to get your analysis right.
Last year, I wondered if Facebook was going to use metadata to detect possible misbehavior online. We now know that the answer is yes. Read the full article for details, but briefly, they’re using (of course) machine learning and features like message time, communication patterns compared with the social graph, etc. To give just one example, they can look for “an adult sending messages or friend requests to a large number of minors” as a possible sign of a pedophile.
Some of the effort is education: “Our new safety notices also help educate people on ways to spot scams or imposters and help them take action to prevent a costly interaction.” I have grave doubts that that will work well, but we’ll see.
The hard part is going to be knowing how well it works. How will they know about false positives, situations where they flag a communication as suspect but it really isn’t? For that matter, how will they know about true positives? I can think of possible answers, but I’ll wait for further details.
Not at all to my surprise, people are reporting trouble with an online site for applying for loans under the Paycheck Protection Program (PPP). This is software that was built very quickly but is expected to cope with enormous loads. That’s a recipe for disaster, and it’s less surprising that there have been problems than it would be for there to be no problems.
We saw this happen with the sign-up for Obamacare; I wrote similar things then. Software development is hard; rapidly building any system, let alone one that has to run at scale, was, is, and will likely remain difficult. That’s independent of which political party is in charge at the time.
That we can’t do software engineering perfectly doesn’t mean that it can’t be done better. Healthcare.gov was rescued fairly quickly, when the administration brought in a management team that understood software engineering. The same applies here: competent management matters.
As anyone reading this blog assuredly knows, the world is in the grip of a deadly pandemic. One way to contain it is contact-tracing: finding those who have been near infected people, and getting them to self-quarantine. Some experts think that because of how rapidly newly infected individuals themselve become contagious, we need some sort of automated scheme. That is, traditional contact tracing is labor-intensive and time-consuming, time we don’t have here. The only solution, they say, is to automate it, probably by using the cell phones we all carry.
Naturally, privacy advocates (and I’m one) are concerned. Others, though point out that we’ve been sharing our location with advertisers; why would we not do it to save lives? Part of the answer, I think, is that people know they’ve been misled, so they’re more suspicious now.
As Joel Reidenberg and his colleagues have pointed out, privacy policies are ambiguous, perhaps deliberately so. One policy they analyzed said
“May”? Do you collect it or not? "As necessary"? “To administer”? What do those mean?
- “Depending on how you choose to interact with the Barnes & Noble enterprise, we may collect personal information from you…”
- “We may collect personal information and other information about you from business partners, contractors and other third parties.”
- “We collect your personal information in an effort to provide you with a superior customer experience and, as necessary, to administer our business”
The same lack of clarity is true of location privacy policies. The New York Times showed that some apps that legitimately need location data are actually selling it, without making that clear:
Society is paying the price now. The lack of trust built up by 25 years of opaque web privacy policies is coming home to roost. People are suspicious of what else will be done with their data, however important the initial collection is.
Can this be salvaged? I don’t know; trust, once forfeited, is awfully hard to regain. At a minimum, there need to be strong statutory guarantees:
- The information collected will only be used for contact tracing;
- It will not be available to anyone else, including law enforcement, for any reason whatsoever;
- There are both criminal and civil penalties for unauthorized collection or use of such data, e.g., by a store;
- There is a private right of action as well as city, state, and Federal enforcement;
- That class action suits to enforce this are permitted, regardless of terms and conditions requiring arbitration.
I don’t know if even this will suffice—as I said, it’s hard to regain trust. But passing a strong Federal privacy law might make things easier when the next pandemic hits—and from what I’ve read, that’s only a matter of time.
(There’s a lot more to be said on this topic, e.g., should a tracking app be voluntary or mandatory? The privacy advocate in me says yes; the little knowledge I have of epidemiology makes me think that very high uptake is necessary to gain the benefits.)