The Economics of Hacking an Election

7 August 2018

There have been many news stories of late about potential attacks on the American electoral system. Which attacks are actually serious? As always, the answer depends on economics.

There are two assertions I’ll make up front. First, the attacker—any attacker—is resource-limited. They may have vast resources, and in particular they may have more resources than the defenders—but they’re still limited. Why? They’ll throw enough resources at the problem to solve it, i.e., to hack the election, and use anything left over for the next problem, e.g., hacking the Brexit II referendum… There’s always another target.

Second, elections are a system. That is, there are multiple interacting pieces. The attacker can go after any of them; the defender has to protect them all. And protecting just one piece very well won’t help; after all, "you don’t go through strong security, you go around it." But again, the attacker has limited resources. Their strategy, then, is to find the greatest leverage, the point to attack that costs the defenders the most to protect.

There are many pieces to a voting system; I’ll concentrate on the major ones: the voting machines, the registration system, electronic poll books, and vote-tallying software. Also note that many of these pieces can be attacked indirectly, via a supply chain attack on the vendors.

There’s another point to consider: what are the attacker’s goals? Some will want to change vote totals; others will be content with causing enough obvious errors that no one believes the results—and that can result in chaos.

The actual voting machines get lots of attention. That’s partly a hangover from the 2000 Bush–Gore election, where myriad technological problems in Florida’s voting system (e.g., the butterfly ballot in Palm Beach County and the hanging chads on the punch card voting machines) arguably cost Gore the state and hence the presidential election.

And purely computerized (DRE—Direct Recording Electronic) voting machines are indeed problematic. They make mistakes. If there’s ever a real problem, there’s nothing to recount. It’s crystal-clear to virtually every computer scientist who has studied the issue that DRE machines are a bad idea. But: if you want to change the results of a nation-wide election or set of elections in the U.S., going after DRE machines is probably the wrong idea. Why not? Because it’s too expensive.

There are many different election administrations in the U.S.: about 10,000 of them. Yes, sometimes an entire state uses the same type of machine, but each county administers its own machines. Storing the voting machines? Software updates? Done by the county. Programming the ballot? Done by the county. And if you want to attack them? Yup, you have to go to that county. And voting machines are rarely, if ever, connected to the Internet, which means that you pretty much need physical presence to do anything nasty.

Now, to be sure, if you are at the polling place you may be able to do really nasty things to some voting machines. But it’s not an attack that scales well for the attacker. It may be a good way to attack a local election, but nothing larger. A single Congressional race? Maybe, but let’s do a back-of-the-envelope calculation. The population of the U.S. is about 325,000,000. With about 10,000 election administrations, that means that each election area has about 32,500 people. (Yes, I know it’s very non-uniform. This is a back-of-the-envelope calculation.) There are 435 representatives, so each one has about 747,000 constituents, or about 23 election districts. (Again: back of the envelope.) So: you’d need a physical presence in about 23 different counties, and maybe many precincts in each county to tamper with the machines there. As I said, it’s not an attack that scales very well. We need to fix our voting machines (after all, think of Florida in 2000), but for an attacker who wants to change the result of a national election, it’s not the best approach.
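The arithmetic is easy to check. Here is a minimal sketch that uses only the round figures quoted in this post; the variable names are mine, and the inputs are as rough as the text says they are.

```python
# Back-of-the-envelope inputs, all taken from the text and all approximate.
US_POPULATION = 325_000_000
ELECTION_ADMINISTRATIONS = 10_000
HOUSE_SEATS = 435

# Average number of people served by one election administration.
people_per_administration = US_POPULATION / ELECTION_ADMINISTRATIONS

# Average number of constituents per member of the House.
constituents_per_seat = US_POPULATION / HOUSE_SEATS

# Average number of election administrations inside one House district.
administrations_per_seat = constituents_per_seat / people_per_administration

print(f"people per election administration: {people_per_administration:,.0f}")
print(f"constituents per representative:    {constituents_per_seat:,.0f}")
print(f"administrations per House district: {administrations_per_seat:.0f}")
```

Note that the last figure is really just 10,000 divided by 435; the population cancels out, so the estimate does not depend on the census number at all.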

There’s one big exception: a supply chain attack might be very feasible for a nation-state attacker. There are not many vendors of voting equipment; inserting malware in just a few places could work very well. But there’s a silver lining in that cloud: because there are many fewer places to defend than 50 states or 10,000 districts, defense is much less expensive and hence more possible—if we take the problem seriously.

And don’t forget the chaos issue. If, say, every voting machine in a populous county of a battleground state showed a preposterous result (perhaps a 100% margin for some candidate, or 100 times as many votes cast as there are registered voters in the area), no one will believe that the result is valid. What then? Rerun the voting in just that county? Here’s what the Constitution says:

The Congress may determine the Time of chusing the Electors, and the Day on which they shall give their Votes; which Day shall be the same throughout the United States.

The voter registration systems are a more promising target for an attacker. While these are, again, locally run, there is often a statewide portal to them. In fact, 38 states have or are about to have online voter registration.

In 2016, Russia allegedly attacked registration systems in a number of states. Partly, they wanted to steal voter information, but an attacker could easily delete or modify voter records, thus effectively disenfranchising people. Provisional ballots? Sure, if your polling place has enough of them, and if you and the poll workers know what to do. I’ve been a poll worker. Let’s just say that handling exceptional cases isn’t the most efficient process. And consider the public reaction if many likely supporters (based on demographics) of a given candidate are the ones who are disproportionately deleted. (Could the attackers register phony voters? Sure, but to what end? In-person voter fraud is exceedingly rare; how many times can Boris and Natasha show up to vote? Again, that doesn’t scale. That’s also why requiring an ID to vote is solving a non-problem.)

There’s another point. Voting software is specialized; its attack surface should be low. It’s possible to get that wrong, as in some now-decertified Virginia voting machines, and there’s always the underlying operating system; still, if the machines aren’t networked, during voting the only exposure should be via the voting interface.

A lot of registration software, though, is a more-or-less standard web platform, and is therefore subject to all of the risks of any other web service. SQL injection, in particular, is a very real risk. So an attack on the registration system is not only more scalable, it’s easier.
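To make the SQL injection risk concrete, here is a toy sketch using Python's built-in `sqlite3` module. The table, the function names, and the sample data are all hypothetical; this is not how any real registration system is built, only an illustration of why pasting user input into SQL text is dangerous and why bound parameters are not.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory voter roll (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE voters (id INTEGER PRIMARY KEY, name TEXT, county TEXT)"
    )
    conn.executemany(
        "INSERT INTO voters (name, county) VALUES (?, ?)",
        [("Alice", "Adams"), ("Bob", "Baker"), ("Carol", "Clark")],
    )
    conn.commit()
    return conn


def delete_voter_unsafe(conn, name):
    # VULNERABLE: the input becomes part of the SQL statement itself.
    conn.execute(f"DELETE FROM voters WHERE name = '{name}'")


def delete_voter_safe(conn, name):
    # The input is passed as a bound parameter and is only ever treated as data.
    conn.execute("DELETE FROM voters WHERE name = ?", (name,))


def count_voters(conn):
    return conn.execute("SELECT COUNT(*) FROM voters").fetchone()[0]


# Input crafted so the unsafe WHERE clause is true for every row.
malicious = "nobody' OR '1'='1"

safe_db = make_demo_db()
delete_voter_safe(safe_db, malicious)
print("rows left after safe delete:  ", count_voters(safe_db))

unsafe_db = make_demo_db()
delete_voter_unsafe(unsafe_db, malicious)
print("rows left after unsafe delete:", count_voters(unsafe_db))
```

With the parameterized version, the odd-looking name simply matches no one. With the string-pasting version, one request empties the whole table, which is exactly the "delete voter records" scenario described above.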

Before the election, voter rolls are copied to what are known as poll books. Sometimes, these are paper books; other places use electronic ones. The electronic ones are networked to each other; however, they are generally not connected to the Internet. If that networking is set up incorrectly, there can be risks; generally, though, they’re networked on a LAN. That means that you have to be at the polling place to exploit them. In other words, there’s some risk, but it’s not much greater than for the voting machines.

There’s one more critical piece: the vote-tallying software. Tallies from each precinct are transmitted to the county’s election board; there may be links to the state, to news media, etc. In other words, this software is networked and hence very subject to attack. However: this is used for the election night count; different procedures can be and often are used for the official canvass. And even without attacks, many things can go wrong:

In Iowa, a hard-to-read fax from Scott County caused election officials initially to give Vice President Gore an extra 2,006 votes. In Outagamie County, Wis., a typo in a tally sheet threw Mr. Bush hundreds of votes he hadn’t won.

But: the ability to do a more accurate count the second time around depends on there being something different to count: paper ballots. That’s what saved the day in 2000 in Bernalillo County, New Mexico. The problem: “The paper tallies, resembling grocery-store receipts, seemed to show that many more ballots had been cast overall than were cast in individual races. For example, tallies later that night would show that, of about 38,000 early ballots cast, only 25,000 were cast for Mr. Gore or Mr. Bush.” And the cause? Programming the vote-counting system:

As they worked, Mr. Lucero’s computer screen repeatedly displayed a command window offering a pull-down menu. From the menu, the two men should have clicked on "straight party." Either they didn’t make the crucial click, or they did and the software failed to work. As a result, the Accu-Vote machines counted a straight-party vote as one ballot cast, but didn’t distribute any votes to each of the individual party candidates.

To illustrate: If a voter filled in the oval for straight-party Democrat, the scanner would record one ballot cast but wouldn’t allocate votes to Mr. Gore and other Democratic candidates.

Crucially, though, once they fixed the programming they could retally those paper ballots. (By the way, programming the tallying computer can itself be complex. Bernalillo County, which had a population of 557,000 then, required 114 different ballots.)
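The Bernalillo symptom, far fewer votes in the top contest than ballots cast, is the kind of discrepancy a simple reconciliation check can flag. Here is a sketch using the approximate figures quoted above. The function names and the 5% threshold are my own illustrative assumptions, not anything from the county's actual procedures, and the 25,000 figure covers only the two major candidates, so the true undervote was somewhat smaller.

```python
def undervote_rate(ballots_cast, votes_in_contest):
    """Fraction of ballots that recorded no vote in the given contest."""
    if ballots_cast <= 0:
        raise ValueError("ballots_cast must be positive")
    return (ballots_cast - votes_in_contest) / ballots_cast


def looks_suspicious(ballots_cast, votes_in_contest, max_undervote=0.05):
    """Flag a contest whose tally does not reconcile with the ballot count.

    max_undervote is an arbitrary illustrative threshold (an assumption).
    """
    if votes_in_contest > ballots_cast:
        # More votes than ballots is impossible in a vote-for-one contest.
        return True
    return undervote_rate(ballots_cast, votes_in_contest) > max_undervote


# Approximate early-ballot figures quoted in the text.
early_ballots = 38_000
votes_for_gore_or_bush = 25_000

rate = undervote_rate(early_ballots, votes_for_gore_or_bush)
print(f"apparent undervote in the presidential race: {rate:.1%}")
print("needs investigation:", looks_suspicious(early_ballots, votes_for_gore_or_bush))
```

A check like this finds the problem but cannot fix it; the fix still depended on having paper ballots to retally.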

There’s a related issue: the systems that distribute vote totals to the world. Alaska already suffered such an attack; it could happen elsewhere, too. And it doesn’t have to be via hacking; a denial of service attack could also do the job of causing chaos.

The best way to check the ballot-counting software is risk-limiting audits. A risk-limiting audit checks a random subset of the ballots cast. The closer the apparent margin, the more ballots are checked by hand. "Risk-limiting audits guarantee that if the vote tabulation system found the wrong winner, there is a large chance of a full hand count to correct the results." And it doesn’t matter whether the wrong count was due to buggy software or an attack. In other words, if there is a paper trail, and if it’s actually looked at, via either a full hand-count or a risk-limiting audit, the tallying software isn’t a good target for an attacker. One caveat: how much chaos might there be if the official count or the recount delivers results significantly different from the election night fast count?
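This post doesn't say how such an audit is actually run, so the following is only a sketch of one published approach, a ballot-polling audit in the style of BRAVO, under simplifying assumptions that are mine: a two-candidate contest, no invalid ballots, and ballots drawn uniformly at random with replacement. The function name and parameters are hypothetical. It keeps a running likelihood ratio and stops once the sampled paper ballots give strong enough evidence that the reported winner really won; if that never happens, the answer is a full hand count.

```python
import random


def ballot_polling_audit(sampled_ballots, reported_winner_share, risk_limit=0.10):
    """Sequential ballot-polling check for a two-candidate contest (sketch).

    sampled_ballots: iterable of booleans read from randomly drawn paper
        ballots; True is a vote for the reported winner, False for the loser.
    reported_winner_share: the winner's reported share of the vote (> 0.5).
    risk_limit: largest acceptable chance of confirming a wrong outcome.

    Returns (confirmed, ballots_examined). If confirmed is False, the sample
    ran out without enough evidence and a full hand count is called for.
    """
    if not 0.5 < reported_winner_share < 1.0:
        raise ValueError("reported_winner_share must be between 0.5 and 1")
    if not 0.0 < risk_limit < 1.0:
        raise ValueError("risk_limit must be between 0 and 1")

    threshold = 1.0 / risk_limit
    ratio = 1.0  # likelihood ratio: reported result vs. an exact tie
    examined = 0
    for for_winner in sampled_ballots:
        examined += 1
        if for_winner:
            ratio *= reported_winner_share / 0.5
        else:
            ratio *= (1.0 - reported_winner_share) / 0.5
        if ratio >= threshold:
            return True, examined
    return False, examined


# Simulated example: the reported 55% share matches the (simulated) paper.
rng = random.Random(2018)
sample = (rng.random() < 0.55 for _ in range(5000))
confirmed, examined = ballot_polling_audit(sample, reported_winner_share=0.55)
print(f"confirmed={confirmed} after examining {examined} ballots")
```

The factor applied per ballot gets closer to 1 as the reported share approaches 50%, which is why a narrow margin needs many more ballots before the audit can stop.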

There’s one more point: much of the election machinery, other than the voting machines themselves, is an ordinary IT installation, and hence is subject to all of the security ills that any other IT organization can be subject to. This specifically includes things like insider attacks and ransomware, and some attackers have been targeting local governments:

Attempted ransomware attacks against local governments in the United States have become unnervingly common. A 2016 survey of chief information officers for jurisdictions across the country found that obtaining ransom was the most common purpose of cyberattacks on a city or county government, accounting for nearly one-third of all attacks.

The threat of attacks has induced at least one jurisdiction to suspend online return of absentee ballots. They’re wise to be cautious, and probably should have been that cautious to start.

Again, elections are complex. I’ve only covered the major pieces here; there are many more ways things can go wrong. But of this sample, it’s pretty clear that the attackers’ best target is the registration system. (Funny, the Russians seemed to know that, too.) Actual voting machines are not a great target, but the importance of risk-limiting audits (even if the only problem is a close race) means that replacing DRE voting machines with something that provides a paper trail is quite important. The vote-counting software is even less interesting if proper audits are done, though don’t discount the utility to some parties of chaos and mistrust.

Acknowledgments: Many thanks to Joseph Lorenzo Hall, Avi Rubin, and Matt Blaze for many helpful comments on this blog post.


Update: No sooner did I write about how impossible results could lead to chaos than this story appeared about DRE machines in Georgia: "[i]n Habersham County’s Mud Creek precinct, … 276 registered voters managed to cast 670 ballots". There were other problems, too. I suspect bugs rather than malice—but we don’t really know yet.
https://www.cs.columbia.edu/~smb/blog/2018-08/2018-08-07.html