In Memoriam: Frederick P. Brooks, Jr. --- Personal Recollections

18 November 2022

Fred Brooks passed away on November 17. He was a giant in computer science and a tremendous inspiration to many of us.

Brooks is famous for many things. Many people know him best as the author of The Mythical Man-Month, his musings on software engineering and why it’s so very hard. Some of his prescriptions seem quaint today—no one these days would print out documentation on microfiche every night to distribute to developers—but his observations about the problems of development remain spot-on. But he did so much more.

He started his career at IBM. One of his early assignments was to NSA; among other things, when back at IBM he worked on the design of Harvest, the character-processing auxiliary processor to the IBM 7030. (When I took computer architecture from him, I learned to recognize some characteristics that made for a good cryptanalytic machine. He, of course, was working back in character cipher days, but it’s remarkable how many of those same characteristics are useful for attacking modern ciphers.)

He was also a lead on a failed project, the IBM 8000 series. He tried to resign from IBM after it failed; Watson replied, "I just spent a billion dollars educating you; I’m not letting you go now!"

He then headed the project that designed and built the IBM S/360 series of mainframes. It was an audacious concept for the time—five different models, with vastly different prices and performance characteristics, but all sharing (essentially) the same instruction set. Furthermore, that instruction set was defined by the architecture, not by some accident of wiring. (Fun fact: the IBM 7094 (I think it was the 7094 and not the 7090) had a user-discovered "Store Zero" instruction. Some customers poked around and found it, and it worked. The engineers checked the wiring diagram and confirmed that it should, so IBM added it to the manual.) There were a fair number of interesting aspects to the architecture of the S/360, but Brooks felt that one of the most important contributions was the 8-bit byte (and hence the 32-bit word).

He went on to manage the team that developed OS/360. He (and Watson) regarded that as a failure; they both wondered why, since he had managed both. That’s what led him to write The Mythical Man-Month.

He left IBM because he felt that he was Called to start a CS department. This was an old desire of his; he always knew that he was going to teach some day. (Brooks was devoutly religious. Since I don’t share his beliefs, I can’t represent them properly and hence will say nothing more about them. I will note that in my experience at least, he respected other people’s sincere beliefs, even if he disagreed.)

I started grad school at Chapel Hill a few years later. Even though he was the department chair, with many demands on his time—he had a row of switch-controlled clocks in his office, so that he could track how much time he spent on teaching, research, adminstration, etc.—he taught a lot. I took four courses from him: software engineering (the manuscript for The Mythical Man-Month was our text!), computer architecture, seminar on professional practice, and seminar on teaching. I still rely on a lot of what I learned from him, lo these many years ago.

Late one spring, he asked me what my summer plans were. "Well, Dr. Brooks, I’d like to teach." (I’d never taught before but knew that I wanted to, and in that department in that time, PhD students had to teach a semester of introductory programming, with full classroom control. I ended up teaching it four times, because I really liked teaching.)

Brooks demurred: "Prof. X’s project is late and has deliverables due at the end of the summer; I need you to work on it."

"Dr. Brooks, you want to add manpower to a late project?" (If you’ve read The Mythical Man-Month, you know the reference—and if you haven’t read it yet, you should.)

He laughed, told me that this was a special case, and that I should do it and in the fall I could have whatever assistantship I wanted. And he was 100% correct: it was a special case, where his adage didn’t apply.

Brooks was dedicated to education and students. When it came to faculty hiring and promotion, he always sought graduate student input in an organized setting. He called those meetings SPQR, where the faculty was the Senate. In one case that I can think of, I suspect that student input seriously swayed and possibly changed the outcome.

He had great confidence in me, even at low points for me, and probably saved my career. At one in my odd grad school sojourn, I needed to take a year off. I had a faculty job offer from another college, but Brooks suspected, most likely correctly, that if I left, I’d never come back and finish my PhD. He arranged for me to be hired in my own department for a year—I went from PhD student to faculty member and back to student again. This was head-spinning, and I didn’t adapt well at first to my faculty status, especially in the way that I dressed. Brooks gently commented to me that it was obvious that I still thought of myself as a student. He was right, of course, so I upgraded my wardrobe and even wore a tie on days that I was teaching. (I continued to dress better on class days until the pandemic started. Maybe I’ll resume that practice some day…)

At Chapel Hill, he switched his attention to computer graphics and protein modeling. He’d acquired a surplus remote manipulator arm; the idea was that people could use it to "grab" atoms and move them, and feel the force feedback from the varying charge fields. From there, it was a fairly natural transition to some of the early work in VR.

Two more anecdotes and then I’ll close. Shortly after the department had gotten its first VAX computers, a professor who didn’t understand the difference between real memory and virtual memory fired up about 10 copies of a large program. The machine was thrashing so badly that no one could get anything done, including, of course, that professor. I was no longer an official sysadmin but I still had the root password. I sent a STOP signal to all but one copy of the program, then very nervously sent an email to that professor and Brooks. Brooks’ reply nicely illustrates his management philosophy: "if someone is entrusted with the root password, they have not just the right but the responsibility to use it when necessary."

My last major interaction with Brooks while I was a student was when I went to him to discuss two job offers, from IBM Research and Bell Labs (not Bell Labs Research; that came a few years later). He wouldn’t tell me which to pick; that wasn’t his style, though he hinted that Bell Labs might be better. But his advice is one I often repeat to my students: take a sheet of paper, divide it into two columns, and as objectively as you can list the pluses and minuses of each place. Then sleep on it and in the morning go with your gut feeling.

Frederick P. Brooks, Jr.: May his memory be for a blessing.

Attacker Target Selection

5 July 2021

There’s a bit of a debate going on about whether the Kaseya attack exploited a 0-day vulnerability. While that’s an interesting question when discussing, say, patch management strategies, I think it’s less important to understanding attackers’ thinking than target selection. In a nutshell, when it comes to target selection, the attackers have outmaneuvered defenders for almost 30 years.

In early 1994, CERT announced that attackers were planting network eavesdropping tools—"sniffers"—on SunOS hosts, and were using these to collect passwords. What was known in the security community but not mentioned in the CERT advisory was how strategically these sniffers were placed: on major Ethernet segments run by ISPs. Back then, in the era before switched Ethernet was the norm, an Ethernet network was a single domain; every host could see every packet. That was great for network monitoring—and great for surreptitious eavesdropping. All security people understood both halves of that equation, but the attackers realized that backbone links offered far more opportunities to collect useful passwords than ordinary sites’ networks.

Fast-forward a decade to 2003, when the Sobig.f virus made its appearance. In the words of a retrospective look, "[Tt]he whole Sobig family was incredibly significant because that was the point where spam and viruses converged." Again, that hacked computers could be used for profit wasn’t a new idea, but few defenders realized until too late that the transition had taken place in the real world.

A few years later, credit card payment processors were hit, most famously Heartland Payment Systems. This is an industry segment most people didn’t realize existed—weren’t charges and payments simply handled by the banks? But the attackers knew, and went after such companies.

The current cast of malware operators are again ahead of the game, by going after cyber-infrastructure companies such as SolarWinds and Kaseya. This gives them leverage: go after one company; penetrate thousands. The payoff might be intelligence, as in the SolarWinds hack, or it might be financial, as with Kaseya. And the ransomware perpetrators are apparently being very strategic:

Cyber-liability insurance comes with a catch, however: It may make you more vulnerable to a ransomware attack. When cyber-criminals target cyber-insurance companies, they then have access to a list of their insured clients, which cyber-criminals can then use to their advantage to demand a ransom payment that mirrors the limit of a company’s coverage.
Attackers even exploit timing:
DarkSide, which emerged last August, epitomized this new breed. It chose targets based on a careful financial analysis or information gleaned from corporate emails. For instance, it attacked one of Tantleff’s clients during a week when the hackers knew the company would be vulnerable because it was transitioning its files to the cloud and didn’t have clean backups.

The challenge for defenders is to identify what high-value targets might be next. A lot of defense is about protecting what we think are high-value systems, but we’re not always assessing value in the same way as the attackers do. What’s next?

Where Did "Data Shadow" Come From?

26 June 2021

Anyone who works in privacy is familiar with the term "data shadow": the digital record created by our transactions, our travels, our online activities. But where did the phase come from? Who used it first?

A number of authors have attributed it to Alan Westin, whose seminal book Privacy and Freedom (largely a report on the work of the Committee on Science and Law of the Association of the Bar of the City of New York) set the stage for most modern discussions of privacy. I followed the lead of some other authors, e.g., Kevin Keenan’s 2005 book Invasion of Privacy: A Reference Handbook in saying that the phrase appeared in Privacy and Freedom. Simson Garfinkel’s 2000 book Database Nation is more circumspect, attributing it to Westin in the 1960s, but not citing a particular work. However, repeated searches of Westin’s works do not turn up that phrase.

Other authors point to different origins. Harvey Choldin says that the term is used by German sociologists, and cites an interview. Can we push it back further?

Google Scholar turned up a 1983 reference in Politics on a Microchip by Edwin Black. A Google Ngram search by Garfinkel shows nothing before 1976. But a more complete Google search turned up a 1974 usage by Kirstein Aner [sic] in New Scientist. They quote her as saying

Every single person will be closely followed all his life by his own data shadow where everything he has ever done, learnt, bought, achieved, or failed to achieve is expressed in binary numbers for eternity. His shadow will often be taken for himself and we will have to pay for all the inaccuracies as if they were his own; he will never know who is looking at is shadow or why.
Kerstin Anér was a member of the Swedish legislature.

At that point, I asked a Swedish friend for help. He found some useful references to "dataskugga". Page 231 of Kretslopp Av Data says, via Google Translate:

By the term "data shadow" Kerstin Anér meant "the two-dimensional image of the individual that is evoked through one or more data registers" or "the series of crosses in different boxes that a living person is transformed into in the data registers and which tends by those in power to be treated as more real and more interesting than the living man himself.
The footnote for the quote points to Anér’s 1975 book Datamakt (Computer Power) and says (again via Google Translate):
Anér referred to in the article [unsigned], “Data law must protect privacy. But we are still sold as issues ”, Dagens Nyheter, 13 July 1973; Ten years later, the term was still in use: Calle Hård, "The story of the data shadow 600412– 5890", Aftonbladet, 11 March 1982.

There was one more tantalizing reference. Google Books turned up a reference to "data shadow" in OECD Informatics with a possible date of 1971. When I finally obtained access to Volume 12, I found that it was from 1978—in an article by Anér.

That’s as far back as I’ve been able to trace it: Kerstin Anér, probably in 1972 or 1973. Does anyone have any earlier uses, in any language?

Acknowledgments: Patrik Fältström did the Swedish research. Tarah Wheeler and Matthew Nuding tracked down the OECD volume. Simson Garfinkel also did assorted web searches.

Vaccine Scheduling, APIs, and (Maybe) Vaccination Passports

2 April 2021

As most Americans who have received a Covid-19 vaccination—or who have tried to schedule one—know, attempting to make an appointment is painful. Here is a thread describing the ordeal in Maryland, but it’s broadly similar elsewhere. In New York City, where I live, vaccines are administered by the city, the state, several different pharmacy chains, countless local pharmacies, some major medical centers, and more, and you have to sign up with each of them separately to see if any appointments are available. Here, the situation was so crazy that unofficial meta-sites have sprung up. The New York City Council passed a law requiring the city to create an official central web site.

It didn’t have to be this way, if proper planning and development had started last year, at the national level, but of course it didn’t. The answer is simple, at least conceptually: standardized interfaces, sometimes called APIs (application program interfaces) or network protocols.

Every large medical provider has its own scheduling system. These exist, they’re complex, they may be linked to patient records, staff availability, etc. That includes pharmacy chains—I had to use such a system to schedule my last flu shot. It isn’t reasonable to ask such organizations to replace their existing scheduling systems with a new one. That, however isn’t necessary. It is technically possible to design an API that a central city or county scheduling web page could implement. Every vaccination provider in the jurisdiction could use their half of the API to interface between their own scheduling systems and the central site. The central site could be designed to handle load, using well-understood cloud computing mechanisms. Indepdendent software vendors could develop scheduling packages for non-chain pharmacies that don’t have their own. It all could have been very simple for people who want to be vaccinated.

Mind you, designing such an API is not a trivial task. Think of all of the constraints that have to be communicated: age, job, residence, place of work, what eligibility categories a provider will accept, and more. There need to be mechanisms for both individuals to schedule shots for themselves or their families, and for operators of city-run phone banks to make appointments for the many people who don’t have computer access of their own, or who can’t cope with a web site that may have been poorly designed. You need logins, "I forgot my password", wait lists, people who can show on short notice lists, cancellation, queries, and more. Nevertheless, this is something that the tech industry knows how to do. (I used to be part of the leadership of the Internet Engineering Task Force, which standardizes many of the network protocols we all use. That’s why I, on a Mac, can send email via a Linux web server to a Windows user, which they may view using a platform-independent web browser: there are standardized ways to do these things. Protocols can be simple or complex, but they’re routinely designed.)

And once you have an API, software developers have to implement it, for central sites, for small providers, and as interfaces to existing scheduling systems. That latter is itself likely to be a complex task, since you’re touching a large, probably messy codebase. Nevertheless, this is all routine in the software business.

All of this could have been done. It should have been done; the situation—high demand for a limited resource, with various priority levels—was predictable long ago. As soon as it was apparent that the world was confronting a deadly pandemic that could be dealt with by a vaccine—that SARS-CoV-2 had that spreading potential was understood by some in January 2020, and Moderna had its vaccine designed two days after they received the virus’ genome sequence—the Department of Health and Human Services should have convened an API design group. In parallel with that, developers could have started on their own software while waiting for the API specs to be finalized. I’ll take it a step further: we should still go ahead with this effort. We may need it for booster shots or to cope with new variants; we’ll certainly need it for the next pandemic.

And we may need something similar now for vaccination passports. There is not yet a consensus on their desirability; that said, many places already require them. Almost certainly, the little card you got from your provider is inadequate in the long run. You’ll have to request the standardized version, either on paper or for the many different apps you’ll likely need—and that many need a similar API.

Security Priorities

25 February 2021

In the wake of recent high-profile security incidents, I started wondering: what, generally speaking, should an organization’s security priorities be? That is, given a finite budget—and everyone’s budget is finite—what should you do first? More precisely, what security practices or features will give you the most protection per zorkmid? I suggested two of my own, and then asked my infosec-heavy Twitter feed for suggestions.

I do note that I’m not claiming that these are easy; indeed, many are quite hard. Nevertheless, they’re important.

I started with my own top choices.

Install patches
Few organizations are hit by zero-days. That doesn’t mean it can’t happen, and of course if you’re the target of a major intelligence agency the odds go up, but generally speaking, zero-days are far from the highest risk. Installing security patches promptly is a good defense.

Use 2FA
Good two-factor authentication—that often means FIDO2—is basically a necessity for access from the outside. Ordinary passwords can be phished, captured by keystroke loggers, guessed, and more. While there are many forms of 2FA, FIDO2 ties the session to the far side’s identity, providing very strong protection against assorted attacks.

This is harder to pull off internally—some form of single sign-on is a necessity; no one wants to type their password and insert their security key every time they need to open a file that lives on a server.

The next suggestion is one I should have thought of but didn’t; that said, I wholeheartedly agree with it.

Know what computers you have, and what software they run. If you don’t know what you’ve got (and who owns it), you don’t know whom to alert when a security vulnerability pops up. Consider, for example, this new hole in VMware. Do you know how many VMware servers you have? If you ran the corporate security group and saw that alert, could you rapidly notify all of the responsible system administrators? Could you easily track which servers were upgraded, and when?

The next set of answers have to do with recovery: assume that you will suffer some penetration. Now what?

Have good backups, and make sure that at least one current-enough set is offline, as protection against ransomware.

I would add: test recovery. I’ve seen far too many situations where backups were, for some reason, incorrect or unusable. If you don’t try them out, you have no reason to think that your backups are actually useful for anything.

If you don’t have good logs, you won’t know what happened. You may not even know if anything has happened.

What should you log? As much as you can—disk space is cheap. Keep logs on a locked-won log server, and not on a machine that might otherwise be targeted. Logs from network elements, such as switches and routers (hint: Netflow), are especially valuable; ditto DNS logs. These show who has connected where, invaluable if you’re trying to trace the path of an intruder.

Your internal network should be segmented. The initial point of entry of an attacker is probably not their actual goal, which implies the need for lateral movement. Internal segmentation is a way to hinder and perhaps detect such movements.

Note that segmentation is not just firewalls, though those are probably necessary. You should also have separate authentication domains. Look at what happened to Maersk in the NotPetya attack:

Maersk’s 150 or so domain controllers were programmed to sync their data with one another, so that, in theory, any of them could function as a backup for all the others. But that decentralized backup strategy hadn’t accounted for one scenario: where every domain controller is wiped simultaneously. “If we can’t recover our domain controllers,” a Maersk IT staffer remembers thinking, “we can’t recover anything.”

After a frantic search that entailed calling hundreds of IT admins in data centers around the world, Maersk’s desperate administrators finally found one lone surviving domain controller in a remote office—in Ghana. At some point before NotPetya struck, a blackout had knocked the Ghanaian machine offline, and the computer remained disconnected from the network. It thus contained the singular known copy of the company’s domain controller data left untouched by the malware—all thanks to a power outage. “There were a lot of joyous whoops in the office when we found it,” a Maersk administrator says.

Intrusion detection
Monitor for attacks. There have been far too many cases—think Sony and Equifax—where the attackers were inside for quite a while, exfiltrated a lot of data, and weren’t detected. Why not? Note that your logs are a very useful source of information here—if you actually have the tools to look at them automatically and frequently.

Harden machines for outside life
That is, assume that some machines are going to be seriously attacked as if it were on the open Internet. I’m not saying you shouldn’t have a firewall, but again, assume lateral movement by an adversary who has already gotten past that barrier.

This isn’t easy, even by the standards I’ve set by warning that some of these things are hard. Do you encrypt all internal links? It’s a good idea, but imposes a lot of operational overhead, e.g., renewing certificates constantly.

Red-team external software
If you have the resources—and few small or medium organizations do—have a group that does nothing but bang on software before it’s deployed. Fuzz it, monitor it, and more, and see what breaks and what attempts are logged. Note that this is not the same as penetration-testing a company—that latter is testing how things are put together, rather than the individual pieces.

Some suggestions I can’t endorse wholeheartedly, even though in the abstract they’re good ideas.

User education
I’m all in favor of teaching people about the bad things that can happen, and in my experience you get a lot more cooperation from your user communityu when they know what the issues are. Don’t waste your time on rote rules ("pick strong passwords, with at least three upper-case letters, two heiroglyphics, at least one symbol from a non-human language, and a character from an ancient Etruscan novel"), or on meaningless, non-actionable tropes ("don’t click on suspicious links"). And don’t blame users for what are actually software failures—in an ideal world, there would be no risk in opening arbitrary attachments or visiting "dangerous" web sites.

Proper, centralized access control
If only—but the central security group has no idea who is really supposed to have permissions on every last server that’s part of a partnership with an external party.

Note how much of this requires good system administration and good management. If management won’t support necessary security measures, you’ll lose. And a proper sysadmin group, with enough budget and clout, is at the heart of everything.

I do want to thank everyone who contributed ideas, though of course many ideas where suggested by multiple people. The full Twitter thread is here.

Tags: security

Covid-19 Vaccinations, Certificates, and Privacy

13 August 2020

Back when the world and I were young and I took my first international trip, I was told that I needed to obtain and carry an International Certificate of Vaccination (ICV).

Picture of an ICV
They’re not used much these days, except for travel to yellow fever areas, but they’re still around.

With a bit of luck, we’ll have a vaccine soon for Covid-19. It strikes me as quite likely that soon after general availability, many countries will require proof of vaccination before you can enter, a requirement that might especially apply to Americans, given the disease rates here. But there’s a problem: how will border guards know if the certificate is genuine?

This is a very odd time in the U.S. Far too many people think that the disease is a hoax, and resist requirements to wear masks. More seriously, there are fraudulent "Freedom to Breathe Agency" cards that purport to exempt the bearer from mask-wearing requirements. It does not take any particular stretch of the imagination to suspect that we’ll see an outbreak of fake ICVs.

The obvious answer is some sort of unforgeable digital credential. Should this be a separate government-run database? More precisely, there would be many of them, run by governments around the world. Should WHO run it? How do these databases get populated? Are they linked to folks’ electronic health records? Employers and schools are likely to want similar verification—but how much of a person’s medical records should they have access to? (My university is going to require flu shots for all students.)

There are a lot of ways to get this wrong, including not doing anything to ensure that such certificates are genuine. But there are also risks if it’s done incorrectly. Are people working on this issue?

Hot Take on the Twitter Hack

15 July 2020

If you read this blog, you’ve probably heard by now about the massive Twitter hack. Briefly, many high-profile accounts were taken over and used to tweet scam requests to send Bitcoins to a particular wallet, with the promise of double your money back. Because some of the parties hit are sophisticated and security-aware, it seems unlikely that the attack was a straightforward one directly on these accounts. Speculation is that a Twitter administrative account was compromised, and that this was used to do the damage.

The notion is plausible. In fact, that’s exactly what happened in 2009. The result was a consent decree with the Federal Trade Commission. If that’s what has happened again, I’m sure that the FTC will investigate.

Again, though, at this point I do not know what happened. As I’ve written, it’s important that the community learn exactly what happened. Twitter is a sophisticated company; was the attack good enough to evade their defenses? Or did they simply drop their guard?

Jack Dorsey, the CEO of Twitter, tweeted

Jack Dorsey tweet saying that they’ll share ’everything we can’
that they would share "everything we can". With all due respect, that doesn’t sound good enough. Other than minor details that would be useful as evidence, the security community really needs to know what went wrong—we can’t build proper defenses without that. (For that matter, I’ve even called for disclosure of near misses.)

Twitter has become a crucial piece of the communications infrastructure; it’s even used for things like tornado alerts.

Tweet noting that the National Weather Service couldn’t tweet a tornado alert
But even if it weren’t used for critical activities, it’s a major site—and the public deserves details on what went wrong.

Trust Binding

11 June 2020

A few months ago, there was a lot of discussion that despite its claims, Zoom did not actually offer end-to-end encryption. They’re in the process of fixing that, which is good, but that raises a deeper question: why trust their code? (To get ahead of myself, this blog post is not about Zoom.)

In April, I wrote

As shown by Citizen Lab, Zoom’s code does not meet that definition:

By default, all participants’ audio and video in a Zoom meeting appears to be encrypted and decrypted with a single AES-128 key shared amongst the participants. The AES key appears to be generated and distributed to the meeting’s participants by Zoom servers.

Zoom has the key, and could in principle retain it and use it to decrypt conversations. They say they do not do so, which is good, but this clearly does not meet the definition [emphasis added]: no third party, even the party providing the communication service, has knowledge of the encryption keys.”

If Zoom has the key but doesn’t abuse it, there isn’t a problem, right?

Let’s fast-forward to when they deploy true end-to-end encryption. Why do we trust their code not to leak the secret key? More precisely, what is the difference between the two scenarios? If they’re honest and competent, the central site won’t leak the key in today’s setup, nor will the end systems in tomorrow’s. If they’re not, either scenario is problematic. What is the difference? True end-to-end feels more secure, but why?

Let’s look at another scenario: encrypted email via a web browser. I’ll posit two implementations. In the first, the web site serves up custom JavaScript to do the decryption; in the second, there’s a browser plug-in that does the exact same thing. Again, the second version feels more secure, but why?

The answer, I think, is illustrated by the Lavabit saga:

The federal agents then claimed that their court order required me to surrender my company’s private encryption keys, and I balked. What they said they needed were customer passwords – which were sent securely – so that they could access the plain-text versions of messages from customers using my company’s encrypted storage feature.
(Btw, Edward Snowden was the target of the investigation.) Lavabit was a service that was secure—until one day, it wasn’t. Its security properties had changed.

I call this the "trust binding" problem. That is, at a certain point, you decide whether to trust something. In the the two scenarios I described at the start, the trust decision has to be made every time you interact with the service. Maybe today, the provider is honest and competent; tomorrow, it might not be, whether due to negligence or compulsion by some government. By contrast, when the essential security properties are implemented by code that you download once, you only have to make your decision once—and if you were right to trust the provider, you will not suddenly be in trouble if they later turn incompetent or dishonest, or are compelled by a government to act against your interests.

Put another way, a static situation is easier to evaluate than a dynamic one. If a system was secure, it will remain secure, and you don’t have to revisit your analysis.

Of course, it cuts both ways: systems are often insecure or otherwise buggy as shipped, and it’s easier for the vendor to fix things in a dynamic environment. Furthermore, if you ever install patches for a static environment you have to make the trust decision again. It’s the same as with the dynamic options, albeit with far fewer decisions.

Which is better, then? If the vendor is trustworthy and you don’t face a serious enemy, dynamic environments are often better: bugs get fixed faster. That’s why Google pushes updates to Chromebooks and why Microsoft pushes updates to consumer versions of Windows 10. But if you’re unsure—well, static situations are easier to analyze. Just be sure to get your analysis right.

Facebook, Abuse, and Metadata

24 May 2020

Last year, I wondered if Facebook was going to use metadata to detect possible misbehavior online. We now know that the answer is yes. Read the full article for details, but briefly, they’re using (of course) machine learning and features like message time, communication patterns compared with the social graph, etc. To give just one example, they can look for “an adult sending messages or friend requests to a large number of minors” as a possible sign of a pedophile.

Some of the effort is education: “Our new safety notices also help educate people on ways to spot scams or imposters and help them take action to prevent a costly interaction.” I have grave doubts that that will work well, but we’ll see.

The hard part is going to be knowing how well it works. How will they know about false positives, situations where they flag a communication as suspect but it really isn’t? For that matter, how will they know about true positives? I can think of possible answers, but I’ll wait for further details.

Software Done in a Hurry

29 April 2020

Not at all to my surprise, people are reporting trouble with an online site for applying for loans under the Paycheck Protection Program (PPP). This is software that was built very quickly but is expected to cope with enormous loads. That’s a recipe for disaster, and it’s less surprising that there have been problems than it would be for there to be no problems.

We saw this happen with the sign-up for Obamacare; I wrote similar things then. Software development is hard; rapidly building any system, let alone one that has to run at scale, was, is, and will likely remain difficult. That’s independent of which political party is in charge at the time.

That we can’t do software engineering perfectly doesn’t mean that it can’t be done better. was rescued fairly quickly, when the administration brought in a management team that understood software engineering. The same applies here: competent management matters.