Software recalls and quick fixes to safety-critical computers in robocars

While giving a talk on robocars to a Stanford class on automotive innovation on Wednesday, I outlined the growing problem of software recalls and how they might affect cars. If a company discovers a safety problem in a car's software, it may be advised by its lawyers to shut down or cripple the cars by remote command until a fix is available. Sebastian Thrun, who had invited me to address this class, felt this could be dealt with through the ability to remotely patch the software.

This brings up an issue I have written about before -- the giant dangers of automatic software updates. Automatic software updates are a huge security hole in today's computer systems. On typical home computers, there are now many packages that do automatic updates. Due to the lack of security in these OSs, a variety of companies have been "given the keys" to full administrative access on the millions of computers which run their auto-updaters. Companies which go to all sorts of lengths to secure their computers and networks routinely grant all these software companies top-level access (i.e. the ability to run arbitrary code on demand) without thinking about it. Most of these software companies are good and would never abuse this, but that doesn't mean they don't have employees who can be bribed or suborned, or security holes in their own networks which would let an attacker in to push out a malicious update automatically.

I once asked the man who ran the server room where the servers for Pointcast (the first big auto-updating application) were housed how many fingers somebody would need to break to get into his server room. "They would not have to break any. Any physical threat and they would probably get in," was the answer. This is not unusual, and often there are ways in that require far less than that.

So now let's consider software systems which control our safety. We are trusting our safety to computers more and more these days. Every elevator or airplane has a computer which could kill us if maliciously programmed. More and more cars have them, and more will over time, long before we ride in robocars. All around the world are electric devices with computer controls which could, if programmed maliciously, probably overload and start many fires, too. Of course, voting machines with malicious programs could even elect the wrong candidates and start baseless wars. (Not that I'm saying this has happened, just that it could.)

However, these systems do not have automatic update. The temptation to add it will become strong over time, both because it is cheap and because it allows safety problems to be fixed quickly, which we like for critical systems. While the internal software systems of a robocar would not be connected to the internet in a traditional way, they might be programmed to, every so often, request and accept certified updates to their firmware from the components of the car's computer systems which are connected to the net.
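To make that concrete, here is a minimal sketch of what "request and accept certified updates" could mean in code, assuming a pinned Ed25519 manufacturer key baked into the car and a hypothetical update endpoint; the URL and key bytes are placeholders, not any real manufacturer's infrastructure:

```python
# Illustrative sketch only: a firmware updater that applies images solely if
# they verify against a key pinned at manufacture. UPDATE_URL and the key
# bytes are hypothetical placeholders.
import urllib.request

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

UPDATE_URL = "https://updates.example-carmaker.com/firmware/latest"  # hypothetical
PINNED_PUBKEY = bytes.fromhex("aa" * 32)  # placeholder for the real 32-byte key

def fetch_and_verify() -> bytes | None:
    """Fetch the latest firmware image; return it only if the signature checks out."""
    image = urllib.request.urlopen(UPDATE_URL + "/image.bin").read()
    signature = urllib.request.urlopen(UPDATE_URL + "/image.sig").read()
    try:
        Ed25519PublicKey.from_public_bytes(PINNED_PUBKEY).verify(signature, image)
    except InvalidSignature:
        return None  # reject: anyone on the network path could have tampered
    return image
```

Even this much only authenticates the delivery channel; the rest of this piece is about why authenticating the channel is not nearly enough.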

Imagine a big car company with 20 million robocars on the road, and an automatic software update facility. This would allow a malicious person, if they could suborn that automatic update ability, to load in nasty software which could kill tens of millions. Not just the people riding in the robocars would be affected, because the malicious software could command idle cars to start moving and hit other cars or run down pedestrians. It would be a catastrophe of grand proportions, greater than a major epidemic or multiple nuclear bombs. That's no small statement.

There are steps that can be taken to limit this. Software updates should be digitally signed, and they should be signed by multiple independent parties. This prevents any single official party who is suborned (by being a mole, being tortured, having a child kidnapped, etc.) from sending out an update on their own. But it doesn't change the fact that the five executives who have to sign an update will still be trusting the programming team to have delivered them a safe update. Assuring that requires a major code review of every new update, by a team that carefully examines all source changes and compiles the source themselves. Right now this just isn't common practice.
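As a sketch of the multiple-signer idea (plain k-of-n signature counting, not true threshold cryptography), an update might only be accepted when enough independent, pinned keys have signed the exact same image; the signer names and key bytes below are invented for illustration:

```python
# Illustrative k-of-n quorum check: an update is accepted only if at least
# THRESHOLD of the independent signers' signatures verify over the same image.
# All key material here is placeholder data.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

SIGNER_PUBKEYS = {                # one pinned key per independent signer (placeholders)
    "release-manager":  bytes.fromhex("01" * 32),
    "safety-officer":   bytes.fromhex("02" * 32),
    "external-auditor": bytes.fromhex("03" * 32),
    "cto":              bytes.fromhex("04" * 32),
    "ceo":              bytes.fromhex("05" * 32),
}
THRESHOLD = 3  # how many distinct signers must approve

def quorum_ok(image: bytes, signatures: dict[str, bytes]) -> bool:
    """Count distinct signers whose signature over the image verifies."""
    valid = 0
    for name, sig in signatures.items():
        key = SIGNER_PUBKEYS.get(name)
        if key is None:
            continue  # unknown signer names don't count toward the quorum
        try:
            Ed25519PublicKey.from_public_bytes(key).verify(sig, image)
            valid += 1
        except InvalidSignature:
            pass
    return valid >= THRESHOLD
```

This only raises the bar if the keys really live with independent people on separate machines; it says nothing about whether the code they all signed was safe to begin with.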

However, it gets worse than this. An attacker can also suborn the development tools, such as the C compilers and linkers which build the final binaries. The source might be clean, but few companies keep perfect security on all their tools. Doing so requires that all the tool vendors pay similar attention to security in all their releases, and in all the tools they themselves use.

One has to ask if this is even possible. Can such a level of security be maintained on all the components, enough to stop a terrorist programmer or a foreign government from inserting a trojan into a tool used by a compiler vendor who then sends certified compilers to the developers of safety-critical software such as robocars? Can every machine on every network at every tool vendor be kept safe from this?

We will try, but the answer is probably not. As such, one conclusion may be that automatic updates are a bad idea. If updates spread more slowly, with the individual participation of each machine owner, there is more time to spot malicious code. It doesn't mean that malicious code can't be spread, as individual owners who install updates certainly won't be checking everything they approve. But it can stop the instantaneous spread, and give a chance to find logic bombs set to go off later.

Normally we don't want to go overboard worrying about "movie plot" threats like these. But when a single person can kill tens of millions because of a software administration practice, it starts to be worthy of notice.

Comments

How many times have we heard "switch to manual override!" in scifi movies where a malevolent force has taken over the ship? Sure, it never works in Star Trek, but it still seems like a good idea here.

Robocars should include an accessible manual brake lever (with a big IN CASE OF EMERGENCY warning sticker) that passengers can engage. That will at least make the *passengers* (and those around the cars with passengers) safer against movie-plot threats. We might also want to have a whole "manual override" backup control panel that lets one drive slowly under joystick control with the warning lights on, running different software from the main system. So if things get hairy and you get a text message telling you cars have started going crazy, there's a backup plan.

But a joystick reminds me of the Star Trek manual override which consists of pushing a button on the computer so that you can push other buttons on the computer. This won't help you against actual malicious firmware -- unless the joystick control system is a different firmware which can't be updated remotely.

In general, I have been proposing the use of some level of "minder" software which sits between the control systems and the physical systems, and also has raw access to the sensors. There could be two sorts of minders. The main use would be for new prototype robocar control systems. The minder would watch what was going on, and abort to a safe state if the prototype software does something unsafe. This means that developers of new prototypes will have to constrain them within the minder's bounds, or have the minder rewritten to accept their new techniques once they are shown to be safe.
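As a toy illustration of that interposition idea (the envelope numbers and sensor interface below are invented, not a real design), the minder might look roughly like a filter sitting between the driving software and the actuators:

```python
# Toy sketch of a "minder": an independent layer that vetoes commands it
# judges unsafe and falls back to a controlled stop. Thresholds and the
# sensor interface are hypothetical.
from dataclasses import dataclass

@dataclass
class Command:
    speed_mps: float      # requested speed, metres per second
    steering_deg: float   # requested steering angle

@dataclass
class SensorSnapshot:
    speed_mps: float           # measured speed
    nearest_obstacle_m: float  # raw range to closest object ahead

MAX_SPEED_MPS = 30.0    # hypothetical envelope the prototype must stay inside
MAX_DECEL_MPS2 = 6.0    # assumed braking capability

def stopping_distance(speed_mps: float) -> float:
    return speed_mps ** 2 / (2 * MAX_DECEL_MPS2)

def minder_filter(cmd: Command, sensors: SensorSnapshot) -> Command:
    """Pass the command through unless it leaves the safe envelope."""
    unsafe = (
        cmd.speed_mps > MAX_SPEED_MPS
        or stopping_distance(sensors.speed_mps) > sensors.nearest_obstacle_m
    )
    if unsafe:
        # Abort to a safe state: straighten out and brake to a halt.
        return Command(speed_mps=0.0, steering_deg=0.0)
    return cmd
```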

A lower level, simpler minder could exist to stop things that are indisputably unsafe, like hitting pedestrians when there are obvious other choices, or going off cliffs. However, a nasty programmer could probably figure out ways to trick the simpler minder and still do some, if less, damage.

As you indicate, the manual override, which is a good idea, does not help if the vehicles are unoccupied. They can still go on their robot rampage. It's possible that we could build minders which enforce strict rules only on vacant cars and delivery robots. After all, vacant cars don't need to be nearly as fast or drive as aggressively as passengers will demand. Vacant cars also don't mind screeching to a halt if something is amiss, though the people behind might mind.

If the malicious control system can't fool the minder into thinking the vehicle is occupied, this could keep things safe. The minder can detect weight by looking at things like power-to-acceleration ratios and various other tricks. But if you make a mistake and the minder can be fooled, you are back to the dangerous threat.
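For illustration, one such trick might be estimating mass from the drive force applied and the acceleration measured (F = ma) and comparing it with the empty weight. The figures below are invented, and the force figure would have to come from the minder's own measurements (say, motor current) rather than from the possibly compromised controller:

```python
# Sketch of an occupancy guess from force and acceleration (F = m*a).
# All numbers are illustrative, not calibrated values.
import math

EMPTY_MASS_KG = 1400.0      # hypothetical curb weight
OCCUPANCY_MARGIN_KG = 40.0  # roughly half a small adult, as a trip threshold

def estimated_mass_kg(drive_force_n: float, accel_mps2: float) -> float:
    if abs(accel_mps2) < 0.1:        # too little signal to estimate reliably
        return float("nan")
    return drive_force_n / accel_mps2

def looks_occupied(drive_force_n: float, accel_mps2: float) -> bool:
    m = estimated_mass_kg(drive_force_n, accel_mps2)
    return not math.isnan(m) and m > EMPTY_MASS_KG + OCCUPANCY_MARGIN_KG
```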

The update problem isn't limited to auto-update. With access to a clock, regular updates could cause the same sort of catastrophe. It seems to me that the best way to deal with things is to do what airliners do (they are entirely operated via fly-by-wire these days): use 3-4 redundant and different systems that vote, and send back statistics. If the company does an update (limited to one system at a time) and one of the systems is getting outvoted above some threshold, it obviously has a bug/malicious feature.
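Roughly, a sketch of that voting-plus-statistics scheme (simple median voting with an invented tolerance, not any actual avionics design) might look like this:

```python
# Minimal sketch of redundancy with voting: several independent controllers
# each propose a command, the median wins, and any controller that disagrees
# beyond a tolerance has its dissent counted so the statistics can be
# reported back to the manufacturer.
from collections import Counter
from statistics import median

TOLERANCE = 0.5                            # how far (in command units) a voter may stray
dissent_counts: Counter[int] = Counter()   # running tally per controller index

def vote(proposals: list[float]) -> float:
    """Return the median proposal and record which controllers were outvoted."""
    winner = median(proposals)
    for i, p in enumerate(proposals):
        if abs(p - winner) > TOLERANCE:
            dissent_counts[i] += 1  # this voter keeps losing: suspect a bug or trojan
    return winner

# Example: controller 2 has just been updated and is misbehaving.
print(vote([10.0, 10.1, 17.0]))   # -> 10.1
print(dissent_counts)             # -> Counter({2: 1})
```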

Of course, that doesn't help if companies want to add a significant feature, but when it comes down to it, the economy runs on trust - most people never check for bombs under the hoods of their cars.

At a fundamental level, all of these calculations are economic - people run auto-updaters because they haven't caused problems in the past. No one pays for risk-free tools for much of anything - it is simply too expensive.

"However, it gets worse than this. An attacker can also suborn the development tools, such as the C compilers and linkers which build the final binaries."

AFAIK, OpenBSD vets everything. That might be a good place to start.

"The minder would watch what was going on, and abort to a safe state if the prototype software does something unsafe."

You do realize that this is dramatically more difficult than making a safe robo-car, right? Or, to put it another way, if the minder were actually in place, creating the firmware would largely be a matter of integrating a GPS/inertial navigation system with something that continuously queries the minder.

I don't think the minder is nearly as complex as the robocar system. For example, it does no navigation, no picking where to turn or where to go. It doesn't have to figure out what people are and what their intentions are. It does not have to understand street signs or negotiate with other cars. It just has to detect when you are going to hit something, and mitigate that. It doesn't have to figure out what to do about a pedestrian running into the road or any of the unusual special cases that a perfectly safe vehicle needs to deal with. The minder is there to stop the main system from doing something obviously dumb. Malicious or buggy code won't do too much damage if it has to wait for a pedestrian to jump in the street, or another car to veer suddenly. The minder may well let you blow through a red light, as long as you aren't going to hit anybody when you do it. (There are arguments that the regular software also can ignore traffic signals if the car will not get in anybody's way in doing so.)
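To illustrate how narrow that check can be, a purely geometric test (constant-velocity closest approach, with made-up numbers) is roughly all the minder needs per tracked object; it knows nothing about signals or right-of-way:

```python
# Sketch of a purely geometric minder check: given another object's relative
# position and velocity, does our current motion bring us within a collision
# radius over the next few seconds? The constants and constant-velocity
# assumption are illustrative only.
import math

COLLISION_RADIUS_M = 2.0   # how close counts as a hit
HORIZON_S = 4.0            # how far ahead the minder looks

def will_collide(rel_x: float, rel_y: float, rel_vx: float, rel_vy: float) -> bool:
    """Closest approach under constant relative velocity, within the horizon."""
    speed_sq = rel_vx ** 2 + rel_vy ** 2
    if speed_sq < 1e-9:
        return math.hypot(rel_x, rel_y) < COLLISION_RADIUS_M
    # Time at which the separation is smallest, clamped to [0, HORIZON_S].
    t = max(0.0, min(HORIZON_S, -(rel_x * rel_vx + rel_y * rel_vy) / speed_sq))
    closest = math.hypot(rel_x + rel_vx * t, rel_y + rel_vy * t)
    return closest < COLLISION_RADIUS_M
```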

The idea of voting is indeed a good one, by the way, and the minder is probably what would count the votes. However, voting does not help us against malicious attack as much as we would like. The attacker can wait until a majority of the voters have been updated, and then it has control. Updating one system at a time does give more time to examine the problem, but it also means the auto-update can't be used to fix a safety problem quickly.

The issue with software recall is liability. A company finds a problem in the software. Their lawyers tell them, "Now that you know about this, you have to do everything in your power; otherwise, if somebody dies because of a bug you knew about, you will get huge punitive damages." So they may feel forced, if they don't have a fix yet, to take the extreme step of a software recall, shutting down or limiting the software. If this happens a lot, it's doom for the company, but more to the point doom for the industry, as the public won't tolerate products that don't work in the morning because the lawyers didn't want them to.

As for OpenBSD, it doesn't help too much. All it takes is one vulnerability, one piece of software that gets root (of any type, not just the compiler) and it can then replace the binary of the compiler. Some might argue that here open source fails you slightly, as it is easier for the attacker to make an almost identical compiler which contains their trojan. You can do this without the source, but the source makes your task easier.

I cannot imagine 'continuous' Internet-based updates to critical software systems on an ongoing basis.

VPNs, perhaps -- if combined with very robust testing and exercise of the update at several levels, and preservation of the entire 'replaced' codebase locally (so it can be restored, perhaps dynamically, without external connections of any kind).

My own systems approach to updates requires a HARD enable, like the railroad systems that use a physical keylock. The key must be inserted and turned to permit any changes to the code, and additional forms of 'system security' and assurance (individual codes, cards, biometrics, etc. to identify the person authorizing the change) are also active.

"Presumably" vehicular software will not have immediate or common-mode failure across a wide range of instantiations. I can't think offhand of an emergent issue that could not be handled as if it were a 'recall' -- e.g. taken to a service point, or having a procedure run on it. In severe cases... a manufacturer's rep or trained serviceperson goes to the vehicle's location, accesses the software with appropriate hardware security, and verifies the installed base before buttoning up the system.

Now, I can easily see online VERIFICATION of the installed code, perhaps via wireless (e.g. the quick compatibility check needed to permit access of robot vehicles to modern automated highways). I can (perhaps) see dynamically-downloadable applets for non-critical parts of the operating software... 'skins' for the various nonessential UIs in the vehicle being one example... but certainly no part of the guidance, to say nothing of collision avoidance.

I freely confess to being very old-womanish (with no disrespect at all implied toward senior members of the female gender) regarding critical-systems software. That does not extend, however, to making the very common mistake that the 'cure' for systems failure in an autonomous vehicle is to provide some sort of 'hold everything' emergency brake lever accessible to any of the passengers! (Picture what would happen if you hit your brakes as hard as possible without working brakelights in any lane of the 405... any more questions? I thought not...)
