The utilitarian math overwhelmingly says we should be aggressive in robocar development. How do we do that?
A frequent theme of mine has been my identification of "proving you have done it" as the greatest challenge in producing a safe robocar.
Others have gone further, such as the Rand study which incorrectly claims you need to drive billions of miles to prove it.
Today I want to discuss a theoretical evaluation that most would not advocate, but which helps illustrate some of the issues, and discusses the social and philosophical angles of this new thing that the robocar is -- a major life-saving technology which involves risk in its deployment and testing, but which improves faster the more risk you take.
People often begin with purely "utilitarian" terms -- what provides the greatest good for the greatest number. The utilitarian value of robocars to safety can be simply measured in how they affect the total count of accidents, in particular fatalities. The insurance industry gives us a very utilitarian metric by turning the cost of accidents into a concrete dollar figure -- about 6 cents/mile. NHTSA calculated economic costs of $277B, with social costs bumping the number to $871B -- a more shocking 29 cents/mile, which is more than the cost of depreciation on the vehicle itself in most cases.
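Those per-mile figures can be sanity-checked with simple division. In this sketch, the roughly 3 trillion annual US vehicle miles is my assumption, not a figure from the text:

```python
# Rough sanity check of NHTSA's per-mile cost figures.
# Assumption (mine): US vehicles travel about 3 trillion miles per year.
economic_cost = 277e9   # NHTSA economic cost of crashes, dollars
social_cost = 871e9     # with social costs included
annual_miles = 3.0e12   # assumed annual US vehicle miles

print(f"economic: {economic_cost / annual_miles * 100:.0f} cents/mile")  # ~9
print(f"social:   {social_cost / annual_miles * 100:.0f} cents/mile")    # ~29
```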
The TL;DR of the thesis is this: Given a few reasonable assumptions, from a strict standpoint of counting deaths and injuries, almost any delay to the deployment of high-safety robocars costs lots of lives. And not a minor number. Delay it by a year, and anywhere from 10,000-20,000 extra people will die in the USA, and 300,000 to a million around the world. Delay it by a day and you condemn 30-80 unknown future Americans and 1,000 others to death, and many thousands more to horrible injury. These people are probably strangers to you. They will not be killed directly by you, they will be killed by reckless human drivers in the future whose switch to a robocar was delayed. The fault for the accidents is on those drivers, but the fault for the fact those reckless people were driving in the first place will be on those who delayed the testing and subsequent deployment. This doesn't mean we should do absolutely anything to get these vehicles here sooner, but it does mean that proposals which risk delay should be examined with care. Normally we only ask people to justify risk. Here we must also ask to justify caution. And it opens up all sorts of complex moral problems.
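The per-day numbers follow from simple arithmetic. Here is a minimal sketch; the replacement fraction and safety multiple are illustrative assumptions of mine, not figures from the text:

```python
def deaths_per_day_of_delay(annual_deaths, replaced_fraction, safety_multiple):
    """Toy model: each year of delayed saturation costs roughly the deaths
    robocars would have prevented in one year at saturation."""
    prevented_per_year = annual_deaths * replaced_fraction * (1 - 1 / safety_multiple)
    return prevented_per_year / 365

# Illustrative assumptions: robocars replace 65% of driving and are 5x safer.
print(round(deaths_per_day_of_delay(37_000, 0.65, 5)))      # USA: ~53/day
print(round(deaths_per_day_of_delay(1_250_000, 0.65, 5)))   # worldwide
```

Varying the assumed fraction and safety multiple moves the result around, but any plausible values land in the ranges quoted above.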
This suggests a social strategy of encouraging what I will call "aggressive development/deployment with reasonable prudence."
Let's just look at the total fatalities. The goal of every robocar team is to produce a vehicle which is safer than the average human driver -- it produces better numbers than those above. Eventually much safer, perhaps 8x as much, or even more. Broadly it is hoped fatalities will drop to zero one day.
This is so much the goal that most teams won't release their vehicle for production use until they hit that target. Rand and others argue that it's really hard to prove you have hit that target, since it takes a lot of data to prove a negative like "we will cause less than one fatality for every 100 million miles."
On the other hand, we let teenagers out on the road even though they are clearly worse than that. They are among the most dangerous and reckless drivers on the road, yet they are included in those totals. We let them out because it's the only way to turn them into safer adult drivers, and because they need mobility in our world.
On the other end, especially after Uber's fatality, some are arguing that any testing of robocars with safety drivers is reckless, not just the sloppy way Uber did it.
The Rand study cited above claims that you can't have proper statistical certainty about a high safety level until you test for an untenable number of miles; this has made people speculate that deployment should be delayed until a way to prove safety is figured out. Corporate boards, making decisions on deployment, must ponder how much liability they are setting the company up for if they deploy with uncertain safety levels.
A learning safety technology
Robocars are a safety technology. They promise to make road travel safer than human driving, and seriously cut the huge toll of deaths and injuries. Unlike many other life-saving technologies, though, they "learn" through experience. I don't mean that in the sense of machine learning, I mean it in the sense that every mistake made by a robocar gets corrected by its software team, and makes the system better and safer. Every mile in a place where surprise problems can happen -- which primarily means miles on public roads -- offers some risk but also makes the system better. In fact, in many cases, taking the risk and making mistakes is the only way to find and fix some of the problems.
We might call this class of technologies "risk-improved" technologies. Other technologies also improve, but not like this. Drugs don't get better but we learn how to administer them. Medical procedures get better as we learn how to improve them and avoid pitfalls. Crashes in cars taught us how to make them safer.
Software is different though. Software improves fast. Literally, one mistake, made by one car in one place, will be known immediately to its team. They will understand it quickly and have a fix, in some cases within hours. Deploying the fix immediately is risky, so it may take days or weeks, but it's not like other fields where changes take years. Since mistakes, if they are serious, are made in public, the lesson can often be learned by all developers, not just the one who had the event. (Software also notoriously breaks by surprise when you try to fix it, because it's so complex, but the ability to improve is still far greater than anything else.)
Likely assertions about the future path
I believe the following assumptions about the development and deployment of robocars are reasonable, though things can be argued about all of them.
- At least in the early years, improvements in their safety level will come with time, but mostly they will come with miles on real world roads. The more miles, the more learned about what happens, the more problems fixed, the more data about what works is gathered.
- Some of this learning can and should be done in simulators and test tracks. However, this is not as effective, and we are far from making it approach the effectiveness of on-road testing, if we ever can.
- Deployment will largely happen on a growth curve starting the day that the teams are ready, and will proceed along an exponential growth curve, slowing down as markets saturate. For example, if it starts at 0.25% of saturation and doubles every year, it will reach saturation in about 9-10 years from when the effort starts. If it starts 1 year later, it (roughly) reaches saturation 1 year later.
- This deployment curve will also apply in other countries. Chances are if the most eager places (USA, China) are a year later, then the other places are also pushed back, though possibly not a full year.
- Safety levels will also increase with time, though that slows down as increasing safety gets harder and harder due to diminishing returns. It is assumed the vehicles can get to at least 5x better than the average human, though not much more than 10x better.
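The doubling arithmetic in the deployment assumption above can be sketched in a few lines, using the illustrative 0.25% starting share and annual doubling:

```python
# Years from initial deployment to market saturation, assuming the deployed
# share starts at 0.25% of saturation and doubles each year (illustrative).
share = 0.25   # percent of eventual saturation
years = 0
while share < 100:
    share = min(share * 2, 100)
    years += 1

print(years)  # 9 -- and starting a year later shifts the whole curve a year
```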
In particular, the following steps take place:
- Off-road testing, where no civilian is put at risk
- On-road testing with safety drivers, which generally appears very safe if done properly, but has failed when not done properly
- Early deployment with no safety drivers, at the chosen safety level (probably comparable to human drivers), with the safety level increasing to be far superior to human drivers
- Growth -- probably exponential like popular digital technologies, with improvement in safety tapering off as limits are reached
- Saturation -- when growth tapers off, at least until some new technology attracts additional markets.
Later, we'll examine ways in which these assumptions are flawed. But they are closer to correct than the counter assumptions, such as the idea that safety improvement will go just as fast without road deployment, or that saturation will occur on the same date even if deployment begins a year later.
Taking them as given for the moment, this leads to the strong conclusion. Almost any delay will lead to vastly more deaths. If "reckless" deployment will speed up development and/or deployment, this leads to vastly fewer deaths. Even ridiculously reckless deployment, if it speeds up development and deployment, saves immense numbers of total lives.
The reason is simple -- the deaths resulting from risky deployment or development happen in the first years, when the number of miles driven is very low. It doesn't matter if your car is 80 times worse than humans and has one fatality every million miles: you're "only" going to kill 100 people in your first 100M miles of testing. But your eventual safer car, the one that goes 500M miles without a fatality, will be the one driving a trillion miles as it approaches saturation. And in replacing a trillion miles of human driving, 12,500 people who would have died in human-caused crashes are replaced by 2,000 killed in robocar crashes. For every year that deployment got delayed.
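The comparison above can be worked through directly, using only the rates stated in that paragraph:

```python
TRILLION_MILES = 1_000_000_000_000

# Early testing: 80x worse than humans, one fatality per million miles,
# over 100M miles of testing.
test_deaths = 100_000_000 * (1 / 1_000_000)

# Mature fleet: one fatality per 500M miles, versus a human rate of
# roughly 12,500 deaths per trillion miles (1.25 per 100M miles).
human_deaths_per_trillion = 12_500
robocar_deaths_per_trillion = TRILLION_MILES / 500_000_000

saved_per_delayed_year = human_deaths_per_trillion - robocar_deaths_per_trillion

print(test_deaths)             # 100.0 deaths across all of testing
print(saved_per_delayed_year)  # 10500.0 net deaths per year of delay, at saturation
```

Even with a testing phase 80 times worse than human drivers, the one-time cost of testing is two orders of magnitude smaller than the annual cost of delay.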
The magnitude of the difference is worse. When I suggested 100 lives might be lost in 100M miles of testing, I gave very little credit to the developers. Waymo has driven 10M miles and caused only one minor accident. They are doing some runs with no safety drivers, indicating they think they've already reached the human level of safety. They did this on fairly well regulated roads, so there is much more to do, but it could well be that the great future benefits come at no cost in injuries or deaths, at least as they have done it.
Uber has not done so well. They have had a fatality, and in not too much distance. They were reckless. Yet even their reckless path would still result in a massive net savings of lives and injuries. Massive. Immense. The reality seems to be that nobody is likely to get to 10s or 100s of millions of miles with a poor quality vehicle unless they are supremely incompetent, far less competent than Uber.
The dangerous car has to learn
As I wrote above, computer technology is adaptive technology. Each mistake made once is, in theory, a mistake never made again. Every time a human has killed somebody on the road, it has almost never taught other humans not to make that mistake. If it has, it did so very slowly, and mostly through new laws or changes in road engineering. It might even be argued that the rate of improvement in a robocar is fairly strongly linked with the amount of testing. The more it drives, the more it is improved. The early miles uncover all the obvious mistakes, and soon, they start coming less often. 100,000 miles might get it to a fairly competent level, but a million more might be needed for the next notch, and 10 million for the one after that. But that also means that it drives these larger numbers of miles with less risk. As mileage grows, the system is improved and risk per mile reduces. Total risk per year does not grow as quickly as miles, not nearly as quickly.
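One way to see why total risk grows so much more slowly than miles: in a toy model (my own construction, not the author's) where the incident rate per mile falls in inverse proportion to cumulative miles driven, total expected incidents grow only logarithmically:

```python
import math

def expected_incidents(total_miles, k=10.0, m0=1e5):
    """Toy model: per-mile incident rate is k/m0 for the first m0 miles,
    then k/(cumulative miles). Integrating gives k * (1 + ln(miles/m0))."""
    if total_miles <= m0:
        return k * total_miles / m0
    return k * (1 + math.log(total_miles / m0))

for miles in (1e5, 1e6, 1e7, 1e8):
    print(f"{miles:>13,.0f} miles -> ~{expected_incidents(miles):.0f} incidents")
```

Under these assumed parameters, a thousandfold increase in miles (100,000 to 100 million) produces only about an eightfold increase in expected incidents.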
We are not pure utilitarians
If you simply look at the body count, there is a strong argument to follow the now maligned credo of "move fast and break things." But we don't look purely at the numbers, for both rational and emotional reasons.
The strongest factor is our moral codes, which treat harm caused by us very differently from harm caused by others. We feel much worse about harm we cause than about harm caused by others, or by nature, even though the pain is the same. We feel most strongly of all about harm we caused and could have prevented.
We see this in medicine, where the numbers and logic are surprisingly similar. If we released new drugs without testing, many would die from those that are harmful. But far, far, far more would be saved by the ones that actually work. Numerical studies have gone into detail and shown that while the FDA approval process saves thousands, it leaves millions to die. We allow, and even insist on this horrible tragedy because those who are not saved are not killed directly by us, they are killed by the diseases which ail them.
All the transportation technologies killed many in their early days, but modern culture has grown far less tolerant of risk, especially mortal risk. It makes rare exceptions: Vaccines may seriously harm 1 in 10,000 (it varies by vaccine) but are recommended even so because they save so many. The makers are even given some immunity from liability for this.
In addition, whatever the math says, we are unlikely to tolerate true "recklessness," particularly if it is pointless. If there is a less risky path that is just as fast, but more expensive, we want people to take it. Sending cars out without safety drivers is very risky, and the cost of safety drivers is affordable to the well-funded companies. (Indeed, the safety drivers, by gathering data on mistakes, probably speed up development.)
It might seem this logic would encourage any risk that promises to save lives. That we should sacrifice babies to demons if it would get us deployment and saturation a day sooner. But we don't think this way, which is why we're not pure utilitarians, and at least partly deontological (caring about the intrinsic right or wrong of our actions.) Different moral theories strike different balances between the two evils under debate -- namely putting the public at extra risk due to your actions, and leaving the public at higher risk due to your inaction.
Whatever non-utilitarian philosophy might be advocated, however, when the numbers are this large, it still has to answer the utilitarian challenge: Is the course advocated worthy of the huge numbers? We might well believe it is better to prevent 100 immediate deaths to save 40,000 future deaths -- but we should understand numerically that this is what we are doing.
There is of course tremendous irony in the fact that some of these questions about harm through action and inaction are at the core of the "trolley problem," a bogus version of which has become a pernicious distraction in the robocar world. Indeed, since those who discuss this idiotic problem usually end up declaring that development should slow until it is answered, they find themselves choosing between avoiding the distraction or causing delay in development -- and the loss of many lives. The philosophy class trolley problem is present in these high level moral debates, not in the operation of a vehicle.
Public reaction has a big effect on the timeline
One of the largest factors altering this equation is that early mistakes don't just cause crashes. Those crashes, especially fatalities like Uber's, will cause public backlash which has a high chance of slowing down both development and deployment. Uber had to shut down all operations after their incident, and some wondered if they would ever come back. Other companies besides Uber felt the problem as well. There will be backlash from the public, and possibly from regulators.
The public can be harsh with its backlash. When Toyota had reports of accelerator pedals sticking, it caused them much pain; while the question is still argued, and the Toyota software was pretty poor, the final report generally concluded that there was not actually a problem. Toyota still took a hit.
On the other hand, even with multiple people killed while driving with Tesla Autopilot (without paying proper attention) and several investigations of those accidents, there seems to be no slowdown in the sales of Teslas or Autopilot or the price of Tesla stock. This factor is very difficult to predict. This is in spite of the fact that many people mistakenly think the Tesla is a kind of robocar.
This effect can be wildly variable, depending on the nature of the incidents. Broadly, the worst case would be fatalities for a vulnerable road user (like a pedestrian) particularly a child. Many other factors will play in, including the apparent preventability and how identifiable the human story behind the tragedy is. Uber was "lucky," and Elaine Herzberg very unlucky, that their fatality involved a homeless woman crossing at a do-not-cross-here sign.
The public is disturbingly inured to car crashes and car fatalities, at least when caused by humans. They barely make the news. Even the 5,000 annual fatalities for vulnerable road users don't get a lot of attention.
Robocars are risk-improved, but have a counter flaw: They put non-participating members of the public at risk. Aircraft rarely kill people on the ground. Drugs don't kill people who don't take them. Cars (robotic and human driven) can kill bystanders, both in other cars, and vulnerable road users. This does not alter that overwhelming utilitarian math, but it does alter public perception. We are more tolerant of casualties who knowingly participated, and much, much less of 3rd party casualties.
In addition to public reaction, the justice system has its own rules. It does not reward risk-taking. In the event of an accident, the people saved in the future are not before the jury. High risk taking can even result in punitive damages in rare cases, and certainly in higher product liability negligence claims.
Thinking beyond individual casualties to analysis of risk
Each car accident is an unintended event, each death a tragedy. The courts and insurance claims groups handle them in accord with law and policy. But policy isn't about individual events, but rather risk. If you, or your robot, hurt somebody, the courts act to make them whole as much as possible, at your expense. This will always be the case. If you show a pattern of high risk (such as a DUI) you may be banned from the road.
Public policy and our philosophy, however, mostly deal with risk rather than individual tragic events. Every driver who goes on the road puts him/herself at risk, and puts others at risk. For most people, this is the most risk they place others in. When a person or company deploys a robocar on the road they are also putting others at risk (and the passenger.) It's a similar situation. A mature robocar is putting those people at far less risk than the driver is, though it's never zero. A prototype robocar with good safety drivers appears to be also putting the public at less risk than an ordinary driver. A prototype robocar with a poor safety driver is putting the public at more risk.
Our instinct is to view the individual accident and death as the immoral and harmful act. We may be better served to ask if the act we judge the morality of is the degree to which we place others at risk. If you look at the original question, "Is it moral to kill a few additional people today to save hundreds of thousands in the future" you may say no. If the question is "Is it moral to very slightly increase the risk on the roads for a short time to massively decrease it in the future?" the answer may be different.
Aggressive Development with Reasonable Prudence
All this leaves us with a dilemma. If we were purely utilitarian, we would follow a rather cold philosophy: "Any strategy which hastens development and deployment is a huge win, as long as bad PR from incidents does not slow progress more." And yes, it's coldly about the PR of the deaths, not the deaths themselves, because the pure utilitarian only cares about the final number. The averted deaths from early deployment are just as tragic for those involved, after all, but not as public.
We must also consider that there is uncertainty in our assumptions. For example, if the problem is truly intractable, and we never make a robocar safer than human drivers, then any incidents in its development were all negative; the positive payback never came. The risk of this should be factored in to even the utilitarian analysis.
All of this points to a strategy of "aggressive development with reasonable prudence." To unpack this, it means:
- Risk should be understood, but taken where it is likely it will lead to faster development. Being overly conservative is highly likely to have huge negative results later
- If money can allow reduced risk while maintaining speed of development, that is prudent. However, restricting development only to large super-rich companies likely slows development overall.
- Risks to external parties, especially vulnerable road users, should get special attention. However, it must be considered that many of those saved later will be such parties.
- In combination of those factors, what can be done on test tracks and in simulation should be done there -- but what can't be done as quickly in those environments should be done on the public roads
- Needless risk is to be discouraged, but worthwhile risk to be encouraged
The next level of reasonable prudence applies to deployment. While (outside of autopilots) nobody is considering commercial deployment of a vehicle less safe than human drivers, this math says it would not be as outrageous a strategy as it sounds. Just as delay in development pushes back the date of deployment, delaying deployment pushes back saturation. For it is at saturation that the real savings of lives comes, when robocars are doing enough miles to put a serious dent in the death toll of human driving, by replacing a large fraction of it.
It should be noted that the early teams, such as Waymo, probably were not aggressive enough, while Uber was not prudent enough. Waymo appears to have reached 10 million miles and a level of safety close to human levels, at least on simple roads, and is close to deployment. They did it with only a single minor at-fault accident. They did this by having a lot of money and being able to craft and follow a highly prudent safety driver strategy.
Uber, on the other hand, has not reached those levels. They tried to save money on safety drivers and it came at great cost. While no team is immune from wanting to save money, cutting corners where safety is at stake is exactly the sort of needless risk that reasonable prudence rules out.
Many will challenge this reasoning, and so they will challenge the numbers which suggest it. I do believe there are many open questions about the assumptions which can be examined, but the numbers are so immense that this is not a priority unless the challenge suggests an assumption is wrong, not by just a little, but by a few orders of magnitude.
That this is doable at all
There is a core assumption that making a robocar that can drive much more safely than the average human is possible. If it's not possible, or very distant in the future, then these risks are indeed unjustified. However, there is a broad consensus of belief that it is possible.
We need road testing
Some question the need for the risk of on-road testing. With more work, we should find ways to test more things on test tracks and in simulators. This is probably true, but few believe it's completely true. In any event, none would doubt that doing this takes time, and thus delays development. So even if you can find a risk-free way to test and develop a decade from now, the math has caught up with you.
Delayed deployment is delayed saturation
It's possible, but seems unlikely, that if deployment is delayed, either due to slowing down development or waiting for a higher level of safety, growth would happen even faster after the delay, and penetration would "catch up" to where it would have been.
Of course, we've never gotten to do that experiment in history. We can't test, "What if smartphones had been launched 4 years later, would it have taken 4 more years for them to be everywhere?" We do know that the speed of penetration of technologies is going up and up. The TV took many decades to get in every home, while smartphones did this much faster, and software apps can do it in months. Car deployment is a very capital and work intensive thing, so we can't treat it like a software app, but the pace will increase. But not by an order of magnitude.
Penetration speed depends on many things -- market acceptance and regulations for starters. But they follow their own curve. The law tends to lag behind technology, not the other way around, so there is a strong argument that the tech has to get out there for the necessary social and regulatory steps to happen. They won't happen many times faster if the deployment is delayed.
Human drivers will get digital aid and perform better
Perhaps the largest modifier to the numbers is that human drivers will also be getting better as time goes on, because their cars will become equipped with robocar-related technologies from traditional ADAS and beyond. So far ADAS is helping, but less than we might expect, since fatalities have been on the rise for the last 3 years. We can expect it to get better.
It is even possible that ADAS could produce autopilot systems which allow human driving but are almost crash-proof. This could seriously reduce the death toll for human driving. Most people are doubtful of this, because they fear that any really good autopilot engenders human complacency and devolves to producing a poor robocar rather than a great autopilot. But it's possible we could learn better how to do this, and learn it fast, so that the robocars are producing a much more modest safety improvement over the humans when they get to saturation.
It's unclear if society has faced a choice like this before. The closest analogs probably come from medicine, but in medicine, almost all risk is for the patient who wants to try the medicine. The numbers are absolutely staggering, because car accidents are one of the world's biggest killers.
This means that while normally it is risk which must be justified, here it is also caution that must be justified.
Teams, and governments, should follow the policy of aggressive development and reasonable prudence, examining risks, and embracing the risks that make sense rather than shying away from them. Government policy should also help teams do this, by clarifying the liability of various risk choices to make sure the teams are not too risk-averse.
That's quite a radical suggestion -- for governments to actually encourage risk-taking. Outside of wartime, they don't tend to think that way. Every failure because of that risk will still be a very wrong thing, and we are loath to not have done everything we could to prevent it, and especially to do anything that might encourage it. We certainly still want an idea of which risks are reckless and which are justified. There is no zero risk option available. But we might decide to move the line.
In spite of this logic, it would be hard to advise a team to be aggressive with risk today. It might be good for society, and even be good for the company in that they can be a leader in a lucrative market, but the public reaction and legal dangers can go against this.
We might also try to balance the risks as guided by our morals. Liability might be adjusted to be higher for harm to bystanders (especially vulnerable road users) and lower for willing participants.
We should not be afraid of low, vague risks. A vehicle that is "probably safe" but has not been proven so in a rigorous way should still be deployable, with liability taken by those who deploy it. There should not be a conservative formal definition of what safe is until we understand the problem much more. If some risk is to be forbidden, by law or by liability rules, there should be a solid analysis of why it should be forbidden, what it will save to forbid it, and what it will cost. We must shy away from worrying about what "might happen" when we are very unsure, because we are pretty sure about what will happen on the roads with human drivers if deployment is delayed, and that's lots and lots of death.