Google releases detailed intervention rates -- and the real unsolved problem of robocars

Hot on the heels of my CES Report comes the latest article from Chris Urmson on The View from the Front Seat of the Google Car. Chris heads engineering on the project (and until recently led the entire project).

Chris reports two interesting statistics. The first is "simulated contacts" -- times when a safety driver intervened, and the vehicle would have hit something without the intervention:

There were 13 [Simulated Contact] incidents in the DMV reporting period (though 2 involved traffic cones and 3 were caused by another driver’s reckless behavior). What we find encouraging is that 8 of these incidents took place in ~53,000 miles in ~3 months of 2014, but only 5 of them took place in ~370,000 miles in 11 months of 2015. (There were 69 safety disengages, of which 13 were determined to be likely to cause a "contact.")

The second is detected system anomalies:

There were 272 instances in which the software detected an anomaly somewhere in the system that could have had possible safety implications; in these cases it immediately handed control of the vehicle to our test driver. We’ve recently been driving ~5300 autonomous miles between these events, which is a nearly 7-fold improvement since the start of the reporting period, when we logged only ~785 autonomous miles between them. We’re pleased.

Let's look at these two numbers, why they differ, and how they compare to humans.

The "simulated contacts" are events which would have been accidents in an unsupervised or unmanned vehicle, which is serious. Google is now having one about every 74,000 miles (5 in ~370,000 miles in 2015), though Urmson suggests this rate may not keep going down as they test the vehicle in new and more challenging environments. It's also noted that a few were not the fault of the system. Indeed, the rate for the full set of 69 safety disengagements is actually going up, with 29 of them in the last 5 months reported.
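As a quick check on the arithmetic, here is a minimal sketch that turns the figures quoted from Urmson's report into miles per simulated contact. The variable and function names are my own, and the mileages are the approximate ones given in the quote.

```python
# Approximate figures as quoted from the report above.
periods = {
    "2014 (~3 months)": {"miles": 53_000, "contacts": 8},
    "2015 (11 months)": {"miles": 370_000, "contacts": 5},
}


def miles_per_event(miles, events):
    """Average distance driven between events."""
    return miles / events


rates = {
    name: miles_per_event(p["miles"], p["contacts"])
    for name, p in periods.items()
}

for name, rate in rates.items():
    print(f"{name}: one simulated contact per {rate:,.0f} miles")

# How much the rate improved between the two periods.
improvement = rates["2015 (11 months)"] / rates["2014 (~3 months)"]
print(f"Improvement: about {improvement:.1f}x")
```

The 2015 figure works out to 74,000 miles per simulated contact, against roughly 6,600 in the 2014 portion of the reporting period.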

How does that number compare with humans? Well, regular people in the USA have about 6 million accidents per year reported to the police, which means about once every 500,000 miles. But for some time, insurance companies have said the number is twice that, or once every 250,000 miles. Google's own new research suggests even more accidents are taking place that go entirely unreported by anybody. For example, how often have you struck a curb, or even had a minor touch in a parking lot that nobody else knew about? Many people would admit to that, and altogether there are suggestions the human number for a "contact" could be as bad as one per 100,000 miles.

That would put the Google cars close to that level, though this comes from driving in simple environments, with no snow and easy California driving situations. In other words, there is still some distance to go, but at least one possible goal seems in striking distance. Google even reports going 230,000 miles from April to November of last year without a simulated contact, a (cherry-picked) stretch that nonetheless matches human levels.
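To make the comparison concrete, here is a rough sketch that sets Google's 2015 figure (5 simulated contacts in ~370,000 miles) against the three human estimates mentioned above. The labels are my own shorthand, and the total annual mileage is not a quoted statistic: it is simply what the 6 million accidents and the one-per-500,000-miles figure imply together.

```python
# Human estimates from the text, in miles per accident or "contact".
human_miles_per_accident = {
    "police-reported": 500_000,
    "insurance industry": 250_000,
    "including unreported contacts": 100_000,
}

# Google's 2015 simulated-contact figure from the report.
google_miles_per_contact = 370_000 / 5

# Total US miles per year implied by the police-reported numbers
# (an inference from the two figures, not a quoted statistic).
implied_annual_miles = 6_000_000 * human_miles_per_accident["police-reported"]
print(f"Implied US driving: {implied_annual_miles:.1e} miles per year")

# Fraction of each human benchmark that Google's figure reaches.
fraction_of_human = {
    label: google_miles_per_contact / miles
    for label, miles in human_miles_per_accident.items()
}
for label, fraction in fraction_of_human.items():
    print(f"{label}: Google is at {fraction:.0%} of the human figure")
```

Against the most pessimistic human estimate, the cars are at roughly three quarters of the human level; against the police-reported figure they are still far short.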

For the past while, when people have asked me, "What is the biggest obstacle to robocar deployment, is it technology or regulation?" I have given an unexpected answer -- that it's testing. I've said we have to figure out just how to test these vehicles so we can know when a safety goal has been met. We also have to figure out what the safety goal is.

Various suggestions have come out for the goal: Having a safety record to match humans. Matching good humans. Getting twice or even 10 times or even 100 times as good as humans. Those higher, stretch goals will become good targets one day, but for now the first question is how to get to the level of humans.

One problem is that the way humans have accidents is quite different from how robots probably will. Human accidents sometimes have a single cause (such as falling asleep at the wheel) but many arise because 2 or more things went wrong. Almost everybody I talk to will admit there has been a time when they were looking away from the road to adjust the radio or even play with their phone, and they looked up to see traffic slowing ahead of them, and quickly hit the brakes just in time, narrowly avoiding an accident. Accidents often happen when luck like this runs out. Robotic accidents will probably mostly come from one single flaw or error. Robots doing anything unsafe, even for a moment, will be cause for alarm and the source of the error will be fixed as quickly as possible.

Safety anomalies

This leads us to look at the other number -- the safety anomalies. At first, this sounds more frightening. They range from 39 hardware issues and anomalies to 80 "software discrepancies" which may include rarer full-on "blue screen" style crashes (if the cars ran Windows, which they don't). People often wonder how we can trust robocars when they know computers can be so unreliable. (The most common detected fault is a perception discrepancy, with 119. It is not said, but I will presume these will include strange sensor data or serious disagreement between different sensors.)

It's important to note the hidden message. These "safety anomaly" interventions did not generally cause simulated contacts. For human beings, zoning out, taking your eyes off the road, texting or in many cases even briefly falling asleep does not always result in a crash, and nor will similar events for robocars. In the event of a detected anomaly, one presumes that independent (less capable) backup systems will immediately take over. Because they are less capable, they might cause an error, but that should be quite rare.

As such, the 5300 miles between anomalies, while clearly in need of improvement, may also not be a bad number. Certainly many humans have such an "anomaly" that often (that's about every 6 months of human driving). It depends how often such anomalies might lead to a crash, and what severity of crash it would be.
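Two small calculations sit behind these anomaly numbers, sketched below. The improvement factor uses only the figures in Urmson's quote. The conversion to months needs an annual mileage for a typical driver, which the report does not give; the 12,000 miles per year here is purely my assumption, and a lower figure would bring the result closer to 6 months.

```python
# Miles between detected anomalies, from the quoted report.
miles_between_start = 785     # start of the reporting period
miles_between_recent = 5_300  # recent driving

improvement = miles_between_recent / miles_between_start
print(f"Improvement: {improvement:.2f}x")

# ASSUMPTION: a typical human drives about 12,000 miles per year.
# This number is illustrative and does not come from the report.
assumed_human_miles_per_year = 12_000

months_of_driving = miles_between_recent / assumed_human_miles_per_year * 12
print(f"5,300 miles is about {months_of_driving:.1f} months of driving")
```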

The report does not describe something more frightening -- a problem with the system that it does not detect. This is the sort of issue that could lead to a dangerous "careen into oncoming traffic" style event in the worst case scenario. The "unexpected motion" anomalies may be of this class. (Since such an event would be a contact incident, we can conclude it's very rare, if it happens at all, in the modern car.) (While I worked on Google's car a few years ago, I have no inside data on the performance of the current generations of cars.)

I have particular concern with the new wave of projects hoping to drive with trained machine learning and neural networks. Unlike Google's car and most others, the programmers of those vehicles have only a limited idea how the neural networks are operating. It's harder to tell if they're having an "anomaly," though the usual things like hardware errors, processor faults and memory overflows are of course just as visible.

The other vendors

Other vendors weren't all as detailed in their reports as Google. Bosch reports 2.5 disengagements per mile in the last 2 months of their report, but doesn't say what kind. Mercedes reports 0.75 disengagements per mile, about half of them automatic and about half manual.

Google didn't publish total disengagements, judging most of them to be inconsequential. Safety drivers are regularly disengaging for lots of reasons:

  • Taking a break, swapping drivers or returning to base
  • Moving to a road the car doesn't handle or isn't being tested on
  • Any suspicion of a risky situation

The last is the most interesting. Drivers are told to take the wheel if anything dangerous is happening on the road, not just with the vehicle. This is the right approach -- you don't want to use the public as test subjects, you don't want to say, "let's leave the car auto-driving and see what it does with that crazy driver trying to hassle the car or that group of schoolchildren jaywalking." Instead the approach is to play out the scenario in a simulator and see if the car did the right thing.

Delphi reports 405 disengagements in 16,600 miles -- but their breakdown suggests only a few were system problems. Delphi is testing on highway where disengagement rates are expected to be much lower.

Nissan reports 106 disengagements in 1485 miles, most in their early stages. For Oct-Nov their rate was 36 for 866 miles. They seem to be reporting the more serious ones, like Google.

Tesla reports zero disengagements, presumably because they would define what their vehicle does as not a truly autonomous mode.

VW's report is a bit harder to read, but it suggests 5500 total miles and 85 disengagements.

Google's lead continues to be overwhelming. That shows up very clearly in the nice charts that the Washington Post made from these numbers.
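For the vendors that gave both a count and a mileage, the figures above can be put on a common footing as miles per disengagement. This is only a rough sketch using the numbers quoted in this post, with names of my own choosing. Each vendor decides for itself what counts as a disengagement, so the results are not an apples-to-apples comparison.

```python
# (disengagements, miles) as quoted in this post.
# VW's figures are approximate, since its report is harder to read.
reports = {
    "Delphi": (405, 16_600),
    "Nissan (full period)": (106, 1_485),
    "Nissan (Oct-Nov)": (36, 866),
    "VW": (85, 5_500),
}

miles_per_disengagement = {
    name: miles / count for name, (count, miles) in reports.items()
}

# Print from the longest to the shortest distance between disengagements.
ranked = sorted(
    miles_per_disengagement.items(), key=lambda item: item[1], reverse=True
)
for name, distance in ranked:
    print(f"{name:22s} {distance:6.1f} miles per disengagement")
```

All of these come out at tens of miles per disengagement, which is a different scale from the thousands of miles between Google's reported events, though Google is reporting only the more serious ones.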

How safe do we have to be?

If the number is the 100,000 mile or 250,000 mile number we estimate for humans, that's still pretty hard to test. You can't just take every new software build and drive it for a million miles (about 25,000 hours) to see if it has fewer than 4 or even 10 accidents. You can and will test the car over billions of miles in a simulator, encountering every strange situation ever seen or imagined. Before the car has a first accident it will be unlike a human. It will probably perform flawlessly. If it doesn't, that will be immediate cause for alarm back at HQ, and correction of the problem.
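The "4 or even 10 accidents" and "25,000 hours" figures come from simple division, sketched here. The 40 mph average speed is my assumption; it is the speed that makes a million miles take 25,000 hours, and it is not stated anywhere in the reports.

```python
test_miles = 1_000_000

# Human benchmarks discussed above, in miles per accident.
human_benchmarks = [250_000, 100_000]

# Accidents a human-level driver would be expected to have
# over the test distance, for each benchmark.
expected_accidents = {
    miles_per_accident: test_miles / miles_per_accident
    for miles_per_accident in human_benchmarks
}
for miles_per_accident, count in expected_accidents.items():
    print(f"1 per {miles_per_accident:,} miles -> {count:.0f} expected accidents")

# ASSUMPTION: 40 mph average speed, chosen to match the 25,000-hour figure.
assumed_average_mph = 40
driving_hours = test_miles / assumed_average_mph
print(f"Driving time: {driving_hours:,.0f} hours")
```

At that pace a single car driving around the clock would need almost three years to cover the distance, for every software build.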

Makers of robocars will need to convince themselves, their lawyers and safety officers, their boards, the public and eventually even the government that they have met some reasonable safety goal.

Over time we will hopefully see even more detailed numbers on this. That is how we'll answer this question.

This does turn out to be one advantage of the supervised autopilots, such as the one Tesla has released. Because it can count on all the Tesla owners to be the fail-safe (or if you prefer, guinea pig) for its autopilot system, Tesla is able to quickly gather a lot of data about the safety record of its system over a lot of miles, far more than can be gathered if you have to run the testing operation with paid drivers or even your own unmanned cars. This ability to test could help the supervised autopilots get to good confidence numbers faster than expected. Indeed, though I have often written that I don't feel there is a good evolutionary path from supervised robocars to unmanned ones, this approach could prove my prediction wrong. For if Tesla or some other car maker with lots of cars on the road is able to make an autopilot, and then observe that it never fails in several million miles, then they might have a legitimate claim to having something safe enough to run unmanned, at least on the classes of roads and situations on which the customers tested it. Still, a car that does 10 million perfect highway miles is not ready to bring itself to you door to door on urban streets, as Elon Musk claimed yesterday would soon happen with the Tesla.
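How much can a failure-free stretch actually prove? One common way to reason about it, which is my own addition and not something any of the vendors describe, is to treat failures as independent rare events (a Poisson process). Under that assumption, if no failure is seen in N miles, you can be 95% confident the true rate is below roughly 3/N, the so-called "rule of three." The function name below is hypothetical.

```python
import math


def min_miles_between_failures(failure_free_miles, confidence=0.95):
    """Lower bound on average miles between failures after a failure-free run.

    ASSUMPTION: failures are independent and occur at a constant rate
    (Poisson). Real driving failures may cluster by road, weather or
    software build, so treat the result as a rough guide only.
    """
    # With rate r per mile, P(no failures in N miles) = exp(-r * N).
    # Solving exp(-r * N) = 1 - confidence gives the largest plausible r.
    max_rate_per_mile = -math.log(1 - confidence) / failure_free_miles
    return 1 / max_rate_per_mile


for miles in (3_000_000, 10_000_000):
    bound = min_miles_between_failures(miles)
    print(f"{miles:,} clean miles -> at least {bound:,.0f} miles per failure")
```

By this reckoning, a few million clean miles would support a claim of better than one failure per million miles, but only on the kinds of roads where those miles were driven.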

Comments

Will the companies be able to share their test data, and all the information each of them collects from the robocars or autonomous systems already on the streets? If software or hardware from company A proves to be better and safer than the others, will it be shared amicably? Remember the fight between VHS and Sony's Beta video system. Customers did not have the option to test and decide. Choosing Beta (and soon after, the small Video8) meant getting very good home movies but almost no commercial films to watch. Must we wait until robocars from company B are involved in many accidents before companies are forced to share information? Without good communication between robocars, safety will always be at risk. Without the same map and the same information about what is happening around several robocars on the road, communication will be a Tower of Babel.

To the last points that Brad made, having supervised autopilots in the first generation can help us get to fully autonomous cars more quickly for a couple more reasons:
1. You could have the supervised auto-pilot in "train mode"
If the auto-pilot is neural network based, you can use this real-world data of ostensibly "good drivers" to help improve the existing Neural Networks.
2. You can have the supervised auto-pilot in "compare mode" (at the same time)
You will also be able to compare how the current algorithm fares next to a real-world driver. Would it have made the same decisions as the human driver? Were the choices close or vastly far off? If significantly different, then why?

Given the number and variety of sensors available to the autopilot, one would expect these cars to become significantly better than humans over time.

Right now, though, while highway autopilot can be pulled off with a camera and some radars, and a full highway self-driving car might also be possible with that some day, the urban street problem is harder, and is unlikely to be done with just those sensors. LIDARs and perhaps other sensors are the best solution there, for now and for a decent amount of time. Why go with human level abilities when you can be superhuman?

So the problem is that cars sold to ordinary folks won't have the sensor suite you want for your first (and worst) urban robocar. So it's hard to train and test this way, compared to the highway.

I wrote an article about how people aren't likely to trust robot cars as long as they feel they can do better:

http://www.danlevy.net/2015/12/18/when-ai-fails-and-the-crashing-robot-cars/
