Safety Drivers for Robocars -- the issues and rationale
The wake of Uber's fatality has raised many more questions about the concept of testing prototype robocars on public roads supervised by "safety drivers." Is it putting the public at risk for corporate benefit? Are you a guinea pig in somebody's experiment against your will? Is it safe enough? Is there another way?
The simple first answer is that yes, it is putting the public at risk. Nobody expects the cars to be perfect, and nobody expects the safety driver system to be perfect.
The higher-level question is "how much risk?" and whether it is the sort of risk we can or should tolerate.
For contrast, consider the question of teenage novice drivers, who are also allowed out on the road, first with an adult supervisor (who is often a driving instructor but is not required to be), and then, after a ridiculously simple test, on their own. More recently, they have been restricted in what they can do on their own until they become adults.
We usually start the teen out in a parking lot or private road to get the basics, but very quickly that becomes not very useful, and they must go on the real road.
The driving instructor is very much like the safety driver. Many student driver cars have a second brake pedal for the driving instructor to use. I remember the first time a car passed me (with what seemed like just inches to spare) and I swerved away, and the driving instructor used the brake on me.
We allow an unskilled, reckless teen on the road for no other reason than to help that teen build skills to become a better driver. Statistics show teens are reckless for several years to come. We allow them on the road so they can become better, more mature drivers. Each individual's training helps that individual, not society. The benefit to society is that we don't have another system to turn us into more mature and safer drivers. (Sort of. In some countries, a lot more training is demanded of teens before they hit the road.)
Another analogy is flying -- airliners are flown on automatic most of the time, including on landing, with the pilots overseeing and ready to take over at any time. It works well, though other than at landing, the task is simpler and the pilots have plenty of time to fix things when they take over.
The same approach has been taken with robots. They also start out in the "parking lot" or test tracks, but quickly the limitations of this become clear.
Unlike the teen, developing the robocar doesn't just develop the particular car being tested. Everything learned, every improvement, goes into all the cars in the fleet, forever. It's as if sending one teen to driving school taught their whole cohort. That's a nice win.
On the other hand, we have a lot more money and time to develop the cars. The budget is in the billions, not thousands. So if there are safer alternatives, we can afford them. In addition, humans learn fast, and start out much smarter about certain things, like perception and decision making, than robots do. Robots start out being much more diligent and predictable, and have superhuman sensing ability in some cases, and sub-human in others.
Safety driving is a bit harder than driving instructing, because it takes so long. The better the robocars get, the harder it is to pay constant attention, the easier it is to get lulled into complacency. You can work to improve the diligence of safety drivers, and should, but it's also important to measure it. You can calculate, "what are the odds that a safety driver will miss a safety incident and not take over in time?" It might be one incident in 100, or 1,000 or 100,000. You can test people, on the road or in driving simulators, to learn general capabilities of humans, or of particular classes of humans.
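One way to put a number on "what are the odds a safety driver misses an incident?" is to stage test incidents (on a track or in a driving simulator) and bound the miss rate from the results. The sketch below is only an illustration: it assumes each staged incident is an independent trial, and it uses the standard Wilson score interval, which is my choice and not a method named by any of the projects discussed. The function name and the sample counts are hypothetical.

```python
import math


def miss_rate_upper_bound(misses: int, trials: int, z: float = 1.96) -> float:
    """Upper end of the Wilson score interval for the true miss probability.

    misses: staged incidents the safety driver failed to catch in time
    trials: total staged incidents (assumed independent)
    z:      normal quantile; 1.96 gives a two-sided 95% interval
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= misses <= trials:
        raise ValueError("misses must be between 0 and trials")
    p_hat = misses / trials
    denom = 1 + z * z / trials
    centre = p_hat + z * z / (2 * trials)
    spread = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials)
    )
    return (centre + spread) / denom


if __name__ == "__main__":
    # Hypothetical test campaign: 1,000 staged incidents, none missed.
    bound = miss_rate_upper_bound(0, 1000)
    print(f"miss rate is plausibly below 1 in {1 / bound:.0f}")
```

The point of the exercise is that a clean record in a modest number of tests supports only a modest claim. Showing a miss rate of 1 in 100,000 takes far more testing than showing 1 in 100.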
It's also been popular, since pioneered by Google and derived from techniques of the DARPA grand challenge teams, to have two humans in a prototype robocar. One is behind the wheel and constantly watching the road. The other tends to spend most of their time monitoring the software to make sure all is going well, but also looks at the road fairly often. The second person, sometimes called the software operator, can also "spell" the main safety driver in relatively safe situations. If the main driver wants to look away for a couple of seconds or adjust themselves in their seat, it's not that dangerous to do if the software operator is watching the road, and can shout about anything urgent. That may seem unsafe, but it's actually wise to give people short breaks from any monotonous task, even breaks of seconds. Solo drivers do it all the time.
Having two people also makes the work more social, and less boring. It's unlikely the main driver will completely zone out next to their colleague. So you can calculate the performance of the team -- how often will they miss an incident -- and it should be better than that of a solo person.
You are also constantly measuring the performance of the car. How often does it need true help from the safety drivers? It might be once every 13 miles, as reportedly was the number at Uber, or every 80,000 miles as Google/Waymo once reported.
Fortunately, the errors of the car and the errors of the safety driver should be reasonably independent events. In some cases, they would be somewhat negatively correlated. For example, the car may be more likely to have a problem in complex situations, but the safety drivers might be more diligent in complex situations and thus be less likely to miss something. However, there is also the bad factor -- as the car gets better, the safety drivers get a bit more complacent and thus their performance drops.
All of this means you can make predictions about the combined system of robot and safety drivers. And if you can get the whole system down to having numbers like a regular human driver, you've made it so that deploying your test car is at the same level of danger as sending an ordinary human out driving in an ordinary car. In other words, the project can reach the point where it puts the public at risk at the same level as pizza delivery does. Pizza delivery does put people at risk, and we're willing to accept it with the only benefit to society being tasty pie at home.
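The arithmetic behind such a prediction is simple if you take the independence idea at face value. If the car needs true help once every M miles, and the safety driver (or team) misses one such event in every N, then an unhandled event happens about once every M times N miles. The sketch below is a rough illustration under that independence assumption. The function name is hypothetical, and pairing the reported intervention figures with particular miss rates is purely for illustration, not a claim about any real project.

```python
def miles_per_unhandled_event(miles_per_intervention: float,
                              driver_misses_one_in: float) -> float:
    """Expected miles between events where the car errs AND the human misses it.

    Assumes car errors and safety-driver misses are independent, so the
    combined rate is the product of the two rates.
    """
    if miles_per_intervention <= 0:
        raise ValueError("miles_per_intervention must be positive")
    if driver_misses_one_in < 1:
        raise ValueError("driver_misses_one_in must be at least 1")
    return miles_per_intervention * driver_misses_one_in


if __name__ == "__main__":
    # Illustrative pairings only.
    for miles, one_in in [(13, 1_000), (13, 100_000), (80_000, 100)]:
        combined = miles_per_unhandled_event(miles, one_in)
        print(f"help every {miles:,} mi, driver misses 1 in {one_in:,}: "
              f"one unhandled event per {combined:,.0f} mi")
```

Any correlation between the two kinds of failure shifts the result. Complacency as the car improves makes the real figure worse than this product, and extra diligence in complex situations makes it better.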
That human level seems to be around this (read each figure as roughly one event per that many miles of driving):
- 1 in 100,000: Small ding
- 1 in 250,000: Insurance ding
- 1 in 500,000: Accident reported to police
- 1 in 2,000,000: Injury accident
- 1 in 80,000,000: Fatality
- 1 in 180,000,000: Fatality on highway
- 1 in 600,000,000: Pedestrian fatality
I will not claim that it is simple to measure the safety performance of the combined system, but once you do, if you can get it decently above these levels, I don't think we should feel these projects are inherently putting the public at risk. It should also be remembered that these projects are not just out to deliver pizza pies. They are trying to change the world of transportation and save huge numbers of lives once they succeed. Of course, if they get the numbers seriously wrong, then there is a good case that they are indeed creating unacceptable risk.
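As a small illustration of what "decently above these levels" could mean in practice, the sketch below checks a measured estimate for the combined car-plus-driver system against the human figures listed above. It assumes those figures are miles per event, and the function name, the `margin` parameter and the sample measurements are all hypothetical.

```python
# Human benchmark from the list above, read as miles driven per event.
HUMAN_MILES_PER_EVENT = {
    "small ding": 100_000,
    "insurance ding": 250_000,
    "police-reported accident": 500_000,
    "injury accident": 2_000_000,
    "fatality": 80_000_000,
    "highway fatality": 180_000_000,
    "pedestrian fatality": 600_000_000,
}


def compare_to_human(system_miles_per_event: dict, margin: float = 1.0) -> dict:
    """Return, per event type, whether the system beats the human level.

    margin > 1 demands that the system be better than human by that factor,
    e.g. margin=1.2 requires 20% more miles between events.
    """
    verdict = {}
    for event, measured_miles in system_miles_per_event.items():
        required = margin * HUMAN_MILES_PER_EVENT[event]
        verdict[event] = measured_miles >= required
    return verdict


if __name__ == "__main__":
    # Made-up measurements for a combined robot + safety driver system.
    measured = {"small ding": 150_000, "injury accident": 1_000_000}
    print(compare_to_human(measured, margin=1.2))
```

The hard part, as noted, is getting trustworthy measurements in the first place, especially for the rare events near the bottom of the list.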
Consider another product that we judge as not putting the public at risk, namely cruise control, in particular adaptive cruise control. If you use cruise control, as far as the pedals are concerned, you are just a supervisor. However, we all know that you need to regularly adjust the wheel and sometimes have to hit the brakes. With regular cruise control you are adjusting it every few miles, depending on how busy traffic is. Because it is so frequent, you stay alert, and it's rare to hear of somebody missing a cruise control event and not hitting the brakes when traffic slows up ahead. Tesla autopilot adds lanekeeping, and this has caused some people to ignore the road, but Tesla claims the number is small and the overall performance of autopilot+human is still better than the numbers above.
Improving safety drivers
None of this suggests that projects should not do everything reasonable to improve the performance of their safety drivers. Having 2 instead of 1 is just one such technique. Early projects did not monitor the gaze of safety drivers, but as that technology has become more readily available, I believe it will become common. We will also see more research on the performance of safety drivers and what affects it.
Assisting safety drivers
It is also possible for automated systems to assist safety drivers. Usually the driving software is constantly monitoring itself and the car, and can issue audible alerts if something unusual is detected so that the safety driver either takes over immediately or is more diligent.
It is also possible to install completely independent collision detection systems and have them make audio cues when they see something. Of course, inherently these add-on systems will be inferior to the robocar system -- otherwise it's a pretty poor robocar system and why are they using it? -- so there is the risk of having too many false alarms. There is also, oddly, additional risk of complacency -- I can look away because the system will beep if there's somebody in front of me.
With speculation that Uber's fatality was the result of their system classifying the pedestrian as a false-positive (which is to say a sensor ghost that you don't want to brake for) it could be reasonable to have systems make an audible signal whenever they are deciding to ignore what they think is a sensor ghost.
More off road testing before getting on the road with safety drivers
Some will argue that developers should do more off road testing and really get their numbers up before getting on the road. That's probably not the right answer, for the same reason we allow cruise control -- when the system needs lots of intervention, we handle working with it fine. It may be there is a "valley of danger" where the system gets good enough to cause complacency, even with professionals, before it gets good enough to not need supervision at all.
There is no question that safety driver operation will and must happen. Even if you could train a car to what you believed was deployment-ready performance in simulator and on test tracks, nobody is going to trust it for deployment without a good record demonstrated on the road, and that has to be done with safety drivers.
The example of cruise control tells us that no system is too poor to deploy with safety drivers. Ironically, it can be argued that some systems might be too good. Perhaps there is a zone of "good" which is somewhere between "cruise control" and "Excellent." Cruise control is acceptable. Super good is acceptable. Yet somehow, "good" might not be.
If that's true, and if Uber was in that zone, the first approach will be to improve safety drivers with monitoring and more team driving. Another approach, just becoming available to us today, is better simulator technology.
It is possible to have a human being drive around in a car with the full sensor suite, and to take the logs and feed them to a self driving system. You can then see if the system would make any decisions that differ greatly from what the human driver did, and look at them for the cause. This can help, but is very limited. You really only get to look at the instantaneous decision making at any point, based on information up to there. It's not possible to learn any dynamic properties of the system, because if it decides to steer or accelerate differently than the human has done, we can no longer look at what a real system would have done on the road.
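A minimal sketch of that log-comparison idea follows. It assumes the logs have already been reduced to per-frame records holding what the human did and what the system would have commanded at the same instant. The `Frame` fields, the units and the tolerance values are hypothetical choices for illustration, not any team's actual format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    t: float             # seconds since start of the log
    human_steer: float   # steering angle the human applied, degrees
    human_accel: float   # acceleration the human produced, m/s^2
    system_steer: float  # steering the system would have commanded
    system_accel: float  # acceleration the system would have commanded


def flag_disagreements(frames: List[Frame],
                       steer_tol_deg: float = 5.0,
                       accel_tol: float = 1.5) -> List[Frame]:
    """Return frames where the system's choice differs greatly from the human's."""
    flagged = []
    for f in frames:
        steer_gap = abs(f.human_steer - f.system_steer)
        accel_gap = abs(f.human_accel - f.system_accel)
        if steer_gap > steer_tol_deg or accel_gap > accel_tol:
            flagged.append(f)
    return flagged


if __name__ == "__main__":
    log = [
        Frame(0.0, 0.0, 0.2, 0.5, 0.1),    # close agreement
        Frame(0.1, 1.0, -3.0, 1.2, 0.0),   # human braked hard, system did not
        Frame(0.2, 12.0, 0.0, 2.0, 0.0),   # human swerved, system did not
    ]
    for frame in flag_disagreements(log):
        print(f"t={frame.t:.1f}s: review this moment")
```

Each frame is judged only on the information available up to that instant, which is exactly the limitation described above. Once the system's choice differs from the human's, the rest of the log no longer shows what would have followed.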
You can take this data and turn it into simulator scenarios, and then have the system try to drive in those. That is in fact what is done to build simulator scenarios in many cases. This is difficult, and the result is still not very satisfying, especially if what you want to understand is sensor performance. Simulating sensors is much more difficult and slow than simulating situations, and the reality is that simulating radar is almost impossible, and simulating LIDAR perfectly is also close to impossible. Simulated camera views look like a good-quality video game. Usable, but not the real world, and often too far off the mark.
Still, as I have discussed, there could be great merit in the world building a vast library of simulated scenarios based on the experience of all teams. This would allow a great deal more testing of unusual situations before the cars first go onto the road with safety drivers. But they must go onto the road with safety drivers.