Recently we’ve seen two essays by people I highly respect in the field of AI and robotics. Their points are worth reading, but in spite of my respect, I have some disagreements, of course.
The first essay comes from Andrew Ng, head of AI (and thus the self-driving car project) at Baidu. You will find few who can compete with Andrew when it comes to expertise on AI. (Update: This essay is not recent, but I only came upon it recently.)
In Wired he writes that Self-Driving Cars Won’t Work Until We Change Our Roads—And Attitudes. The media have read this essay as being much stronger about changing the roads than what he actually writes. I have declared it to be the “first law of robocars” that you don’t change the infrastructure. You improve your car to match the world you are given; you don’t ask the world to change to help your cars. There are several reasons I promote this rule:
- As soon as you depend on a change in the world in order to drive safely, you have vastly limited where you can deploy. You declare that your technology will be, for a very long time, a limited area technology.
- You have to depend on, and wait for others to change the world or their attitudes. It’s beyond your control.
- When it comes to cities and infrastructure, the pace of change is glacial. When it comes to human behaviour, it can be even worse.
- While it may seem that the change to infrastructure is clearer and easier to plan, the reality is almost assuredly the opposite. That’s because the clever teams of developers, armed with the constantly improving technologies driven by Moore’s law, have the ability to solve problems in a way that is much faster than our linear intuitions suggest. Consider measuring traffic by installing tons of sensors, vs. just getting everybody to download Waze. Before Waze, the sensor approach seemed clear, if expensive. But it was wrong.
As noted, Andrew Ng does not actually suggest that much change to the infrastructure. He talks about:
- Having road construction crews log changes to the road before they do them
- Giving police and others who direct traffic a more reliable way to communicate their commands to cars
- Better painting of lane markers
- More reliable ways to learn the state of traffic lights
- Tools to help humans understand the actions and plans of robocars
The first proposal is one I have also made, because it’s very doable, thanks to computer technology. All it requires at first blush is a smartphone app in the hands of construction crews. Before starting a project, they would know that just as important as laying out cones and signs is opening the app and declaring the start of a project. The phone has a GPS and can offer a selection of precise road locations and log it. Of course, the projects should be logged even before they begin, but because that’s imperfect, smartphone logging is good enough. You could improve this by sticking old smartphones in all the road construction machines (old phones are cheap and there are only so many machines) so that any time a machine stops on a road for very long, it sends a message to a control center. Even emergency construction gets detected this way.
Even with all that, cars still need to detect changes to the road (that’s easy with good maps) and cones and machines. Which they can do.
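The stopped-machine idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Ping` record, the 25-meter radius and the 10-minute threshold are hypothetical choices, and a real system would also check the position against a road map before alerting the control center.

```python
import math
from dataclasses import dataclass


@dataclass
class Ping:
    t: float    # timestamp in seconds (hypothetical format)
    lat: float  # degrees
    lon: float  # degrees


def distance_m(a, b):
    """Haversine great-circle distance between two pings, in meters."""
    r = 6371000.0  # mean Earth radius
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dphi = p2 - p1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def stopped_sites(pings, radius_m=25.0, min_stop_s=600.0):
    """Return (lat, lon, start_t) for each stop lasting at least min_stop_s.

    Assumes pings are sorted by time. Each stop is reported once.
    """
    sites = []
    anchor = None      # first ping of the current candidate stop
    reported = False   # whether the current stop was already reported
    for p in pings:
        if anchor is None or distance_m(anchor, p) > radius_m:
            # The machine moved: start watching a new candidate stop.
            anchor = p
            reported = False
        elif not reported and p.t - anchor.t >= min_stop_s:
            sites.append((anchor.lat, anchor.lon, anchor.t))
            reported = True
    return sites
```

Each reported site would then be sent to the control center as a possible construction zone, which also covers the emergency work nobody logged in advance.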
I think the redirection problem is more difficult. Many people redirect traffic, even civilians. However, I would be interested to see Ng’s prediction on how hard it is to get neural network based recognizers to understand all the common gestures. Considering that computers are now getting better at reading sign languages, which are much more complex, I am optimistic here. But in any event, there is another solution for the cases where the system can’t understand the advice, namely calling in an operator in a remote control center, which is what Nissan plans to do, and what we do at Starship. Unmanned cars, with no human to help, will just avoid data dead zones. If somehow they get to them, there can be other solutions, which are imperfect but fine when the problem is very rare, such as a way for the traffic manager to speak to the car (after all, spoken language understanding is now close to a solved problem for limited vocabulary problems.)
On lane markers, I disagree with Andrew. His statement may be a result of efforts to drive on roads without maps, even though Baidu has good map expertise. Google’s car has a map of the texture of the road. It knows where the cracks and jagged lane markers are. The car actually likes degrading lane markers. It’s perfectly painted straight and smooth roads which confuse it (though only slightly, and not enough to cause a problem.) So no, I think that better line painting is not on the must-do list.
On traffic lights, he’s right: seeing lights can be challenging, though the better cars are getting good at it. The simple algorithm is “you don’t go if you don’t confirm green.” That means you don’t run a red but you could block traffic. If that’s very rare it’s OK. We can consider infrastructure to solve that, though I’m wary. Fortunately, if the city is controlling its lights with a central computer, you don’t have to alter the traffic light itself (which is hard); you can just query the city, in those rare cases, for when the light will be changing. I think that problem will be solved, but I also think it may well be solved just by better cameras. Good robocars know exactly where all the lights are, and they know where they are, and thus they know exactly what pixels in a video image are from the light, even if the sun is behind it. (Good robocars also know where the sun is and will avoid stopping in a place where there is no light they can see without the sun right behind it.)
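The “don’t go if you don’t confirm green” rule is simple enough to write down. The sketch below is my own illustration, not any vendor’s logic: the `Light` states and the optional `city_feed` argument (standing in for a query to a city’s central light controller) are hypothetical names.

```python
from enum import Enum
from typing import Optional


class Light(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"
    UNKNOWN = "unknown"  # camera could not classify the light


def may_proceed(camera: Light, city_feed: Optional[Light] = None) -> bool:
    """Go only on a confirmed green; every other case means wait."""
    if camera is Light.GREEN:
        return True
    # Fall back to the city's answer only when the camera is unsure.
    # A camera reading of red or yellow is never overridden.
    if camera is Light.UNKNOWN and city_feed is Light.GREEN:
        return True
    return False
```

The design choice is that every failure mode resolves to waiting: the car may block traffic in the rare unknown case, but it never runs a red.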
Working with people
How cars interact with people is one of Andrew Ng’s points and the central point of Rodney Brooks’ essay Unexpected Consequences of Self Driving Cars. Already many of the car companies have had fun experimenting with that, putting displays on the outside of cars of various sorts. While cars don’t have the body language and eye contact of human drivers, I don’t predict a problem we can’t solve with good effort.
Brooks’ credentials are also superb, as founder of iRobot (Roomba) and Rethink Robotics (Baxter) as well as many accomplishments as an MIT professor. His essay delves into one of the key questions I have wondered about for some time: how to deal with a world where things do not follow the rules, and where there are lots of implicit and changing rules and interactions. Google discovered the first instance of this when their car got stuck at a 4-way stop by being polite. They had to program the car to assert its right to go in order to handle the stop. Likewise, you need to speed to be a good citizen on many of our roads today.
His key points are as follows:
- There is a well worked out dance between pedestrians and cars, that varies greatly among different road types, with give and take, and it’s not suitable for machines yet.
- People want to know a driver has seen them before stepping near or certainly in front of a vehicle.
- People jaywalk, and even expect cars to stop for them when they do on some streets.
- In snowy places, people walk on the street when the sidewalk is not shoveled.
- Foot traffic can be so much that timid cars can’t ever get out of sidestreets or driveways. Nice pedestrians often let them out. They will hand signal their willingness to yield or use body language.
- Sometimes people just stand at the corner or edge of the road, and you can’t tell if they are standing there or getting ready to cross.
- People setting cars to circle rather than park
- People might jump out of their car to do something, leaving it in the middle of the street blocking traffic, where today they would be unwilling to double park.
- People might abuse parking spots by having a car “hold” them for quick service when they want to leave an event.
- Cars will grab early spots to pick up children at schools.
Brooks starts with one common mistake — he has bought into the “levels” defined by SAE, even claiming them to be well accepted. In fact, many people don’t accept them, especially the most advanced developers, and I outlined recently why there is only one level, namely unmanned operation, and so the levels are useless as a taxonomy. Instead the real taxonomy in the early days will be the difference between mobility on demand services (robotaxi) and self-drive enabled high end luxury cars. Many of his problems involve privately owned cars and selfish behaviour by their owners. Many of those behaviours don’t make sense in a world with robotaxis. I think it’s very likely that the robotaxis come first, and come in large numbers first, while some imagine it’s the other way around.
Brooks is right that there will be unintended consequences, and the technology will be put to uses nobody thought of. People will be greedy and antisocial; that can be assured. Fortunately, however, people will work out solutions, in advance, to anything you can think of or notice just by walking down the street or thinking about issues for a few days. The experienced developers have been thinking about these problems for decades now, and cars like Google’s have driven for 300 human lifetimes of driving, and that number keeps increasing. They note every unusual situation they encounter on every road they can try to drive, and they put it into the simulator if it’s important. They’ve already seen more situations than any one human will encounter on those roads, though they certainly haven’t driven all the types of road in the world. But they will, before they certify as safe for deployment on such roads.
As I noted, only the “level 4” situation is real. Level 5 is an aspirational science-fiction goal, and the others are unsafe. Key to the improved thinking on “levels” is that it is no longer the amount of human supervision needed that makes the difference; it is the types of roads and situations you can handle. All these vehicles will only handle a subset of roads, and that is what everybody plans. If there is a road that is too hard, they just won’t drive it. Fortunately, there are lots of road subsets out there that are very, very useful and make economic sense. For a while, many companies planned only to do highways, which are the simplest road subset of all, except for the speed. A small subset, but everybody agrees it’s valuable.
So the short answer is, solutions will be found to these problems if the roads they occur on are commercially necessary. If they are not necessary, the solutions will be delayed until they can be found, though that’s probably not too long.
As noted above, many people do expect systems to be developed to allow dialogue between robocars and pedestrians or other humans. One useful tool is gaze detection — just as a cheap flash camera causes “red eye” in photos, machines shining infrared light can easily tell if you are looking at them. Eye contact in that direction is detectable. There have been various experiments in sending information in the reverse direction. Some cars have lasers that can paint lines on the road. Others can display text. Some have an LED ribbon surrounding them that shows all the objects and people tracked by the car, so people can understand that they are being perceived. You can also flash a light back directly at people to return their eye contact — I see you and I see that you saw me.
Over time, we’ll develop styles of communication, and they will get standardized. It’s not essential to do that on day one; you just stay on the simpler roads until you know you can handle the others. Private cars will pause and pop out a steering wheel. Services like Uber will send you a human driver in the early days if the car is going somewhere the systems can’t drive, or they might even let you drive part of it. Such incrementalism is the only way it can ever work.
People taking advantage of timidity of robocars
I believe there are solutions to some of the problems laid out. One I have considered is pedestrians and others who take advantage of the naturally conservative and timid nature of a robocar. If people feel they can safely cut off or jaywalk in front of robocars, they will. And the unmanned cars will mostly just accept that, though only about 10% of all cars should be unmanned at any given time. The cars with passengers are another story. Those passengers will be bothered if they are cut off, or forced to brake quickly. They will spill their coffee. And they will fight back.
Citizen based strong traffic code enforcement
Every time you jump in front of such a car, it will of course have saved the video and other sensor data. It’s always doing that. But the passenger might tell the car, “Please save that recent encounter. E-mail it to the police.” The police will do little with it at first, but in time, especially since there are rich people in these cars, they will throw a face recognizer and licence plate recognizer on the system that gets the videos. They will notice that one person keeps jaywalking right in front of the cars and annoying the passengers. Or they will notice the guy who keeps cutting off the cars as though they are not there because they always brake. They will have video of him doing it 40 times, or 100. And at that point, they will do something. The worst offender will get identified and get an E-mail from police: “We have 50 videos of you doing this. Here are 50 tickets.” Then the next, and the next, until nobody wants to get to the top of the list.
This might actually create pressure the other way — a street that belongs only to the cars and excludes the non-car user. A traffic code that is enforced to the letter because every person inconvenienced has an ability to file a complaint trivially. We don’t want that either, but we can control that balance.
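The “top of the list” enforcement described above amounts to counting saved videos per identified person and working down from the worst. A minimal sketch, assuming each submitted video has already been matched to some offender ID by a recognizer (the IDs and the 40-video threshold here are hypothetical):

```python
from collections import Counter


def worst_offenders(reports, min_videos=40):
    """Rank offenders by number of saved videos, worst first.

    reports: iterable of offender IDs, one entry per submitted video.
    Returns a list of (offender_id, video_count) pairs for everyone
    at or above the min_videos threshold.
    """
    counts = Counter(reports)
    return [(who, n) for who, n in counts.most_common() if n >= min_videos]
```

The threshold is where the balance gets controlled: set it high and only the chronic abusers hear from police, set it at one and the traffic code is enforced to the letter.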
I actually look forward to fixing one of the dynamics of jaywalking that doesn’t work. Often, a person wants to jaywalk while a car is approaching. They want the car to pass at full speed so they can walk behind it; everybody is more comfortable behind a car than in front of one. But the driver gets paranoid and stops, and eventually you uncomfortably cross in front, annoyed both at the delay and at having stopped somebody you didn’t intend to stop. I suspect robocars will be able to handle this dynamic better, predicting when people might actually be on a path to enter their lane, but not slowing down for stopped pedestrians (adults at least) and trusting them to manage their crossing. Children are a different matter.
People being selfish with robocars
Brooks wonders about people doing selfish things with their robocars. Here, he mostly talks about privately owned robocars, since most of what he describes would not or could not happen with a robotaxi. There will be some private cars so we want to think about this.
A very common supposition I see here and elsewhere is the idea of a car that circles rather than parking. Today, operating a car is about $20/hour so that’s already completely irrational, and even when robocar operation drops to $8/hour or less, parking is going to be ridiculously cheap and plentiful so that’s not too likely. There could be competition for spots in very busy areas (schools, arenas etc.) which don’t have much space for pick-up and drop-off, and that’s another area where a bit of traffic code could go a long way. Allow facilities to make a rule: “No car may enter unless its passenger is waiting at the pick-up spot” with authority to ticket and evict any car that does otherwise. Over time, such locations will adjust their pick-up spots to the robocar world and become more like Singapore’s airport, which provides amazing taxi throughput with no cab lines by making it all happen in parallel. Of course, cars would wait outside the zone but robocars can easily double and triple park without blocking the cars they sit in the path of. Robocars waiting for passengers at busy locations will be able to purchase waiting spaces for less than the cost of circling, and then serve their customers or owners. If necessary, market prices can be put on the prized close waiting spaces to solve any problems of scarcity.
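The circling-versus-parking arithmetic above is easy to check. In this sketch the $8/hour operating cost comes from the text, while the $2/hour waiting-space price and the 1.5-hour wait are made-up numbers used only for illustration.

```python
def cheaper_to_park(circle_cost_per_hour, parking_cost_per_hour, wait_hours):
    """Compare the cost of circling with the cost of a paid waiting space.

    Returns (parking_is_cheaper, dollars_saved_by_parking).
    """
    circling = circle_cost_per_hour * wait_hours
    parking = parking_cost_per_hour * wait_hours
    return parking < circling, circling - parking


# Hypothetical case: $8/hour to operate, $2/hour to wait, 1.5-hour event.
print(cheaper_to_park(8.0, 2.0, 1.5))
```

As long as a waiting space costs less per hour than operating the car, circling loses, and the gap only widens with longer waits.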
So when can it happen?
Robocars will come to different places at different times. They will handle different classes of streets at different times. They will handle different types of interactions with pedestrians and other road users at different times. Where you live will dictate when you can use it and how you can use it. Vendors will push at the most lucrative routes to start, then work down. There will be many problems that are difficult at first, and the result will be the early cars just don’t go on those sorts of streets or into those sorts of situations. Human driving, either by the customer or something like an Uber driver, will fill in the gaps.
Long before then, teams will have encountered or thought of just about any situation you’ve seen, and any situation you’ve likely thought of in a short amount of time. They will have programmed every variation of that situation they can imagine into their simulators to see what their car does. They will use this to grow the network of roads the cars handle every day. Even if at the start, it is not a network of use to you, it won’t be too long before it becomes that, at first for some of your rides, and eventually for most or all.