New NTSB report is out -- with damning information on Uber's fatal robocar crash


In advance of the full hearing, an early NTSB report on last year's fatal Uber robocar crash is out. It contains some important new details, including the fact that the Uber system did not consider the possibility of pedestrians outside of crosswalks, and that it threw away past trajectory data on obstacles every time it reclassified them. Unable to classify Elaine Herzberg as a pedestrian, it reclassified her over and over, failed to track her, and did not recognize the danger until it was too late.
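To make the reclassification flaw concrete, here is a minimal sketch (not Uber's actual architecture; all names here are hypothetical) of the design principle the report implies was violated: an obstacle's motion history should belong to the track, not to its classification, so that relabeling an object does not erase what is known about where it has been going.

```python
from dataclasses import dataclass, field

@dataclass
class Track:
    """A tracked obstacle whose position history survives relabeling."""
    label: str
    history: list = field(default_factory=list)  # (t, x, y) observations

    def observe(self, t, x, y):
        self.history.append((t, x, y))

    def reclassify(self, new_label):
        # Only the semantic label changes; the motion history is kept.
        # Discarding history here is the failure mode the report describes.
        self.label = new_label

    def velocity(self):
        # Crude velocity estimate from the two most recent observations.
        if len(self.history) < 2:
            return None  # not enough data to predict a trajectory
        (t0, x0, y0), (t1, x1, y1) = self.history[-2:]
        dt = t1 - t0
        return ((x1 - x0) / dt, (y1 - y0) / dt)

track = Track(label="vehicle")
track.observe(0.0, 10.0, 0.0)
track.observe(0.1, 10.0, 0.3)
track.reclassify("bicycle")   # the label flips, but history is retained
track.observe(0.2, 10.0, 0.6)
print(track.velocity())       # lateral motion is still visible
```

A system that instead created a fresh track on every reclassification would return `None` from `velocity()` after each label flip, which is effectively what the report says happened: the car repeatedly "met" Herzberg for the first time.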

I have a new and very detailed analysis of the accident on the Forbes site; you can read it at:

New NTSB reports reveal major errors by Uber


Great analysis. I am confused about the cameras. The report says that there were 10 cameras; that they were used for object identification; that the pedestrian can be seen clearly and that you can tell the bike had no reflectors, she did not look towards the vehicle etc. But Table 1 shows no camera identification at all. This seems like yet another system failure?

It says the near range cameras were not in use, but nothing else.

To see in this lighting environment, you would want HDR cameras, which the report doesn't say they had. Without HDR, she would probably be pretty dark until she reached the streetlight. No mention of the classifier using the camera.

A guy cruised N. Mill Avenue a few days after the incident. Impact point is at 35 seconds (not 33 as his comment says): video

He never replied to questions about type of camera or low light setting, but I've heard others say it's well lit and this video is much closer to what a driver sees than the extremely dark Uber dashcam video released shortly after the crash. Overhead illumination is significant, note how the headlight pattern on the road is barely visible in some areas.

It's hard to believe Uber's cameras couldn't detect a pedestrian with this much overhead lighting.

It was decently lit, and a human, or a properly exposed camera, can see things. However, a basic camera set to either show the shadows or show the highlights could fail on the other. This is why HDR is now considered important.
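A toy illustration of the idea (nothing to do with Uber's actual imaging pipeline; the function and the pixel values are invented): exposure fusion weights each exposure of the same pixel by how far it is from clipping, so a shadowed pedestrian that is crushed to near-black in the short exposure is recovered from the long one, while clipped streetlight highlights are recovered from the short one.

```python
def fuse_exposures(dark, bright):
    """Toy exposure fusion over 8-bit pixel values (0-255): weight each
    exposure by distance from clipping, so well-exposed values dominate."""
    fused = []
    for d, b in zip(dark, bright):
        wd = 1 - abs(d - 128) / 128  # weight peaks at mid-gray
        wb = 1 - abs(b - 128) / 128
        if wd + wb == 0:
            fused.append((d + b) // 2)
        else:
            fused.append(round((wd * d + wb * b) / (wd + wb)))
    return fused

# A pedestrian in shadow: near-black in the short exposure,
# readable in the long exposure (which clips the bright streetlights).
short_exp = [5, 10, 250]    # shadows crushed, highlight nearly clipped
long_exp  = [90, 110, 255]  # shadows readable, highlight clipped
print(fuse_exposures(short_exp, long_exp))
```

A single-exposure camera has to pick one of the two input rows; the fused result keeps usable detail from both, which is the point of the HDR argument above.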

Don't they test these things by creating situations that might be difficult to analyze? Seems like they didn't try hard enough at that phase of development. Also, not expecting pedestrians outside of crosswalks is ludicrous. Sooner or later, there will be a pedestrian crossing at almost every point of a street. There are even pedestrians crossing freeways. I have seen pedestrians cross just thirty feet from a crosswalk at a light. No good will come of trying to predict pedestrian behavior.

Teams are doing aggressive testing. Uber's car was much more immature than would be expected. You do try to predict pedestrians. If you don't, you can't drive, and humans do it all the time.

“Uber’s system for classifying things on the road acted as though pedestrians only appear in crosswalks”

On what do you base this claim?

The report says, for example: “pedestrians outside a vicinity of a crosswalk are also not assigned an explicit goal. However, they may be predicted a trajectory based on the observed velocities, when continually detected as a pedestrian”

This implies that objects can be classified as pedestrians outside of crosswalks.

I think many have concluded this because the system kept refusing to classify her as a pedestrian in her location

"The system never classified her as a pedestrian ... the system design did not include a consideration for jaywalking pedestrians."

This discussion is now moot, which is nice – I knew there was something fishy about this!

Actually there was an even more unambiguous wording in the report: “According to Uber ATG, the SDS did not have the capability to classify an object as a pedestrian unless that object was near a crosswalk”

On the other hand, at the meeting yesterday Ensar Becic just as unambiguously stated: “at the time of the crash the AVS could detect pedestrians but it would not give them an inherent goal of jaywalking”.

This does not seem consistent. I think we have to consider that the NTSB staff got some stuff wrong in their report.

I am not sure that we actually know what was going on inside that ADS. I have some theories, but I don't want to add to the confusion. At least this is true: detecting a pedestrian — no quotation marks — and attaching the label “pedestrian” to an object are two different things.

What would the usual classification of a person leading a bike at a crosswalk be — a “bike” or a “pedestrian”? From a pattern matching perspective I think “bike” sounds like the category that would get the higher score of those two options. A “bike” would never be just a bike. It would be a bike and a person sitting on the bike or leading it. It would not be ok to run over it.

In the NTSB's report Elaine is always (?) referred to as a “pedestrian”, but a self-driving system might attach to her the label “bike” – and that could be ok, I think.

Anyway, the reclassification and resets were the real problem. As a pedestrian I would not mind being classified as a “bike”, an “other” or a “Chesterfield sofa”, as long as the ADS keeps track of me and does not run me over.

That's why I call it cyclist. A pedestrian walking a bike might be considered a different thing. They move at the speed of a pedestrian but are much less likely to be able to change direction instantly, though they can still do it by picking up the bike. Pedestrians walking bikes do not tend to appear in the traffic lanes the way cyclists do, but the way pedestrians do. So they act 95% like pedestrians. It seems likely that Uber did not have pedestrian-walking-bike as a classification.

I think the answer given under questioning is the correct one. That Uber would not identify a pedestrian outside a crosswalk was a very strange idea. That it would not assign the road-crossing goal outside the crosswalk makes more sense, even though it's also wrong.

Over-reliance on maps?

Over-reliance on bad maps, at that. If you're going to map crosswalks, you should also map non-crosswalks where people frequently cross.

Maybe they didn't even try to map crosswalks. Does it say in the report?

Legally there is a crosswalk at every intersection, even if not marked, unless crossing is explicitly forbidden. So it's not the kind of thing that changes often in maps.

Mapping places where people are likely to cross is not a bad idea (and exactly the sort of thing only mapping is likely to give you with reliability) but it remains true that you have the right of way there and pedestrians are expected to yield to you, and you are expected to yield to pedestrians in crosswalks.

If you see a pedestrian outside a crosswalk, of course you must work to not hit them. But drivers pay attention to people on sidewalks who appear to be about to enter a crosswalk, while we largely ignore people walking on the side of the road outside crosswalks or on the sidewalk there, unless they give strong signals of crossing.

Looking at driving in terms of who has the right of way is almost always the wrong type of thinking. Particularly when dealing with pedestrians, but almost always.

Mapping is not bad. Relying on maps is bad. A map makes a great input into a neural network. The neural network can then decide "is this a crosswalk" or not. Sure, if the map says there's a crosswalk, that's evidence that there's a crosswalk. If someone is crossing the roadway there without paying attention to traffic, that's also strong evidence that there's a crosswalk, even if the map says there isn't.
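The "map as one piece of evidence" idea can be sketched as a simple Bayes update (purely illustrative; the function name, the prior, and the likelihood ratio below are all invented numbers, not anything from Uber or the report): the map supplies a prior, and each observation of crossing behavior shifts the posterior.

```python
def posterior_crosswalk(prior, likelihood_ratio):
    """Bayes update in odds form: combine a map prior P(crosswalk)
    with an observation likelihood ratio
    P(obs | crosswalk) / P(obs | no crosswalk)."""
    odds = prior / (1 - prior) * likelihood_ratio
    return odds / (1 + odds)

# Map says no crosswalk here (low prior), but we observe someone
# crossing without watching traffic -- behavior far more likely at a
# de facto crossing point than at a random stretch of road.
p = 0.05
p = posterior_crosswalk(p, 20.0)   # one inattentive crosser observed
print(p)
```

With these made-up numbers, a single strong behavioral observation already pushes the posterior past 50%, overriding the map — which is the point: neither input gets to veto the other.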

And in the end, it really doesn't matter if there's a crosswalk or not. Sure, there are some cases where you could have software 1.0 rules like "pay attention to people on sidewalks who appear to be about to enter a crosswalk." Or you can feed the map into a neural net along with everything else and ask the software 2.0 question: "How likely is this pedestrian to cross in front of me?"

I'd go with the latter.

MobilEye is arguing, through their RSS approach, that cars should drive within the legal ROW envelope, even though it might be less safe, because this is the way to permit cars the aggression needed to survive in cities like Jerusalem where they are based. Yes, RSS says that ROW is given, not taken, but it also lets you assert it.

The reality though, is we humans treat crosswalks (particularly marked ones) differently than we treat open road, and robots are likely to do the same.

Yes, maps are just one piece of data, and you believe your sensors over a map. (Though it is also possible for a map to say, "don't believe your sensors here" sometimes.)

MobilEye's approach is horribly misguided and utterly dangerous.

Sometimes humans treat marked crosswalks differently than open road. Sometimes they don't. As for unmarked crosswalks, it is rare for humans to even know about them.

Is it useful to map crosswalks? Sure. I'd probably go about it by mapping sidewalks (complete with paved vs unpaved designations), which would implicitly map unmarked crosswalks, and then additionally mapping any crosswalk markings. But the only use for that I can think of would be when approaching a crosswalk from a significant distance where it's too far to see it with the cameras and/or other sensors.

You say you believe your sensors over a map. Then, in parentheses, you point out why that's wrong. You don't believe your sensors over a map or your map over your sensors. Both are factors, the relationship between them is highly complex and situation-dependent, and in my opinion the proper way to deal with this is to feed both into the neural network. Not to hand-code rules about when to believe one and when to believe another.

One of the several mistakes by Uber, apparently, was making related decisions (is this a pedestrian? is this a crosswalk?) in separate decision-making units.

In all of this, I don't see why you need the rules of the road to predict behavior. The rules of the road will be understood implicitly by the neural network. Software 2.0.

(Yes, you need many of the rules of the road, at least in rough form, along with several rules of ethics, in order to make the driving decisions; at least until AI is much more advanced. But your suggestion earlier was that you need the rules of the road to determine whether or not a pedestrian is likely to cross in front of you. No! The rules of the road - the real rules of the road - the de facto rules of the road - are implicit in training the neural network.)

The rules of the road are indeed not just coded into the law. For legal risk reasons, you have to understand which ones are in the law, but that doesn't mean you only understand the law.

We don't yet have machine learning based AI able to understand the written and unwritten rules to the safety level necessary. Perhaps some day we will. Until that day, you need other approaches.

The rules of the road sufficient for safe operation are actually not impossible to encode. Just because they are not written down by the government does not mean you can't write them down.

I'm not sure if we have machine learning based AI able to understand the written and unwritten rules to the safety level necessary yet. But I do think some basic rules should be explicitly coded and enforced for the foreseeable future.

That's not the tricky part, though. The tricky part is how to know what to do when the rules are contradictory.

Is it possible to write down all the laws? Well, I'd say that it isn't, because the way I'd define the law (something is illegal if and only if a judge would say it's illegal) it can only be known probabilistically (for many situations, for instance where a statute is broken but you have an arguable legal excuse, whether or not a judge would find you guilty can only be guessed at). It is possible to build a machine that can make superhuman guesses - a better than human judge of the law. Whether or not that qualifies as "writing down the law," I don't know, but moreover, I'm not sure what the point is. You don't have to be an expert at motor vehicle law in order to drive. A basic understanding of the most important laws is enough, especially when combined with real-world experience about how things actually work in the real world.


Here's an example: A California statute says, "A vehicle shall be driven as nearly as practical entirely within a single lane and shall not be moved from the lane until such movement can be made with reasonable safety."

How do you code that? What does "as nearly as practical" mean? What are the exceptions? What is "reasonable safety"?

I think it's much more feasible to build a neural network to answer those questions than it is to code them by hand. It's still not easy. Self-driving cars are going to be very hard to make. But trying to hand-code these kinds of rules is a fool's errand.
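To make the hand-coding difficulty concrete, here is what a naive rule-based reading of that statute might look like. Every constant below is an arbitrary invention of mine — the statute supplies no numbers at all — which is exactly the problem being argued:

```python
def lane_change_permitted(lateral_gap_m, closing_speed_mps, visibility_m):
    """Naive hand-coded reading of 'reasonable safety'. Every threshold
    here is an invented guess; the statute gives no numbers at all."""
    MIN_GAP_M = 3.0          # how much gap is "reasonable"? unstated
    MAX_CLOSING_MPS = 2.0    # how fast may traffic close? unstated
    MIN_VISIBILITY_M = 50.0  # does fog matter? the statute is silent
    return (lateral_gap_m >= MIN_GAP_M
            and closing_speed_mps <= MAX_CLOSING_MPS
            and visibility_m >= MIN_VISIBILITY_M)

print(lane_change_permitted(4.0, 1.0, 100.0))  # permitted under these guesses
print(lane_change_permitted(2.9, 1.0, 100.0))  # forbidden: 10 cm flips the answer
```

A hard threshold means ten centimeters of gap flips the verdict from "reasonably safe" to "not", with no account of context — the brittleness that makes a learned judgment look more feasible than a hand-coded one.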
