NHTSA Regulations part 3: Data Sharing, Privacy, Safety, Security and HMI
After my initial reactions and overall analysis, here is a point-by-point consideration of the elements of NHTSA's 15 point certification list for robocars. See also the second half and the whole series.
Let's dig in:
Data Recording and Sharing
These regulations require a plan for how the vehicle keeps logs around any incident (while following privacy rules). This is something everybody already does -- in fact, for now they keep logs of everything -- since they want to debug any problems they encounter. NHTSA wants these logs to be available to it for crash investigation.
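To make the logging idea concrete, here is a minimal sketch (in Python, with all names invented for illustration) of the kind of ring-buffer recorder that could preserve the frames surrounding a flagged incident. Real systems log far more channels and far more history than this.

```python
# Hypothetical sketch: a ring buffer keeps only the most recent frames, so
# the moments surrounding an incident can be frozen for investigators.
from collections import deque

class IncidentLogger:
    def __init__(self, capacity=600):  # e.g. 10 s of frames at 60 Hz
        self.buffer = deque(maxlen=capacity)
        self.incidents = []

    def record_frame(self, frame):
        """Continuously record; the oldest frames fall off the back."""
        self.buffer.append(frame)

    def flag_incident(self, label):
        """Freeze a copy of the surrounding frames under a label."""
        self.incidents.append({"label": label, "frames": list(self.buffer)})

logger = IncidentLogger(capacity=5)
for t in range(8):
    logger.record_frame({"t": t, "speed_mps": 12.0})
logger.flag_incident("hard_brake")
# Only the 5 most recent frames (t = 3..7) are retained with the incident.
```

The privacy tension in the text shows up even in this toy: everything in the buffer at flag time is preserved, whether or not it is relevant to the incident.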
NHTSA also wants recordings of positive events (times the system avoided a problem).
Most interesting is a requirement for a data sharing plan. NHTSA wants companies to share their logs with their competitors in the event of incidents and important non-incidents, like near misses or detection of difficult objects.
This is perhaps the most interesting element of the plan, but it has seen some resistance from vendors, and it is indeed something that might not happen at scale without regulation. Many teams consider their test data to be part of their crown jewels. Such data is gathered only by spending many millions of dollars to send drivers out on the roads, or by convincing customers or others to voluntarily supervise their cars while they gather test data, as Tesla has done. A large part of the head start the leaders have in this field is the number of different road situations they have been able to expose their vehicles to. Recordings of mundane driving activity are less exciting and will be easier to gather; real-world incidents are rare, and they are gold for testing. Shared data is not quite as golden, because each vehicle has different sensors, located in different places, so it will not be easy to adapt logs from one vehicle directly to another. While a vehicle system can play back its own raw logs directly to see how it performs in the same situation, other vehicles can't readily do that.
Instead this offers the ability to build something that all vendors want and need, and the world needs, which is a high quality simulator where cars can be tested against real world recordings and entirely synthetic events. The data sharing requirement will allow the input of all these situations into the simulator, so every car can test how it would have performed. This simulation will mostly be at the "post perception level" where the car has (roughly) identified all the things on the road and is figuring out what to do with them, but some simulation could be done at lower levels.
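As a rough illustration of the "post-perception level," a shared scenario might look something like the sketch below: a sensor-independent record of identified objects rather than raw sensor data, which any vendor's planner could in principle replay. All field names are invented, and the planner is a toy stand-in.

```python
# Hypothetical post-perception scenario record: objects are already
# identified (type, position, velocity), so no raw sensor data is needed.
scenario = {
    "id": "near_miss_0042",
    "description": "pedestrian steps off curb from behind parked van",
    "ego": {"speed_mps": 11.0, "lane": 1},
    "objects": [
        {"kind": "parked_van", "x_m": 18.0, "y_m": 3.0, "vx_mps": 0.0},
        {"kind": "pedestrian", "x_m": 20.0, "y_m": 2.5, "vx_mps": -1.4},
    ],
}

def toy_planner(scene):
    """Stand-in planner: brake if a pedestrian is near the travel path."""
    for obj in scene["objects"]:
        if obj["kind"] == "pedestrian" and obj["x_m"] < 30 and obj["y_m"] < 3:
            return "brake"
    return "proceed"

decision = toy_planner(scenario)
```

Because the record carries no camera, lidar, or radar detail, the same scenario can be consumed by vehicles with entirely different sensor suites, which is the point of sharing at this level.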
These data logs and simulator scenarios will create what is known as a regression test suite. You test your car in all the situations, and every time you modify the software, you test that your modifications didn't break something that used to work. It's an essential tool.
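The regression idea can be sketched as follows; the planner and scenario library here are toy stand-ins (all names invented), not any vendor's actual stack:

```python
# Hypothetical regression harness: run the current planner over a library
# of scenarios and report any scenario whose outcome no longer matches the
# previously recorded correct behavior.

def planner_decision(scenario):
    # Stand-in for the real planning stack under test.
    return "brake" if scenario.get("hazard") else "proceed"

SCENARIOS = [
    {"name": "child_chases_ball", "hazard": True,  "expected": "brake"},
    {"name": "empty_highway",     "hazard": False, "expected": "proceed"},
    {"name": "cyclist_swerves",   "hazard": True,  "expected": "brake"},
]

def run_regression_suite(scenarios):
    """Return the names of scenarios where behavior regressed."""
    return [s["name"] for s in scenarios
            if planner_decision(s) != s["expected"]]

failures = run_regression_suite(SCENARIOS)
```

After every software change, an empty failure list means nothing that used to work has broken; a non-empty list pinpoints exactly which situations regressed.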
In the history of software, there have been shared public test suites (often sourced from academia) and private ones that are closely guarded. For some time, I have proposed that it would be very useful if there were a public and open source simulator environment to which all teams could contribute scenarios, but I always expected most contributions would come from academics and the open source community. Without this rule, the teams with the most test miles under their belts might be less willing to contribute.
Such a simulator would help all teams and level the playing field. It would allow small innovators to even build and test prototype ideas entirely in simulator, with very low cost and zero risk compared to building it in physical hardware.
This is a great example of where NHTSA could use its money rather than its regulatory power to improve safety, by funding the development of such test tools. In fact, if done open source, the agencies and academic institutions of the world could fund a global one. (This would face opposition from companies hoping to sell test tools, but there will still be openings for proprietary test tools.)
Privacy

The requirement for user choice is an interesting one, and it conflicts with the logging requirements. People are wary of technology that will betray them in court. Of course, as long as the car is not a hybrid car that mixes human driving with self-driving, and the passenger is not liable in an accident, there should be minimal risk to the passenger from accidents being recorded.
The rules require that personal information be scrubbed from any published data. This is a good idea but history shows it is remarkably hard to do properly. During the test phase, the reality is that the cars are going to log everything -- even in some cases the interior of the passenger cabin -- and only employees and designated beta testers will be riding in the vehicles and consenting. There is not going to be a lot to do about privacy in that phase, though it is important in production vehicles. In fact, I would say we need to go beyond privacy policies when we get there, and see if we want to preserve things like the right to anonymous travel in a taxi.
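A toy example shows why scrubbing is remarkably hard to do properly: a naive pattern-based redactor (all patterns and strings invented for illustration) catches an obvious license plate but lets an address slip through untouched.

```python
# Hypothetical scrubber illustrating why PII removal is hard: matching one
# common plate format catches direct identifiers, but indirect identifiers
# (addresses, distinctive routes, timestamps) survive.
import re

PLATE_RE = re.compile(r"\b[A-Z]{3}[- ]?\d{3,4}\b")  # one plate format only

def scrub_record(text):
    """Redact anything that looks like a license plate."""
    return PLATE_RE.sub("[REDACTED]", text)

cleaned = scrub_record("Followed vehicle ABC-1234 northbound past 12 Elm St")
# The plate is redacted, but "12 Elm St" slips through unchanged -- the
# kind of residual identifier that defeats naive anonymization.
```

This is the general pattern with anonymization failures: the obvious fields get scrubbed, while combinations of innocuous-looking details still identify a person.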
System Safety

Everybody is highly focused on safety, which is why you don't need to tell them that, and most importantly, don't need to tell them how to do it. This section calls for developers to follow ISO and SAE guidance, best practices, standards and design principles. It also calls for the use of similar approaches from aviation, space and the military.
This section also encourages developers to track developments in AI and machine learning, even though the use of these techniques can be quite at odds with other parts of these regulations and with the standards specified.
There is nothing precisely wrong with these existing standards and practices. The problem is that they are conventional. To illustrate this, consider a story from Google's early history, which has nothing to do with self-driving cars.
In its earliest days, Google needed to quickly grow its computing capacity. All standards and conventional wisdom of the time dictated that a mission critical computing facility would be based on top quality components, "enterprise" grade motherboards and hard drives. All these components were high quality but also high price.
Google instead built their server farms out of the cheapest parts they could find. The first server racks were literally made out of wood with motherboards stacked on blocks. They knew those parts were unreliable, so they designed their systems to gracefully deal with the failure.
When Google's parts failed, other systems took over without customers noticing. Maintenance crews would go through from time to time, replacing dead components as needed, in no great hurry. Google built what is clearly one of the most reliable computing facilities in the world by flouting conventional wisdom and standards.
The story of the car won't be identical to this, but it is highly likely that developers will conceive of entirely new ways to create safety and reliability which also fly in the face of conventional wisdom and best practices. Smart teams will indeed know the existing practices, but they may decide they can do even better by doing it in a different way.
It would be a shame if they were afraid to do that because of fear they could be called to account for not following some rule from an SAE or ISO standard.
As in several other places, the use of machine learning is at odds with many of these older techniques. The performance of machine learning systems can be difficult to predict and even to duplicate. There may not be individual subsystems that can be tested and validated on their own -- the system may only work as a whole.
Vehicle Cybersecurity

We do indeed need all cars to practice good computer security. (One of the first things they will do is ignore the DoT's own efforts to get them to network with random other vehicles they encounter on the road using V2V communications.)
The same comments about existing best practices apply. Teams should indeed rely on all we know about how to do computer security (though standards bodies are not tremendously good at this) but once again it should be clear that in many cases, entirely new approaches which supplant conventional wisdom will apply.
The security problem here is huge. Vehicles should be paranoid about everything -- even their own HQ and their own sensors. This may be one of the greatest security challenges there is.
The DoT has been so committed to its programs for vehicle-to-vehicle (V2V) communication that it has become blinded to their security risk. That risk far surpasses the benefits of V2V and most V2I communication for the next decade. We should make sure the DoT does not attempt to put in any regulations demanding the use of V2V or V2I, and let vendors decide on their own if it is valuable enough and secure enough to do that.
Human Machine Interface
Once again, this section mostly tells developers what they already know. It should be said more explicitly (it is implied) that this relates only to safety-oriented HMI, as the government should not regulate other types of HMI.
There is value in standardization of certain elements of UI -- particularly UI with pedestrians or passengers in taxis -- but much more experimentation is needed before the good practices can be figured out.
The requirement to accommodate disabilities is worthwhile, but it is also important to note that in a self-driving taxi fleet, it is not necessary that every vehicle accommodate the disabled, though all should if it's easy. When it comes to robotaxis, riders can indicate a disability in their profile when they summon a vehicle, and be sure to get one which accommodates it.
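The profile-based matching could be as simple as the following sketch (all names hypothetical): the dispatcher sends the first vehicle in the fleet whose capabilities cover the rider's declared needs.

```python
# Hypothetical dispatch matcher: a rider's profile lists accessibility
# needs as a set, and each vehicle advertises a set of capabilities.
def pick_vehicle(rider_needs, fleet):
    """Return the id of the first vehicle covering every need, else None."""
    for vehicle in fleet:
        if rider_needs <= vehicle["capabilities"]:  # subset test
            return vehicle["id"]
    return None

fleet = [
    {"id": "car_07", "capabilities": {"standard"}},
    {"id": "van_02", "capabilities": {"standard", "wheelchair_ramp",
                                      "audio_prompts"}},
]

assigned = pick_vehicle({"wheelchair_ramp"}, fleet)  # only the van qualifies
```

This is why not every vehicle in the fleet needs to accommodate every disability: the match happens at summon time, before a vehicle is dispatched.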
There is a requirement here that the remote dispatch center for delivery vehicles be able to know the status of the vehicle at all times. Such a requirement would prohibit operation of these vehicles in any wireless dead zone. That might be a good idea -- but it's much too early to tell if that's true. Low-speed delivery vehicles might be perfectly safe making short crossings of zones in which they are certified safe but can't call home.
See the second half of the 15 point list for more comments.