Waymo is first, but is Cruise second, and how can you tell?
A recent Reuters story suggests Cruise is well behind schedule, with one insider saying "nothing is on schedule" and various reports of problems not yet handled. This casts doubt on GM's announced plan to have a commercial pilot without safety drivers in operation in San Francisco in 2019.
The problem for me, and everybody else, is that it's very hard to judge the progress of a project from the outside. That is because it's "easy" to get a basic car together and do demo runs on various streets. Teams usually have something like that up and running within a year. Just 2 years in, Google had logged 100,000 miles across 1,000 different miles of road. Today, it's even easier.
As a result, you can see lots of miles logged, and you can take a test ride where nothing eventful happens, and be impressed, but learn very little.
You learn more if you can get detailed statistics, but even there it's difficult. Only the people on the team in charge of measuring quality have a really good sense of it, and sometimes not even they do. Even high-quality vehicles will have minor problems from time to time, temporarily not perceiving things, or being too conservative in their driving. After all, humans have tons of minor problems all the time. We're always temporarily not paying attention, missing minor things, drifting in lanes, braking hard for surprises and more, but we still clear the bar.
This is one reason I declared Waymo's (very limited) deployment with no safety drivers the big story of 2017. To do that, their team had to get the lawyers and the board to sign off on a risky decision. I wasn't there, but I would guess that the Alphabet board itself had to approve it, which means the team had to have a pretty good case.
That makes Cruise's announced plan for 2019 the second-boldest plan announced with any credibility. (I have never given that much credit to Elon Musk's announcements on Tesla full self-driving, and even Tesla has dialed them back. Elon's a good guy but he is a little fast and loose with the predictions.)
I believe that before Cruise actually deploys like this, it will probably (though not certainly) need GM board-level approval. GM's name and reputation will be on the line.
So the most compelling information about how confident a team is in its vehicle comes from real "put your money and reputation on the line" moves. Of course, the smaller startups do not have the same sort of reputations to put on the line.
Since there are not very many such declarations, we must look for other things. The sheer amount of driving is one metric. The disengagement reports required by California are a popular thing to look at, but each company has a different methodology for counting disengagements and deciding what they mean.
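To illustrate why the counting methodology matters, here is a minimal sketch in Python. The team names, mileages and counts are made up for illustration and are not taken from any real report; the `DisengagementReport` class is likewise hypothetical. It simply divides miles by disengagements, the figure people usually quote.

```python
from dataclasses import dataclass


@dataclass
class DisengagementReport:
    """One team's self-reported numbers (all values here are hypothetical)."""
    team: str
    autonomous_miles: float
    disengagements: int

    def miles_per_disengagement(self) -> float:
        # A team reporting zero disengagements has no finite rate.
        if self.disengagements == 0:
            return float("inf")
        return self.autonomous_miles / self.disengagements


# Assumption: both teams drove the same distance and performed equally well,
# but Team A reports only safety-critical takeovers while Team B reports
# every manual takeover.
reports = [
    DisengagementReport("Team A (safety-critical only)", 100_000, 20),
    DisengagementReport("Team B (every takeover)", 100_000, 200),
]

for r in reports:
    print(f"{r.team}: {r.miles_per_disengagement():,.0f} miles per disengagement")
```

In this invented case the two teams differ by a factor of ten on paper purely because of what they chose to count, which is why the raw figure tells you little without knowing the methodology behind it.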
In addition, I and other skilled people can make judgements by getting a chance to really know a team, and riding in the car while watching un-canned visualizations of the quality of its perception and other systems. I have at times done just this with some of the cars I have ridden in. We often see perception videos released (with bounding boxes and labels on obstacles), but these are usually vetted in advance. Perhaps I or others should start an independent service verifying claims. The problem, of course, is that the team should know its own numbers already. Since the planned regimen for regulation is self-certification, the team's own numbers are what will matter most.
Cruise's 2019 date is ambitious, as they are half the age of Waymo. Most other forecast dates are in the 2020s and not very real: they are just aspirations, and were not signed off on after some specific process. We should expect any dates we get to be aspirations of the form "We hope we can do it by such-and-such date." In reality, they are likely to slip, but nobody knows by how much.
Toyota has set a hard deadline with the planned service for the Olympics, but while they can't move the date of the games, they can change the parameters of the service. I think Waymo changed their parameters after the Uber fatality, which used up a lot of the public's tolerance for error in Phoenix.
This is related to, but different from, the core question of a metric for proving safety, which I have discussed a number of times. While that metric may be the final step in certifying a car for operation on the road, you don't need it to get a sense of where teams are.