On voting, sampling, measurement, elections and surveys
Yesterday's post about the flaws in the so-called "popular vote" certainly triggered some debate (mostly on Facebook). To clarify matters, I thought I would dive a little deeper into why the two types of Presidential elections in the USA are so different that they can't be added together in a way that isn't misleading.
These matters are studied both by statisticians, who focus on the science of measurement, particularly of things about groups, and by election theorists, who are also interested in that but add the study of votes/polls which do not deliberately sample a subset of a population, but attempt to consider the will of the entire group. Both are highly concerned about how to deal with the fact that a substantial fraction of the population may not participate.
One way to look at the difference is to consider this: An election is not supposed to be just a measurement. It is that, but more than that it is an action. It is the actual enactment of the will of the voters. While there are government officials who count the votes and report on them, a person is not put into office by those officials. Rather, it is the voters who put the candidate into office through their votes. (In Canada, it's different. The Queen and her Governor-General technically have the legal power, and they observe how the people voted and invite the winner to form a government in the Queen's name.)
Because voting is an act, rather than just an expression of opinion, we have come to deal with the non-participators as still acting. By not registering to vote or not showing up, they have still taken an action; they have deferred to the others to select the winner.
We tolerate this, though we don't like it. Low turnouts reduce confidence in the results, and they also mean that election results can be more easily manipulated through "get out the vote" efforts. On the other hand, we get quite upset when people don't vote for other reasons outside their own will, particularly if somebody else impeded their ability to vote, or manipulated them into not voting. Both voting and not voting must be acts of the free person.
Election theorists join with statisticians in some ways. All are interested in making sure that the aggregate will that comes from counting the votes most accurately reflects the aggregate will of the voters. We debate the merits of different counting systems. Many feel that multi-candidate ballots/preferential ballots do a much better job than first-past-the-post plurality systems. But in all cases the counting system is simply the means of calculating the voters' will so it can be enacted.
In the US Presidential elections, in spite of what is written on the ballot, the voters are appointing a slate of members of the electoral college. This is done independently in each state. In the swing states, all is as you would expect. Candidates campaign. Major efforts are made to woo voters and to get voters to come out. Voters go to the polls knowing and expecting that their will shall be done. They expect they might be part of the group which gets to designate the slate of electors.
In the safe states, it's very different. In these states, who the electors will be is already well established from polls and the historical patterns of the state. The voters will pick the electors, but it's a foregone conclusion. Nobody campaigns. There are no major efforts to get out the vote. There will be other races on the ballots which will bring out voters, who will vote within the known constraints. A decent chunk of voters will also show up because "this is how we do things" and together the knowledge that this will happen seals the fate of the state. On top of that, in the safe states, one knows that if things got so far outside the predicted norms as to make the vote actually close, then long ago the election will already have gone to the unexpected party, which in that situation will win all the swing states and the victory. This is particularly true on the west coast, where the result is almost always decided before the polls close, and will certainly be decided long before that in a strange situation. If today's California came close to going Republican, the rest of the USA would also be going so Republican that California's shift couldn't matter.
People know this, and this makes a big difference. A vote in California is technically an action, but only technically. It's technically a vote but that's an illusion. In reality, it can never change the result. It's only for show. The candidates know it too. Because of that a lot of people don't even register, and a lot stay home. The vote in California is not an election, but only a measurement. A survey. All it ever does is change the number printed in the paper.
Statisticians know all about surveys. They can be pretty good at measuring aggregate opinion if done well, but it is hard to do them well. The problem is what we call sampling bias. In an election, not voting is an implicit action. In a survey, not participating is just not participating. When there is nothing to gain or lose from participating or not participating, the motivations are different.
In 2016, the average swing state Presidential turnout was 64.6% of eligible voters. California's turnout was 56.1%, just under the 56.6% average of the safe states. In Hawai`i, which knows the election is always decided before it votes (pretty much always for Democrats), the turnout was 41.7%. A lot of people don't show up.
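To make the size of those gaps concrete, here is a small Python sketch that just subtracts the turnout figures quoted above from the swing-state average. The dictionary keys and the function name are my own labels for illustration; the only data are the four percentages in the paragraph above.

```python
# Turnout figures (percent of eligible voters, 2016) as quoted in the post.
TURNOUT = {
    "swing state average": 64.6,
    "safe state average": 56.6,
    "California": 56.1,
    "Hawaii": 41.7,
}


def gap_vs_swing(turnout: dict[str, float]) -> dict[str, float]:
    """Percentage-point shortfall of each group relative to the swing-state average."""
    baseline = turnout["swing state average"]
    return {
        name: round(baseline - pct, 1)
        for name, pct in turnout.items()
        if name != "swing state average"
    }


GAPS = gap_vs_swing(TURNOUT)
for name, gap in GAPS.items():
    print(f"{name}: {gap} points below the swing-state average")
```

On these figures the safe states as a group sit about 8 points below the swing states, California about 8.5 points, and Hawai`i nearly 23 points.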
This turns the safe-state votes into something closer to a self-selected survey. Millions are not voting, and those who are voting do so for other reasons than to enact their will. The self-selected survey is the most common class of what is also called the "non-scientific survey." The name is intended to be derisive. It is easy to jump to false conclusions from a self-selected survey.
It isn't that simple, of course. The vote in safe states is a mix of actual polling and self-selection. As noted, there are people coming to vote on other races. We know how many of those there are. Turnout in off-year elections is around 40%, sometimes worse. And, as we can see, a lot of people show up because there is a Presidential race, in spite of the lack of power in their votes. Some do it from duty. Some from the excitement of a Presidential race. Many do not understand the impotence of their vote, and certainly many do not look at it the way it is described in this article, with a statistician's eye. So many are voting as though their vote counted. Many have studied the race in detail, as though their vote counted. I can't even vote and I study it as deeply as any.
But some vote very differently because they know their vote lacks power. Around 9 million don't vote at all, who would have voted if they were in swing states. Almost surely many millions of those who do vote will do it differently than they might if their vote counted. But there is also no denying that a considerable majority of the voters are treating their vote as just as real, voting just as they would if it could change things. But a considerable majority is not enough. As long as a large group -- even if it's a small minority, even just 5% -- are altering or withdrawing their votes, the total loses scientific validity, and has much larger error bars on it.
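Here is a rough Python sketch of why a small withdrawing minority matters. It is not a model of any real state: the function name and the turnout rates are hypothetical numbers chosen only for illustration. It assumes an electorate split exactly evenly between two candidates, and asks what the counted vote shows when one side's supporters stay home slightly more often than the other's.

```python
def measured_share(true_share_a: float, turnout_a: float, turnout_b: float) -> float:
    """Share of votes cast for A when A's and B's supporters turn out at different rates."""
    votes_a = true_share_a * turnout_a
    votes_b = (1.0 - true_share_a) * turnout_b
    return votes_a / (votes_a + votes_b)


# Hypothetical electorate: an exact 50/50 split in underlying preference.
# Case 1: both sides turn out at 60%.
equal = measured_share(0.50, 0.60, 0.60)

# Case 2: A's supporters turn out at 55% instead of 60% (illustrative only).
skewed = measured_share(0.50, 0.55, 0.60)

print(f"equal turnout:  A gets {equal:.1%} of votes cast")
print(f"skewed turnout: A gets {skewed:.1%} of votes cast")
```

With these made-up rates, an evenly divided electorate reports roughly 47.8% for A, a gap of more than four points that comes entirely from who showed up. Since nobody knows the real participation rates for each side in a safe state, that uncertainty goes straight into the error bars.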
It is worth noting that by the normal definitions of a popular vote election, it is invalid to add the results of two distinct elections. There is no question that the Presidential elector selections of each state are distinct elections, run by the states. Even on those grounds you can't add them and treat the sum as a popular vote. Because ballots replace the actual candidates (slates of electors pledged to the candidates) with the names of the Presidential candidates, people forget that these are distinct elections. Thus it becomes necessary to understand that they are distinct not just because they are in different states, but because they operate on different principles as well.
This is why I wrote that, in spite of the fact that it is possible to sum up the votes cast in the 51 different electoral college contests and call it the popular vote, it is nonsensical to do so. You can't add the totals from people who were voting with the full power of voters in a popular vote election to the totals from people who were participating in a voluntary survey. Aside from the real accuracy problems of the latter class, they are just different things. They can be added on a calculator, but to do so is to announce a misleading number, a meaningless one. You can call it "the popular vote" but it is not like a real popular vote, the kind used in all the other elections of the USA and most of the rest around the world. Calling it the popular vote makes many people -- we've seen this -- think it has a winner and a loser. They think it has meaning. They think it supports or questions the legitimacy of the winner of the electoral college. Since real popular votes are, in our modern democratic world, seen as superior to systems like the electoral college, calling it "the popular vote" implies to many people that it is superior, when in fact it's meaningless. It would only be superior if it were an actual popular vote election like the others.