We don't know who won the US popular vote, decent chance it was Clinton


The common statistic reported after the US election was that Clinton "won the popular vote" by around 3 million votes over Trump. This has caused great rancour over the role of the electoral college and has provided a sort of safety valve against the shock Democrats (and others) faced over the Trump victory.

I'm here with a concerning analysis, which I offer because it is a mistake on the part of the US left to underestimate the magnitude of Trump's victory, or to imagine it was only because of a flaw in the system which he gamed better than Clinton.

The problem is that the US does not officially have a thing called "the popular vote." That exists nowhere in its rules. There is no popular election of the President. Rather, there are 54 elections with popular votes in 51 jurisdictions, which newspaper reporters then sum up into a number they incorrectly describe as "the national popular vote." Of course, Clinton did win that invalid sum by around 3M votes. But bad statistical practice by the press, though it has created a common convention -- for many decades -- of calling that number "the popular vote," does not make it valid. True popular votes involve all voters being free and equal, and we criticise any foreign election that pretends to call itself a popular vote when the voters are not free and equal. A popular vote, by its proper definition, is the vote total in a single election. Not 54 of them. As such, the sum is no more a popular vote total than adding the results of the 2008 and 2012 votes would get you a popular vote for or against Obama.

It's especially invalid because it's really summing two fairly different types of results.

  1. True popular vote totals from "swing" states where both candidates actively campaigned, turnout was higher, and voters expected their votes to count
  2. Low-accuracy popular vote totals from "safe states" which candidates did not contest, and where voters knew their vote would not change the result

Statisticians will tell you these are two very different animals. We probably wish we knew who would have won the popular vote, if there had been a real national popular vote. Because there was no such vote, the hard answer is we don't know what its result would be. In particular, with a statistically invalid sum like the published national popular vote, it is incorrect to say one party "won" or "lost." There is no actual contest to win or lose, and while you can pretend that a higher total is winning, it is not a mathematically valid conclusion.

We do know that in the 16 contested regions, Trump surpassed Clinton in a simple sum by about 500,000 votes. (As you would expect, since he needed to win the swing states to win the college.) In the uncontested states, where the Presidential choice was closer to a self-selected survey than a vote, a sum of those popular votes has her about 3.4M more than Trump. While you can't add popular votes, each popular vote is a statistic, and you can combine statistics if you follow correct statistical procedures.

There are many factors which will introduce error into the results from non-contested states, making it harder to figure out what the actual popular vote might have been.

  • Voters knew their votes didn't matter. Many stayed home; these states had generally lower voter turnout. The states with the lowest turnout (HI, WV, TN, TX, OK, AR, AZ, NM, MS, NY, CA, IN, UT) were generally safe states with large margins. Average turnout in 16 contested states was 65%, in non-contested states 57%.
  • To get specific, a rough calculation suggests 8 to 9 million more votes would be cast in the non-contested states if they had a 65% turnout. This is a giant disenfranchisement.
  • The two candidates had the lowest approval ratings ever. Many Clinton voters were not supporting her, but were out to stop Trump. Trump's ratings were even lower, so many of his voters were only out to stop Clinton. I suggest that in states where you know your vote will not elect or stop anybody, there is less motivation for nose-holding votes.
  • As noted, campaigns were not active in these states. In some states, like California, Clinton did campaign, though presumably to raise money rather than votes. Having only one candidate campaign skews things more.
  • More safe state voters felt comfortable voting for 3rd party choices, which they would have been less likely to do in a swing state. Many of the 4.6M votes for 3rd party candidates in safe states may have gone to major party candidates, though in what direction is unknown.
  • In some safe states, even the downballot races are predetermined, discouraging voters. In California, the election of Democrats in most down-ballot races was assured; the primary was the real contest. (However, contentious ballot propositions can counter this in some states.)

In the end, though, results from a race that everybody agreed didn't matter are just a different animal from results in a contested race. You can't add apples and oranges, or perhaps more correctly, oranges and lemons. Different, though not entirely. You can add them and get a total number of citrus (votes of any kind), but you can't call it the count of oranges (real votes).

In spite of the frequent description of the US vote-total as a popular vote, this is at odds with common usage. The thousands of other elections in the USA are actual popular votes, as are the vast majority of elections in free countries. The US national vote sum, and similar sums published in some parliamentary elections, are the rare exception where an unofficial and incorrect tally gets called a popular vote.

Due to much controversy about this view, I wrote up a more detailed explanation of the difference.

The 1916 election

A century ago in 1916, women could not vote for President in most of the USA -- except for Illinois, which recognized women's right to vote in Presidential elections in 1913. President Wilson did not support suffrage in 1916 but his opponent, Hughes, did, and suffragettes campaigned for Hughes as a result.

Wilson won, but Hughes won Illinois handily. In fact, his margin there of 202,000 votes was his highest in any state (and 2nd highest in the land) -- in part because the addition of women to the rolls meant Illinois had more voters than any other state. I have to speculate that this margin had to do with women voting for the candidate ready to defend their basic human rights.

Wilson won the college 277 to 254. And he won the so-called popular vote by 600,000 votes. But that "popular vote" in this case consisted of adding the popular vote from states like Illinois where women were human, and other states where they were less than human. Who can defend adding those totals together, cast under such different rules, calling it "the popular vote" and declaring that Wilson "won" the popular vote in 1916?

Today, the difference between California and other states is not so dramatic as disenfranchising an entire sex. But because Californians are told their vote for President doesn't matter, the turnout there was 56%, versus an average of 65% in the swing states. If California had matched that average, that would be 2.3 million more voters. Millions disenfranchised not because of their sex, but because the system says their vote doesn't matter. California's "popular vote" is a sham, and not too different a sham from that of men-only New York in 1916 or "Dear Leader of course" North Korea today. Oh sure, they have something they call the popular vote in North Korea, but the result is known in advance and nobody thinks their vote counts. (And yes, they know they could be punished if they put their ballot in the wrong box.)
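
The 2.3 million figure is simple arithmetic, and it can be sketched in a few lines. Only the 56% and 65% turnout rates come from the text here; the eligible-voter count of 25 million is a round assumption of mine for illustration, and the function name is made up.

```python
def extra_votes(eligible_voters: int, turnout_now: float, turnout_target: float) -> int:
    """Additional ballots cast if turnout rose from turnout_now to turnout_target."""
    return round(eligible_voters * (turnout_target - turnout_now))

# ASSUMPTION: roughly 25 million eligible voters in California.
# This is a round illustrative figure, not a number taken from the post.
CALIFORNIA_ELIGIBLE = 25_000_000

missing = extra_votes(CALIFORNIA_ELIGIBLE, turnout_now=0.56, turnout_target=0.65)
print(f"{missing:,} more votes at swing-state turnout")
```

Under that assumption the gap comes out a little over 2.2 million, the same ballpark as the figure above. A different eligible-voter estimate would shift it proportionally.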

You could not add the votes of Illinois and New York in 1916 and call it a true popular vote. You can't add the results of California's sham popular vote to Florida's real popular vote and call it a true popular vote. I mean, people do that, but they should not.

Can we figure it out?

All this said, you could attempt to measure what the vote would have been. We may not have enough data, but we could make some estimates. We know that Clinton led Trump by 3.5% in national polls before the election, but we also know that Trump outperformed those polls by 1.5-6% in many contested states. To really do this would require much more careful analysis than you see in this paragraph, which is written only to show one extreme of what's possible, and the difference is almost surely less than this from these two states. Full analysis would require looking at detailed voting and polling patterns and an understanding of what motivates people to stay home or vote differently in safe states vs. swing states, and an understanding of how Trump outperformed his polls so broadly in the contested states. In the other direction, since the 8-9 million missing voters in the safe states are in states that swing Democratic, there are arguments Clinton's total could have been even higher. However, even with that analysis we still would not really know.

My intuition is that such a result would show Clinton scoring higher than Trump, but not by 3M votes. And the margin of error would include results where Trump wins that popular vote, but this would be the outside condition. Certainly the only hard data on states that were actually contested has him win if extrapolated, but the Democratic party dominance in the big uncontested states is very strong. Also not factored in this is the effect of voter suppression techniques.

I should note to non-regular readers that I am anti-Trump. At the same time, having been shocked several times by underestimating his support, I write this because this underestimation must stop, and both sides need to come to much better understanding of how people voted for or against them, and why.

Split totals

A slightly better approach would be to publish vote totals divided between swing and safe states. Because situations differ so much in the safe states, this is still not super accurate, but it's a lot better. (I built this from an earlier download so numbers may not match final totals exactly.)

                Clinton       Trump    Johnson      Stein  McMullin   Others
Swing Total  25,946,624  26,423,193  1,783,571    434,433   203,500  351,415
Safe Total   40,582,344  37,227,033  2,770,706  1,031,304   435,055  468,484

It is interesting to note how much better Stein did in safe states: 130% better. Johnson did 50% better, Clinton 55% and Trump 38%.
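
For anyone who wants to recheck the split, here is a minimal sketch that recomputes the two Clinton-Trump margins and the safe-versus-swing gains from the table. The figures are copied from the table above (which came from an early download); the dictionary and function names are my own, and "gain" just means a candidate's safe-state total relative to their swing-state total.

```python
# Vote totals copied from the split-totals table (early download, not final).
swing = {"Clinton": 25_946_624, "Trump": 26_423_193,
         "Johnson": 1_783_571, "Stein": 434_433}
safe = {"Clinton": 40_582_344, "Trump": 37_227_033,
        "Johnson": 2_770_706, "Stein": 1_031_304}

def margin(totals: dict) -> int:
    """Clinton minus Trump; a negative value means Trump is ahead."""
    return totals["Clinton"] - totals["Trump"]

def safe_gain_percent(name: str) -> float:
    """Percent more votes a candidate received in safe states than in swing states."""
    return 100 * (safe[name] / swing[name] - 1)

print(f"Swing-state margin: {margin(swing):+,}")
print(f"Safe-state margin:  {margin(safe):+,}")
for name in swing:
    print(f"{name}: {safe_gain_percent(name):.0f}% more votes in safe states")
```

With these numbers the margins work out to about 477,000 for Trump in the swing states and about 3,355,000 for Clinton in the safe states, in line with the "about 500,000" and "3.4M" quoted earlier. The percentage gains land a few points away from the rounded ones in the text, which is not surprising given the early download.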

So what should the popular vote be?

One might argue that in an ideal democracy, the popular vote would represent the aggregate view of all voters. Some nations make voting mandatory in order to get this. Australia gets 95% turnout using this technique, but Malta, New Zealand and several other countries get turnout around 90% without legal compulsion.

It might even be argued that a truly ideal democracy would not only have everybody vote, but have everybody study the choices to make an informed vote. We don't get any of these ideals, and so in the USA it has come to be accepted that the popular vote is the vote totals from those who took the time to show up. The low turnout both enables voter suppression efforts and gives extreme value to successful "get out the vote" efforts, since it is far cheaper to convince a weak supporter to show up than to convince an undecided voter to swing your way.

Some election theorists have actually proposed that the best way to do elections would be to use a random sample, sometimes combined with strong incentives for members of this sample to vote, and possibly to also learn before voting. This seems strange to non-mathematicians but actually has strong validity. (In one variant, the selected electors are known weeks in advance and the campaigns and public interest groups focus their attention on "educating" them, in which case the number must be large so that truly personal targeting is not effective.) In a nation with 90% turnout these techniques make elections much cheaper but don't affect results much. In a country with 60% turnout which switches to 99% turnout from the randomly selected electors, the result becomes a much more accurate measure of voter will than the current system.
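
To give a feel for why statisticians take sampling seriously, here is a rough sketch of the sampling error for a randomly chosen body of electors. It uses the textbook normal approximation for a proportion. The sample sizes are illustrative values of my own choosing, not proposals from any of these theorists, and it assumes a truly random sample in which everyone selected actually votes.

```python
import math

def margin_of_error(sample_size: int, share: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error (as a fraction) for a vote share
    estimated from a simple random sample, using the normal approximation."""
    return z * math.sqrt(share * (1 - share) / sample_size)

# ASSUMPTION: illustrative sample sizes, not figures from the text.
for electors in (10_000, 100_000, 1_000_000):
    points = 100 * margin_of_error(electors)
    print(f"{electors:>9,} electors: +/- {points:.2f} percentage points")
```

Under these assumptions a sample of a million electors pins a two-way vote share down to about a tenth of a percentage point, so only very close races would be left in doubt. The sketch says nothing about the harder problems, such as keeping the random selection itself honest.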

It is also worth noting that the entire popular vote system for President is not in the US constitution, and so alternate systems, including sampling, actually are legally possible if states willed it, though politically unlikely. There are many advantages to sampling: close to 100% turnout, more informed voters, the possible reduction of massive campaign spending and fundraising, and the elimination of voter suppression. Its main disadvantages are that it doesn't match non-mathematicians' instincts about how an election should work, and that it adds the risk of corruption of the random selection.

In order to get a real popular vote, even one where we total the will of the 60% who show up, it is necessary to get rid of the college. The college could be nullified by a pact between California, Texas and two other large Republican safe states. If just those 4 states agreed to cast all their electors according to a popular vote result, it would be sufficient to make the college match that popular vote. Once it was known that this was the case, all voters would now know their vote counted, and all candidates would campaign in all states instead of just swing states, and we would have a true popular vote result.


Yes, adding together all the votes from all the states is a bad sampling technique, because not all of the popular votes are the same.

Federal law requires that each state certify its popular vote count to the federal government (section 6 of Title 3 of the United States Code).

Under the current system, the electoral votes from all 50 states are commingled and simply added together, irrespective of the fact that the electoral-vote outcome from each state was affected by differences in state policies, including voter registration, ex-felon voting, hours of voting, amount and nature of advance voting, and voter identification requirements.

Each state conducts a real popular vote, and certifies it. The national "popular vote" has no official status. To the best of my knowledge, it is simply added up by the press and published.

A lot of the confusion would be cleared up if the press instead reported two numbers, the "contested state vote total" and the "non-contested state vote total." However, it is sadly not quite that simple, both because you then need somebody to define what a contested state is, and because you have the marginal contested states, where one candidate campaigned and the other did not. (In a few cases, you get strange things like Clinton campaigning in California, though I presume that was to raise money, not votes.)

"Some election theorists have actually proposed that the best way to do elections would be to use a random sample, sometimes combined with strong incentives for members of this sample to vote, and possibly to also learn before voting."

You probably know the tongue-in-cheek Asimov story which takes this to its logical conclusion.

But this thread here has taught me once again how the public is averse to the statistician's understanding of measurement and data. As such, it seems unlikely you could ever get to a random sampling election. One reason, of course, is it would probably favour the Democrats, and as such, Republicans would block it. The US census has wanted to use statistical techniques, but the constitution says "enumerate" and so they are blocked from doing so -- because the more accurate methods would count more poor people, and favour Democrats.

The advantages presented for sampling are quite striking when you consider the flaws in the current system:

  • The selected electors would be much more likely to vote, especially if you added a legal requirement. If the electors were publicly listed, parties would of course do everything they legally could to get their supporters to vote as well. So close to 100% turnout. Could also give every elector the day off work or other incentives, or even just pay them all some large sum for doing their duty as electors.
  • The electors would probably take a special duty to get more informed. Read the platforms. Watch the debates. Receive special messages. This, however, would mean a pretty different result than the full electorate gives. It's a result that is very close to what you would have gotten if every voter paid close attention and was sure to vote.
  • Big public campaigns -- TV ads, giant rallies, mass mailings all become almost pointless if only 1 in 100 people is an elector. Super wasteful. Instead, free online paths to the electors would be available to candidates. That solves the problem of corruption and campaign finance.

Correct, we don't know what the national popular vote would look like and shouldn't compare what is currently being calculated to an electoral college win. Also, the electoral college vote itself is biased. Electors are being forced to vote in certain ways that meet various state laws. Colorado promulgated a rule on the day the electoral college met to allow it to replace an elector who voted for other than the only candidate on the ballot.

The plurality system of vote tabulation with its vote-for-one-only rule depresses the expression of support for candidates deemed not viable (all except the two major parties' candidates).

Approval Voting is the name of the simplest rule for voting that allows voters to express their intentions freely and designs the tabulation system to report those intentions. http://electology.org and http://av4co.org
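
For readers who have not met it, approval voting is easy to sketch: each voter marks every candidate they find acceptable, and the candidate with the most marks wins. The ballots and candidate labels below are invented purely for illustration, and this minimal version does not handle ties.

```python
from collections import Counter

def approval_tally(ballots: list[set[str]]) -> Counter:
    """Count one approval per candidate for every ballot that names them."""
    tally = Counter()
    for ballot in ballots:
        tally.update(ballot)  # a set, so a candidate counts at most once per ballot
    return tally

# HYPOTHETICAL ballots: each set holds the candidates that voter approves of.
ballots = [{"A", "C"}, {"B", "C"}, {"C"}, {"B"}, {"A"}]

tally = approval_tally(ballots)
winner, approvals = tally.most_common(1)[0]
print(f"{winner} wins with {approvals} approvals out of {len(ballots)} ballots")
```

In this made-up example candidate C wins with three approvals, even though two of those voters also approved a major candidate. That is the kind of support a vote-for-one ballot would never record.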

Our vote collection and reporting system is wonderfully and disturbingly non-uniform. It consists of eligibility determination followed by ballot interpretation and then aggregation and reporting. This system functions sensibly (accurate outcomes) only when victory margins are large, but then voter apathy affects turnout as Brad points out. When margins are narrow, accuracy is affected by noise in eligibility determination particularly in remote voting such as in mail ballot states and by ballot interpretation error caused by inevitably inadequate human/machine interface in all districts except perhaps those still hand counting voter-marked paper with adequate oversight.

The electoral college beneficially allows focusing our attention on a few narrow margin states, such as, in 2016, WI, MI, and PA. Absent this focus it is inconceivable that we would learn how defective our election law and practice is or that we could achieve an accurate and decisive outcome.

If we can reform in a few individual states to get effective audits of all three - eligibility, tabulation, and reporting, based on documentary evidence, then and only then are we ready to attempt a national determination of eligibility and a nationwide tabulation system.

At present we are probably better off focusing attention during close elections on smaller entities than states - such as congressional districts. Each CD could elect an elector to vote in the electoral college. We could do recounts in individual CDs where margins are narrow.

The national popular vote is the addition of the popular votes of the 50 states and DC.

Neither the current system nor the National Popular Vote compact permits any state to get involved in judging the election returns of other states. Federal law (the "safe harbor" provision in section 5 of title 3 of the United States Code) specifies that a state's "final determination" of its presidential election returns is "conclusive" (if done in a timely manner and in accordance with laws that existed prior to Election Day).

Remember again, that the national popular vote total has no official existence. (It would if there were a state compact as I discuss above.) At present it does not.

Perhaps I should sum it this way. Florida had a real popular vote to choose their electors. California did not. California's vote was more like the so-called popular vote elections held in countries like Iran or North Korea, where the result is predetermined, and voters don't care so much. It's not exactly like those -- in those places the voters are not free, and are scared of not voting as expected. But one thing is the same -- the voters don't think their vote can make a difference. In addition, the California ballot has many other real popular votes on it, which voters come to vote on, and then they might as well vote for President even if that one's a sham vote. Though oddly, voter turnout is much higher in Presidential years even in California.

Dividing more states’ electoral votes by congressional district winners would magnify the worst features of the Electoral College system.

If the district approach were used nationally, it would be less fair and less accurately reflect the will of the people than the current system. In 2004, Bush won 50.7% of the popular vote, but 59% of the districts. Although Bush lost the national popular vote in 2000, he won 55% of the country's congressional districts. In 2012, the Democratic candidate would have needed to win the national popular vote by more than 7 percentage points in order to win the barest majority of congressional districts. In 2014, Democrats would have needed to win the national popular vote by a margin of about nine percentage points in order to win a majority of districts.

Nationwide, there were only maybe 35 "battleground" districts that were expected to be competitive in the 2016 presidential election. With the present deplorable 48 state-level winner-take-all system, 38+ states (including California and Texas) are ignored in presidential elections; however, 92% of the nation's congressional districts would be ignored if a district-level winner-take-all system were used nationally.

The district approach would not provide incentive for presidential candidates to poll, visit, advertise, and organize in a particular state or focus the candidates' attention to issues of concern to the state.

Awarding electoral votes by congressional district could result in no candidate winning the needed majority of electoral votes. That would throw the process into Congress to decide the election, regardless of the popular vote in any district or state or throughout the country.

Because there are generally more close votes on district levels than in states as a whole, district elections increase the opportunity for error. The larger the voting base, the less opportunity there is for an especially close vote.

Also, a second-place candidate could still win the White House without winning the national popular vote.

The National Popular Vote bill is a way to make every person's vote equal and matter to their candidate because it guarantees that the candidate who gets the most votes in all 50 states and DC becomes President.

I've said it before and I'll say it again. All of this is just doctoring around on the symptoms and not addressing the cause. Any system which claims to be democratic but is not should be seen in the same way as the various "democratic" republics in the (former) communist-bloc countries. If a party has x per cent of the vote, it gets x per cent of the seats in the legislature. If one has people (as opposed to parties) who are elected in single-person elections like the US presidency, anyone elected has to be supported by a majority. Anything else is bullshit.

There are two different things we can discuss. One is what form a democracy might take, which I discuss in my New Democracy topic. Great to discuss.

Then there's discussion of what we actually have and what might reasonably be changed, which goes here in the Politics topic, generally.

We can dream about entirely different alternatives, and it's fun, and could happen for new institutions and nations, and for smaller ones. But for the USA there is only what can be done under its constitution and political system.

Unlikely, perhaps, but the constitution can be changed.

Yes, it can be changed. But to be changed, it sadly seems to need to be something that does not advantage one party over another in elections. They are that petty. It did not use to be so bad, but today it is.

If you propose something and it needs 2/3 of the chambers and 3/4 of the states, you can't get it if one party decides it will block them from office.

Now in the past, it's happened. Women got the vote, even though that changed the political landscape. But it took a lot of work, and the USA was late to the game. DC electors sway to the Democrats. Term limits don't obviously affect any party (and in fact are good for both, giving a chance for new blood.) Getting rid of poll taxes probably helps Democrats. But it was a long time ago now. I would like to hope it is still possible.

Can't say about Bush/Gore, but the common perception was that in that case Gore won the popular vote (of course again, we don't know) and so did Clinton, so that's two White House flips from Dem to GOP and they won't go for that.

What can happen is stuff at the state level, especially if it does not require too many states. A state could adopt preferential balloting on its own. (It is less certain they could adopt sampled voting.) 4 states could do a compact which would switch the USA to popular vote from the college. (And yes, it would then be a popular vote because people would know it was, and vote like it was.)
