How should we handicap the political polls?
As I have written before, the US Presidential election will be decided solely by the voters in a small number of states. Florida and Pennsylvania are the most important, and Arizona, Michigan, North Carolina and Wisconsin also play a role. More specifically, it will be decided there by the "weakly faithful" voters, the ones who don't vote reliably for their party and who sometimes don't vote at all. The party faithful, the "base," are already in the bag, so the weakly faithful are all that counts.
(Sadly, the primary process is designed to choose the nominee based on the views of party faithful voters, mostly in safe states, with particular emphasis on Iowa and New Hampshire, but this is hard to fix.)
As such, national polls are dangerous and misleading. A lot of Democrats regularly take solace in the fact that Clinton won the press-tabulated national vote in 2016, because they wish that were the real election. It is not, and that wishful thinking should not distract people into running national polls. Because you have to poll 6 times as many people to cover the swing states with a large enough sample in each, we see a lot of these misleading national polls instead.
There are only a few swing state polls, and we should see more. They usually poll either registered voters or likely voters, and they don't put enough focus on the core question of "If the race is X vs Y, will you vote at all?" because people have a hard time answering that question. We're like the physicist in the joke searching for his keys under the street-lamp even though he lost them 100 feet away.
But with the numbers we get, which can be seen by plugging in the key states at sites like FiveThirtyEight (Florida, for example), there is still a great deal of variation. The bad news, though, is what's not in these results. It's more than just the lack of data on these intermittent voters and whether they will show up on election day. We also should handicap based on a variety of other factors.
Note that many of these factors will, over time, be incorporated into the polling. The effect of propaganda, negative campaigns and dark money campaigns from the rich will affect voter viewpoints and how they respond to polls. As one gets closer to the election, those handicap factors must be adjusted. Some factors, like voter suppression, poll bias and October surprises may only be learned on or close to election day.
Normally, when we look at polling error, we primarily consider sampling bias and ordinary statistical error, which I will discuss briefly at the bottom. The factors listed below are different: they are things which will affect the result, but which the people polled are not yet aware of.
Russian interference (+3 Trump)
Russia, and possibly other foreign powers, will be manipulating the elections. The only question is how successful they will be and whether, as in 2016, they will favour Trump again. The evidence is strong that the 2016 manipulation was in favour of Trump. Some of this manipulation is already present in the polled results, but the bulk is yet to come, particularly after the Democratic nominee is selected. I give them decent odds of swinging about 3 points to Trump.
Domestic interference and voter suppression (+2 Trump)
Politicians seem far too ready to try any trick they can to win elections, but the Republicans seem much more willing these days to do both the legal but unethical and the downright illegal. Expect to see strong efforts at voter suppression, largely by the GOP, against left-leaning voting groups in the key states.
Swing from the actual heavily negative campaign (+2 Trump)
Expect a negative campaign. The problem is that Trump has barely begun his attacks on the Democratic final nominee. Those will begin in earnest once that nominee becomes more solid. Trump will use his combination of attacks, insulting nicknames, invented scandals, exaggerated scandals and outright lies in many cases to slam his opponent, of that we can be sure. His main focus will be on the weak right-wing voters, who don't much like Trump -- he wants to scare them into thinking they need to vote Trump anyway to avoid the greater horrors of his opponent. And he will come up with horrors, at least as far as those voters are concerned. He may also find things to scare weak Democrats away from voting, as he did with Clinton.
On the other hand, the Democrats have been in full attack mode on Trump for 3 years, and are doing so heavily in the campaign. They frankly don't have a lot of new negatives to spring, though they may get lucky -- Trump keeps providing them, but he is perhaps the most Teflon politician in history. As such, dislike of Trump is already encoded in these poll numbers and may not be able to move much.
Increased recognition for the Democratic nominee (+2 for newcomer Dems, +1 for Biden/Sanders)
The present poll numbers also reflect less awareness of the Democratic contenders. This awareness will be boosted by the major ad campaigns and the positive ads from the nominee. Awareness of Trump won't change. Biden and Sanders have the most existing awareness. Klobuchar, Warren, Bloomberg and Buttigieg have more ground to gain. The others, should they win, have even more to gain.
Opposition from rich enemies (-2 Dem, -4 Warren/Sanders)
The Democrats, particularly some of the candidates, have decided to make big election issues about the power of the 1%, billionaires, the abuse of big pharma and big medicine, and tech companies like Facebook. If they make these powerful enemies -- and the 1% are of course ridiculously powerful -- some of them will shift their election efforts away from the Democrats. Most will stay the same, but a shift of even a small portion of the 1% or other major corporate/money powers can be dramatic.
Trump has also made enemies, but he more rarely makes rich ones.
Larger pool of weak supporters (+2 Dem)
The Democrats have a much larger pool of voters to win. Far more non-voters are Democrats than Republicans. As such, if the parties do equally well at GOTV, it's a gain for the Democrats. If they can figure out a way to truly get out those weak voters, something they have never managed before, this could be a much larger factor.
Ground game (Even)
Each side will play their ground game to get out the vote. It is not immediately clear which side will do better.
Polling error (+1 Trump)
It has been suggested that polls under-report support for Trump, or at least did so in 2016. Trump's victory was within the margin of error of the polls in 2016, but where the results deviated from the polls, they swung Trump's way far more often than Clinton's, suggesting some possible systemic underreporting. It was speculated this was because people didn't want to admit, even to a pollster, that they supported Trump. It is unclear if this will happen this time, but I rate it as having a potential mild effect.
October surprises (Even)
We don't know what the surprises will be, but once you see them, you may adjust your handicapping.
Green factor (?)
The Green party may decide to run only in safe states, though they strongly resist the idea. This is hard to predict. Until this is decided, and even after, Green voters will tell pollsters they support the Green nominee.
Economic downturn (Unknown: -4 Trump, or 0 if it doesn't happen)
It's very hard to predict, but if there's a major economic downturn, this will hurt Trump. But we really have little idea if it will happen before the election. Current polls, however, are based on belief in continued good economic conditions.
Total: +4-6 Trump
My estimates above for the handicaps are largely just guesses. You may have your own estimates of how much to handicap based on each factor.
If you concur with the above, it suggests that unless you see the Democrat 4-5 points above Trump in the polls, they are actually in trouble. Dems should only be comfortable with leads of 6 points or more, which they don't generally have at present.
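For readers who want to plug in their own estimates, here is a minimal Python sketch of the arithmetic. The numbers are copied from the headings above. The sign convention (positive favours Trump), the function names, the grouping of candidates and the choice to leave out the unquantified Green factor are all assumptions made for illustration. A straight sum is only one way to combine the factors, and it need not reproduce the rounded total given above, which is a judgment call rather than a strict sum.

```python
# Handicaps in points, taken from the headings above.
# Assumed sign convention: positive favours Trump, negative favours the Democrat.
# The Green factor is left out because no number is given for it.
COMMON = {
    "Russian interference": 3,
    "Domestic interference and voter suppression": 2,
    "Negative campaign": 2,
    "Larger pool of weak supporters": -2,
    "Ground game": 0,
    "Polling error": 1,
    "October surprises": 0,
}

# Factors whose size depends on the nominee (illustrative grouping).
RECOGNITION = {"newcomer": -2, "Warren": -2, "Biden": -1, "Sanders": -1}
RICH_ENEMIES = {"newcomer": 2, "Biden": 2, "Warren": 4, "Sanders": 4}

# Applied only if a major economic downturn actually happens.
DOWNTURN = -4


def net_handicap(nominee, downturn=False):
    """Straight sum of all handicaps for a nominee; positive favours Trump."""
    total = sum(COMMON.values()) + RECOGNITION[nominee] + RICH_ENEMIES[nominee]
    if downturn:
        total += DOWNTURN
    return total


def adjusted_lead(polled_dem_lead, nominee, downturn=False):
    """Democratic poll lead left over after subtracting the handicap."""
    return polled_dem_lead - net_handicap(nominee, downturn)


if __name__ == "__main__":
    for name in ("newcomer", "Biden", "Warren", "Sanders"):
        print(name, net_handicap(name), net_handicap(name, downturn=True))
```

Calling `adjusted_lead(5, "Biden")` applies the handicap to a 5-point poll lead. Change any entry to match your own guesses and the totals follow.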
What's already in the numbers
These factors above are things that I believe are not yet incorporated in poll numbers. Other things are already there, including:
- Current economic prosperity
- Trump's flaws and errors
- Small amounts of Russian interference
- Recognition among highly engaged voters of the lesser known Democrats
- The Burisma scandal, Biden's gaffes, Warren's claim of Native American ancestry, Buttigieg's sexual orientation and other well publicised issues which may lose some voters
- Sanders calling himself a socialist
Other polling error
Normally, when people evaluate polls, they don't consider factors like the ones above. Polls are also often wrong because of both bad luck and bad polling methodology.
The bad methodology is what we call sampling bias. All polls try to estimate what a large population thinks by talking to what they hope is a representative sample. Pollsters work hard to control for these biases, but they often fail. You hope that just picking people at random will remove many of the biases but that's a vain hope.
One huge problem is that a lot of polls are done by telephone. According to Pew Research, only 6% of people are willing to participate in phone polls. You read that right -- they try roughly 16 numbers for every person who agrees to do a phone poll! I never do phone polls, so the views of those like me are never incorporated in the polls. The problem is that the propensity to answer the phone and do polls might be related to the thing you're measuring. It can also be affected by things like when you call and which phone number the pollster has for the person. The ratios for phone polling are now so poor that one wonders if it isn't an obsolete method, and indeed many pollsters are using internet methods as well. Many polls are also fairly long -- they want to ask you a bunch of questions, in part to figure out if they are sampling correctly -- and that scares away people who are busy.
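To see why a low response rate matters, here is a small Python sketch of what happens when willingness to answer differs slightly between two camps. The function names and the example rates are invented for illustration and are not measured figures. The only number taken from the text is the 6% response rate.

```python
def observed_share(true_share, rate_a, rate_b):
    """Share of respondents backing candidate A when A's supporters answer
    at rate_a and everyone else answers at rate_b."""
    answered_a = true_share * rate_a
    answered_b = (1 - true_share) * rate_b
    return answered_a / (answered_a + answered_b)


def calls_per_interview(response_rate):
    """Average number of people tried per completed interview."""
    return 1 / response_rate


if __name__ == "__main__":
    # Hypothetical rates: a truly tied race where one side answers at 6%
    # and the other at 5%.
    share = observed_share(0.50, 0.06, 0.05)
    print(f"observed share for A: {share:.1%}")
    print(f"calls per interview at 6%: {calls_per_interview(0.06):.1f}")
```

In this made-up case a one-point gap in response rates turns a tied race into an apparent lead of several points, and no amount of extra sampling removes it.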
After all this is done, you get the factor pollsters will be open about -- bad luck. Even if you sample without bias, there is still a random chance your sample will not match the general population. The larger your sample, the less chance of this. The math of this is well studied, and professional pollsters will all publish it. The mistake is that they often publish only this, and tell you, "This poll should be accurate within 2% 99% of the time" when they really mean it should be that accurate if they sampled without bias. Which they didn't. Usually those biases will dwarf the statistical error.
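The "bad luck" part can be computed with the standard margin-of-error formula for a simple random sample, z * sqrt(p(1-p)/n). The Python sketch below assumes an unbiased sample and the worst case p = 0.5. The function names are illustrative.

```python
import math
from statistics import NormalDist


def margin_of_error(n, confidence=0.95, p=0.5):
    """Half-width of the confidence interval for a proportion, assuming a
    simple random sample with no sampling bias."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / n)


def sample_size_for(margin, confidence=0.95, p=0.5):
    """Smallest sample size whose margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z * z * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    print(f"n=1000 at 95%: +/- {margin_of_error(1000):.1%}")
    print(f"needed for 2% at 99%: {sample_size_for(0.02, 0.99)}")
```

The required sample depends on the precision you want, not on the size of the population, which is why covering several swing states separately takes several times the interviews of a single national poll. None of this accounts for sampling bias, which the formula simply assumes away.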
All of this makes measurement extremely hard, and the results often misleading.