Polls are meaningless this time of year

Many years ago when I was working with the Jimmy Carter campaign, a veteran Democratic campaign consultant told me to never pay attention to the polls until after Labor Day. That has remained sound advice over the years.

Polls are easily manipulated to produce the desired result by the wording of the question, the demographics of the polling sample being contacted, and even the method of polling.

The media knows that people rarely read beyond the headline with the top-line poll results, and only political nerds want to get down into the “weeds” of the poll, the detailed data that pollsters do not always make available.

The media loves to waste its time and energy in endless speculation about polls as part of its “horse race” coverage, rather than doing actual reporting about the character of the candidates and the nuts and bolts of their policies.

Some poll sponsors are more nefarious, particularly campaigns, which produce what I refer to as “narrative polls”: polls designed to produce a desired result that conforms to a predetermined media narrative (“we’re winning!”). This also plays a role in the “horse race” coverage of the media, since the media will report any poll without regard for the pollster’s reputation and track record, or alignment with a particular candidate.

Polls taken around this time of year have almost no predictive value for the outcome of an election in November. Princeton elections guru Sam Wang recently wrote, February national polls are the best you get until August:

Some media types are going around with their hair on fire over two unfavorable polls for Hillary Clinton in which she lags Donald Trump. In response in the NYT, Norm Ornstein and Alan Abramowitz are trying to convince you that these polls mean nothing. Stop the Polling Insanity. Nothing, I tell you! Don’t Panic!!!

In a deep sense, they’re right. As I wrote the other day, opinion can move a lot between now and Election Day. And it is inappropriate to trumpet a single poll showing an exceptional result, which is what the news channels do.

However, do not throw out the baby with the bathwater. In fact, we can learn quite a lot from polls by extracting as much value as possible from them. This can be tricky because right around now, national polls are the least informative they are going to be in 2016. To put it another way, polls will be more informative one month from now – and they were also more informative a month ago. How can this be, and what do we really know about the Clinton/Trump November win probability?

Elections scholar Christopher Wlezien very kindly sent me the data that he and Robert Erikson used to construct the graphs in The Timeline Of Presidential Elections: 1952-2008. Adding in 2012 data, I took time series from 16 Presidential campaigns and calculated the standard deviation of the total movement as a function of time. This is a measure of uncertainty about November based on polls for a given day. This graph shows the ±1 standard deviation interval in red:

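[Graph: standard deviation of poll movement as a function of days before the election; ±1 standard deviation interval in red, individual campaigns in gray]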

(Note that in my previous post I plotted the standard deviation in the Democratic vote share. However, the appropriate standard deviation to use is the standard deviation of the Democratic-Republican margin, which is twice as large. This is why I had to revise the win probability. PEC regrets the error.)
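The calculation Wang describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not his actual code: the three campaigns and their numbers are made up, "movement" is taken to mean the final margin minus the polled margin on a given day, and the sample standard deviation is used, which the source does not specify.

```python
import statistics

def movement_sd_by_day(campaigns):
    """Standard deviation, across campaigns, of (final margin - polled margin).

    campaigns: list of dicts with a 'final' margin and a 'polls' mapping of
    days-before-election -> polled margin (all in percentage points).
    Only days present in every campaign are used.
    """
    shared_days = set.intersection(*(set(c["polls"]) for c in campaigns))
    result = {}
    for day in sorted(shared_days, reverse=True):
        movements = [c["final"] - c["polls"][day] for c in campaigns]
        result[day] = statistics.stdev(movements)  # sample SD: an assumption
    return result

# Hypothetical data, NOT the Wlezien/Erikson series.
campaigns = [
    {"final": 4.0, "polls": {300: -6.0, 100: 2.0}},
    {"final": -2.0, "polls": {300: 8.0, 100: -1.0}},
    {"final": 7.0, "polls": {300: 7.0, 100: 5.0}},
]
sd = movement_sd_by_day(campaigns)
print(sd)
```

A larger value for a given day means polls taken that far out have historically said less about the November result.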

This year, January 1st was 312 days before the election. At earlier dates, the standard deviation is between 14 and 22 percentage points. You can see the variation across 16 Presidential campaigns in the gray traces. So polls before the new year really are quite uninformative.

Now look at later dates: the gray curves converge. Consequently, the standard deviation declines, and reaches a local minimum at 270 days before the election, in mid-February, close to the start of primary season. So before the primaries start, February is a time when national polls tell us a fair amount about the final outcome.

But wait! After that, the standard deviation creeps upward. The election is 169 days from now, and in about a week the standard deviation hits its maximum value for 2016. Truly, now is the single worst time to be paying attention to fresh polling data. I don’t know why this is. It could be because typically, one or both parties are still going through an active nomination contest – as Hillary Clinton and Bernie Sanders are doing now.

Amusingly, national polls won’t reach their February levels of accuracy until August. [Which will be after both parties hold their conventions in late July this year.] The Clinton-Trump margin in February was Clinton +5.0%. So how about if we just use that until after the conventions? Can you wait?

No? Okay, let’s do something else. There are currently 88 national polls for 2016. We can weight these to create the best possible estimate for the November Clinton-Trump margin. For the weight, use 1/sigma for the corresponding date on the graph above. For independent observations, this weighted sum is optimal. Applied to past elections, it favors the November winner in 14 out of 16 elections (missing Reagan in 1980 and Bush in 2000), an accuracy rate of 87.5%. This year, it gives us a weighted-average margin of Clinton +6.5%.
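A minimal sketch of the weighting described above, assuming each poll is a pair of (days before the election, Clinton-minus-Trump margin) and that sigma can be looked up by day. The two polls and the sigma values here are hypothetical placeholders, not the 88 polls or the curve from the graph.

```python
def weighted_margin(polls, sigma_by_day):
    """Average the poll margins, weighting each by 1/sigma for its date.

    polls: list of (days_before_election, margin) pairs.
    sigma_by_day: mapping of days_before_election -> historical sigma.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for days_before, margin in polls:
        weight = 1.0 / sigma_by_day[days_before]  # 1/sigma, as in the text
        weighted_sum += weight * margin
        total_weight += weight
    return weighted_sum / total_weight

# Hypothetical inputs for illustration only.
polls = [(270, 4.0), (170, 8.0)]
sigma_by_day = {270: 10.0, 170: 20.0}
estimate = weighted_margin(polls, sigma_by_day)
print(round(estimate, 2))  # 5.33
```

The poll from the noisier date counts for half as much, so the estimate sits closer to the earlier poll. Swapping in inverse-variance weights (1/sigma squared), the more common textbook choice, would only change the `weight` line.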

In short, we have a situation in which today’s snapshot (Clinton +2.7%) shows a close race with a definitive Clinton lead (93% probability according to HuffPollster), and the November outlook shows a larger average expected lead (Clinton +6.5%), but a lower win probability* of 70% – the same as what I wrote the other day.

Here are some caveats and consequences that come to mind:

1) My analysis today implies that the current movement in polls is transient. If uncertainty is larger now, this suggests that there is some natural set point for the Clinton-Trump contest – one where we had a clearer picture a few months ago than we would by watching today’s news.

My general sense of the current state of the race is that Democrats are still in the midst of their nomination process, while Republicans are coming together around their nominee. Either of these dynamics would be enough for polls to become less accurate – and to favor the candidate whose nomination is settled. If true, then we might expect numbers to move back toward Clinton after the June 7th primaries. Also possible, though less likely, is continued movement toward Trump.

2) It seems to me that during periods of increasing uncertainty, it is best to incorporate older polls, on the grounds that these data points add information and decrease uncertainty. Conversely, starting at 160 days before the election (early June), I should switch to a rolling time window, since at this point polls are becoming increasingly predictive.

3) Now is a time to pay attention to non-poll-based methods. As longtime readers know, I am generally against mixing up polls and “fundamentals”-based models. But it is a good time to consider the possibility of looking at them.

However, there are surprisingly few models worth looking at. Models are subject to conceptual and technical errors. And very few fundamentals-based models have well-understood error properties. In an exception, Lauderdale and Linzer did a particularly good job in 2012. At that time, they estimated that national vote share in their model had a 95% confidence interval of +/-7% at the national level. In the units I plotted in the red curve above (+/- 1 sigma in two-candidate margin), this probably corresponds to about +/-7%. If true, that approach would be better than polls from now through August. However, to my knowledge, Linzer (who now does analysis at Daily Kos Elections) has not come out with a public calculation this year. And so I wait.

*To calculate a probability, note that the weighted-average value of sigma during the time period of January 1 to now is 11.1%. The probability is calculated in MATLAB as prob=tcdf(clinton_trump_margin/11.1,3). In Excel: =1-TDIST(clinton_trump_margin/11.1,3,1).
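For readers without MATLAB or Excel, the same footnote calculation can be sketched in plain Python. The function names are my own, and the closed form below applies only to a Student's t distribution with exactly 3 degrees of freedom, which is what the formulas above use.

```python
import math

def t_cdf_3df(t):
    """CDF of Student's t distribution with 3 degrees of freedom (closed form)."""
    u = t / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi

def win_probability(margin, sigma=11.1):
    """Probability that the leader wins, given a margin and sigma in points."""
    return t_cdf_3df(margin / sigma)

p = win_probability(6.5)
print(round(p, 2))  # 0.7
```

With the weighted-average margin of Clinton +6.5% and sigma of 11.1%, this reproduces the 70% figure quoted above.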

OK, that’s the science of polling. Now what do the Beltway media villager “analysts” have to say? Greg Sargent of the Washington Post reports today, Clinton’s lead over Trump may be bigger than you think:

But what if Hillary Clinton’s national advantage over Trump is actually larger than it appears? And, more to the point, what if the reason for this is a thoroughly conventional one?

NBC’s Chuck Todd and Dante Chinni have served up a useful analysis of the current national polls that suggests this is a very real possibility. They looked at three recent polls that currently show the race very close: The NBC News poll showing Clinton up 46-43 among registered voters; the New York Times/CBS poll showing her up 47-41; and the Fox News poll putting Trump up 45-42.

But then Todd and Chinni took into account the fact that a sizable chunk of people supporting Sanders are now saying they cannot back Clinton. These are the “Sanders-only voters.” They took the additional step of assuming that Clinton wins back 70 percent of those voters. Here’s what happens to the national numbers:

In the NBC/WSJ poll, Clinton’s advantage over Trump goes from three points to eight points and she leads 51 percent to 43 percent….

In the latest CBS News/New York Times poll, Clinton’s advantage grows from six points to nine points with 70 percent of Sanders-only voters — she leads 50 percent to 41 percent. In the latest Fox News poll, where Trump currently leads Clinton, the Sanders-only voters make it a tied race — 45 percent to 45 percent.
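The adjustment behind those numbers is simple arithmetic, sketched here under a loud assumption: the excerpt does not say how large the “Sanders-only” group is in any of the polls, so the 7-point share below is a hypothetical value chosen only because it roughly reproduces the quoted 51 percent. The function name is also my own.

```python
def reallocate_sanders_only(clinton, trump, sanders_only, winback=0.70):
    """Add a fraction of 'Sanders-only' voters to Clinton's share.

    All shares are in percentage points; Trump's share is left unchanged.
    Returns (adjusted Clinton share, Trump share, adjusted margin).
    """
    adjusted_clinton = clinton + winback * sanders_only
    return adjusted_clinton, trump, adjusted_clinton - trump

# NBC poll toplines from the text; the 7.0 share is an assumption.
clinton_adj, trump_share, margin = reallocate_sanders_only(46.0, 43.0, 7.0)
print(round(clinton_adj), round(trump_share), round(margin))  # 51 43 8
```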

Now, in my view, we shouldn’t place too much stock in national polling at this point, because it historically has not been predictive. But if we are going to obsess over it, let’s keep this in mind: In two of these polls, once you allow for the possibility that Clinton could win over many of Sanders’s supporters once he concedes and endorses her, Clinton holds sizable national leads, of eight and nine points. Nate Cohn has similarly concluded that, if Clinton can consolidate Sanders supporters behind her, she could gain a “considerable advantage” against Trump.

And we’ve seen this before: As Todd notes in his video presentation of these numbers, in 2008, Barack Obama picked up three points against John McCain in NBC polling after Clinton surrendered in the primaries.

If this is right, the point is that the tightening in the polls between Clinton and Trump — which is real — may reflect a particular moment in this race that may prove fleeting, in ways we’ve seen in the past. To be sure, Democrats should not underestimate Trump or imagine that defeating him will be easy. They should work to determine the true source of his appeal, i.e., his suggestion that our political and economic system is failing people and he’d snap it over his knee and get it working again. They should work on making an affirmative case for Clinton that addresses this voter dissatisfaction in addition to relying on the low hanging fruit of attacking his business past and highlighting his wretched comments. Nor does any of this mean that Clinton’s high negatives aren’t a real problem. Democrats should obviously be prepared for any manner of attack that Trump will throw at her, and they’ll need to figure out how to create a more positive narrative around her.

Rather, the point is that we should stop over-inflating impressions of Trump’s strength. We should stop ascribing magical political powers to Trump based on the questionable notion that his “unconventional” and “unpredictable” campaign makes him a more formidable foe than anyone expected. Trump will be difficult to beat, but that might be mainly because these elections are always hard. It is perfectly plausible that the “old rules” will end up applying to some degree. For instance, Clinton may be able to beat Trump, at least in part, by offering up more convincing policies and revealing his to be the nonsense that they are. Maybe assuming that Trump has rendered policy debates meaningless actually gives him too much credit. Maybe we shouldn’t accept Trump’s boasts of super-human appeal in the Rust Belt at face value: they may well run headlong into demographic realities. Meanwhile, we should keep focused on what the aggregate data is actually telling us.

One other point: The Todd/Chinni analysis could have important implications for the endgame of the Dem primaries. Once the voting is over in June, Sanders will have nothing left to do but win actual concessions in exchange for working to swing his supporters behind Clinton. You could see a real shift in how this race is covered, with more and more analysts — and high profile party leaders, such as Elizabeth Warren, and, yes, Barack Obama — pointing out that the failure to unite Democrats is making the prospect of a Trump presidency more likely. That could make it harder for Sanders to hold out. We don’t know if Sanders’s supporters will get behind Clinton in the numbers she needs, and she will have to do her part to make that happen. But despite all the tensions, Sanders, too, will probably end up doing all he can to ensure that it does.

UPDATE: Charlie Cook at the National Journal weighs in. The Trump-Clinton Race Is Not As Close As It Looks: “We are beginning to focus on a November electorate that is broader, more diverse, and considerably more moderate, in both ideology and temperament, than the one that selected Donald Trump. Chances are high that these voters will behave much differently than the ones in the GOP primaries.”

One response to “Polls are meaningless this time of year”

  1. captain*arizona

    only when the candidates are not well known. only gary johnson and jill stein and what ever goofball the noo-con artists they con into running. too bad it won’t be bill kristol. his press conference would be interesting!