Okay. So earlier this week, Bernie Sanders won the Indiana primary.

Shouldn’t have been a big surprise, really. Indiana was an open primary, where independents are allowed to vote; independents tend to favor Sanders over Clinton, so Bernie’s generally been doing better in open-primary states. Clinton tends to run stronger among black voters, but Indiana doesn’t have a huge minority population, so that favored Bernie too.

So – not a surprise, right?

Except it was a surprise. Going into the election, all the polls said Hillary was going to win. A Marist College survey had Clinton up 50-46. An American Research Group survey had her up 51-43. YouGov had her up 49-44. Fox News had her up 46-42. Not one survey had Bernie ahead. The website FiveThirtyEight.com, which makes predictions based on all the polls put together, predicted Hillary would win by about a 52-45 margin.

But then Bernie won. Surprise!

Later, I got an email with an interesting question.

“Hi Aaron. I would be interested to hear…why the national pollsters keep forecasting Hillary to win, when she seems to often fall short. I know they talk about margin-of-error, but it seems to always go to Bernie.”

Good question, right?

After all, it has kinda seemed like that. All the national polls have shown Hillary in the lead from the beginning, but Bernie’s the one who’s been drawing the big crowds, he’s the one who’s got people talking on social media…and he keeps winning primaries! In Indiana, the polls had Hillary up 7 points, and Bernie won by 5. In Michigan, the polls had Hillary up 21 points, and Bernie won by 2.

And yes, polling isn’t perfect, but in this case it seems to be a one-way street. Bernie’s won states that the pollsters called for Hillary – but Hillary hasn’t won a single state that the pollsters called for Bernie.

What gives?

I asked Tom Jensen, the director of Public Policy Polling, one of the most respected polling outlets in the country.

“It’s sometimes just harder (for pollsters) to pick up those independent voters who are planning to vote in the Democratic primary,” he said. That may have been one reason for the incorrect polls in Indiana – compounded by the fact that Indiana has uniquely restrictive laws that make it nearly impossible for pollsters (including PPP) to conduct surveys there.

Okay. That makes sense.

But then Jensen added something else.

He said the pollsters have basically been getting it right.

Michigan and Indiana were outliers, he said, but the polls in every other primary have been fairly accurate. And while individual surveys may be off, they’re not all off in one particular direction. The polls underestimated Bernie in Michigan and Indiana’s open primaries, he said, but they also “slightly underestimated Hillary Clinton in states like Maryland and New York, in closed primaries where only Democrats could vote.”

Now, I trust Tom Jensen. Back in mid-March when Hillary was surging, he told me Bernie was about to have a run of victories. In early April, with Bernie on a roll, he told me Hillary was going to dominate for the rest of the month. He was right both times.

But is he right about the polls being accurate?

I searched the Internet for a page that compared survey data with actual results in every state. Surprisingly, I came up empty.

Oh well. If it doesn’t exist, do it yourself!

In the chart below, you’ll see the polling numbers for Clinton and Sanders (and the predicted margin of victory), followed by their actual vote percentage in the primary itself (and the actual margin of victory). The column on the far right measures polling error, the difference between the predicted margin and the actual margin: who got overestimated, and by how much.

| State | Clinton poll | Sanders poll | Predicted margin | Clinton actual | Sanders actual | Actual margin | Poll error (who was overestimated) |
|---|---|---|---|---|---|---|---|
| AL | 71.4 | 25.7 | Clinton +45.7 | 77.8 | 19.2 | Clinton +58.6 | Sanders +12.9 |
| AZ* | 51.1 | 22.7 | Clinton +28.4 | 57.6 | 39.9 | Clinton +17.7 | Clinton +10.7 |
| AR | 60.5 | 36 | Clinton +24.5 | 66.3 | 29.7 | Clinton +36.6 | Sanders +12.1 |
| CT | 50.9 | 46.8 | Clinton +4.1 | 51.8 | 46.4 | Clinton +5.4 | Sanders +1.3 |
| FL | 63.2 | 33.8 | Clinton +29.4 | 64.4 | 33.3 | Clinton +31.1 | Sanders +1.7 |
| GA | 66.3 | 30.5 | Clinton +35.8 | 71.3 | 28.2 | Clinton +43.1 | Sanders +7.3 |
| IL | 51.6 | 44.3 | Clinton +7.3 | 50.5 | 48.7 | Clinton +1.8 | Clinton +5.5 |
| IN | 52.3 | 45.2 | Clinton +7.1 | 47.5 | 52.5 | Sanders +5.0 | Clinton +12.1 |
| IA | 49.1 | 44.7 | Clinton +4.4 | 49.9 | 49.6 | Clinton +0.3 | Clinton +4.1 |
| LA | 72.6 | 20.2 | Clinton +52.4 | 71.1 | 23.2 | Clinton +47.9 | Clinton +4.5 |
| MD | 56.4 | 40.9 | Clinton +15.5 | 63 | 33.2 | Clinton +29.8 | Sanders +14.3 |
| MA | 52.4 | 44.8 | Clinton +7.6 | 50.1 | 48.7 | Clinton +1.4 | Clinton +6.2 |
| MI | 59.2 | 38.3 | Clinton +20.9 | 48.3 | 49.8 | Sanders +1.5 | Clinton +22.6 |
| MS | 77 | 16.7 | Clinton +60.3 | 82.6 | 16.5 | Clinton +66.1 | Sanders +5.8 |
| MO | 48.8 | 48.1 | Clinton +0.7 | 49.6 | 49.4 | Clinton +0.2 | Clinton +0.5 |
| NC | 59.6 | 37.6 | Clinton +22.2 | 54.6 | 40.8 | Clinton +13.8 | Clinton +8.4 |
| NH | 41.5 | 55.6 | Sanders +14.1 | 38 | 60.4 | Sanders +22.4 | Clinton +8.3 |
| NY | 53.5 | 42 | Clinton +13.5 | 58 | 42 | Clinton +16 | Sanders +2.5 |
| NV | 51.2 | 47.2 | Clinton +4.0 | 52.6 | 47.3 | Clinton +5.3 | Sanders +1.3 |
| OH | 53.9 | 43.3 | Clinton +10.6 | 56.5 | 42.7 | Clinton +13.8 | Sanders +3.2 |
| OK | 47.2 | 47.5 | Sanders +0.3 | 41.5 | 51.9 | Sanders +10.4 | Clinton +10.1 |
| PA | 57.1 | 40.4 | Clinton +16.7 | 55.6 | 43.6 | Clinton +12 | Clinton +4.7 |
| RI | 48.1 | 49.2 | Sanders +1.1 | 43.6 | 54.6 | Sanders +11 | Clinton +9.9 |
| SC | 64.5 | 31.3 | Clinton +33.2 | 73.5 | 26 | Clinton +47.5 | Sanders +14.3 |
| TN | 60.5 | 36.1 | Clinton +24.4 | 66.1 | 32.4 | Clinton +33.7 | Sanders +9.3 |
| TX | 63.3 | 33.7 | Clinton +29.6 | 65.2 | 33.2 | Clinton +32 | Sanders +2.4 |
| UT* | 43.8 | 51.1 | Sanders +7.3 | 20.3 | 79.3 | Sanders +59 | Clinton +51.7 |
| VT | 10.2 | 87.4 | Sanders +77.2 | 13.6 | 86.1 | Sanders +72.5 | Sanders +4.7 |
| VA | 60.2 | 36.7 | Clinton +23.5 | 64.3 | 35.2 | Clinton +29.1 | Sanders +5.6 |
| WI | 47.4 | 50.1 | Sanders +2.7 | 43.1 | 56.6 | Sanders +13.5 | Clinton +10.8 |

(For polling numbers, I used FiveThirtyEight’s “polls-only” projections for each state. In the two starred states – Arizona and Utah – FiveThirtyEight didn’t have enough surveys for a projection, so I used their weighted polling average instead. To make this easier, I’m only including the 50 states plus Washington, DC. Sorry, Guam.)
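The polling-error column is just subtraction, but the sign convention is easy to trip over, so here is a minimal Python sketch of it. The function name `poll_error` and the Clinton-minus-Sanders convention are illustrative assumptions for this post, not anything FiveThirtyEight publishes.

```python
def poll_error(clinton_poll, sanders_poll, clinton_actual, sanders_actual):
    """Return (overestimated_candidate, points) for one state.

    Margins are Clinton minus Sanders, so a positive margin means Clinton led.
    """
    predicted_margin = clinton_poll - sanders_poll
    actual_margin = clinton_actual - sanders_actual
    # Positive error: the polls were too kind to Clinton; negative: to Sanders.
    error = round(predicted_margin - actual_margin, 1)
    if error > 0:
        return ("Clinton", error)
    if error < 0:
        return ("Sanders", -error)
    return ("nobody", 0.0)


# Indiana: polls said Clinton +7.1, the vote was Sanders +5.0.
print(poll_error(52.3, 45.2, 47.5, 52.5))  # ('Clinton', 12.1)
# Vermont: polls said Sanders +77.2, the vote was Sanders +72.5.
print(poll_error(10.2, 87.4, 13.6, 86.1))  # ('Sanders', 4.7)
```

Note that a candidate can win a state and still be "overestimated": Sanders won Vermont, just by less than the polls predicted.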

So how have the pollsters done?

Turns out Tom Jensen was right. Exactly right, in fact. In the 30 states on that chart, the polls have erred in Hillary’s favor 15 times, and Bernie’s favor 15 times. A perfectly even split.

Also worth noting: the pollsters don’t always predict the right margin of victory, but they have picked the correct winner in 28 out of 30 states. Michigan and Indiana were the only outliers.
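Those tallies are easy to re-check. The sketch below is one way to do it in Python, assuming the chart's predicted and actual margins are copied in as signed numbers (positive for a Clinton lead, negative for a Sanders lead). The variable names are illustrative.

```python
from statistics import median

# (state, predicted margin, actual margin), both Clinton minus Sanders,
# copied from the chart above. Negative numbers are Sanders leads.
MARGINS = [
    ("AL", 45.7, 58.6), ("AZ", 28.4, 17.7), ("AR", 24.5, 36.6),
    ("CT", 4.1, 5.4), ("FL", 29.4, 31.1), ("GA", 35.8, 43.1),
    ("IL", 7.3, 1.8), ("IN", 7.1, -5.0), ("IA", 4.4, 0.3),
    ("LA", 52.4, 47.9), ("MD", 15.5, 29.8), ("MA", 7.6, 1.4),
    ("MI", 20.9, -1.5), ("MS", 60.3, 66.1), ("MO", 0.7, 0.2),
    ("NC", 22.2, 13.8), ("NH", -14.1, -22.4), ("NY", 13.5, 16.0),
    ("NV", 4.0, 5.3), ("OH", 10.6, 13.8), ("OK", -0.3, -10.4),
    ("PA", 16.7, 12.0), ("RI", -1.1, -11.0), ("SC", 33.2, 47.5),
    ("TN", 24.4, 33.7), ("TX", 29.6, 32.0), ("UT", -7.3, -59.0),
    ("VT", -77.2, -72.5), ("VA", 23.5, 29.1), ("WI", -2.7, -13.5),
]

# Positive error: Clinton was overestimated. Negative: Sanders was.
errors = [predicted - actual for _, predicted, actual in MARGINS]

clinton_overestimated = sum(1 for e in errors if e > 0)
sanders_overestimated = sum(1 for e in errors if e < 0)

# The polls "called" a state when both margins point to the same candidate.
correct_calls = sum(1 for _, p, a in MARGINS if (p > 0) == (a > 0))

# Size of the typical miss, ignoring which candidate it favored.
typical_miss = median([abs(e) for e in errors])

print(clinton_overestimated, sanders_overestimated)  # 15 15
print(correct_calls, "of", len(MARGINS))             # 28 of 30
print(round(typical_miss, 2))
```

The median is used for the typical miss because a single wild result like Utah would drag a plain average upward.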

How about accuracy? The FiveThirtyEight projections have typically missed the margin by about 7 points (the median error on the chart is 6.75), though individual states range widely. The numbers were most accurate in Missouri, off by only half a point. Where were they most inaccurate?

UT  Clinton +51.7
MI  Clinton +22.6
MD  Sanders +14.3
SC  Sanders +14.3
AL  Sanders +12.9
AR  Sanders +12.1
IN  Clinton +12.1

The polls underestimated Sanders in Michigan and Indiana, but they also overestimated Sanders in a few states where Clinton’s margin of victory turned out to be even bigger than expected. (What the hell happened in Utah? The only poll FiveThirtyEight had to work with was from a local outfit that surveyed fewer than 200 voters with more than a week to go before the election. Utah is also a caucus state, and caucuses are notoriously difficult to predict.)

So there you go. Michigan and Indiana made headlines, and there have been some errors, but in general the polls haven’t skewed toward Hillary any more than they’ve skewed toward Bernie.

But…

…we’re not quite done yet. Something else is going on here.

Take a look at that chart again.

Where are all Bernie’s states?

Thirty states on that chart, and Sanders only won eight of them. You know he won a lot more than that. What happened to all the other states?

I did more digging…and the answer is fascinating.

As of today, May 7, there have already been primaries or caucuses in 41 states. FiveThirtyEight had pre-election projections for 30 of them. FiveThirtyEight couldn’t project the other 11 states – Alaska, Colorado, Delaware, Hawaii, Idaho, Kansas, Maine, Minnesota, Nebraska, Washington, and Wyoming – because there wasn’t any survey data for those states.

Nobody polled them.

Why not?

Delaware, apparently, got passed over because nobody lives there. Sorry, Delaware.

But the other 10 states?

They’re all caucus states, as it turns out. That’s why nobody polled them.

Caucuses are weird beasts. Different states run them in different ways, but the upshot is that you don’t go to the polls and cast a ballot – you go to a meeting, organize into groups, and spend time talking to people. The process can take several hours. So it’s almost impossible to predict, with any accuracy, who’s actually going to show up for these things. Maybe the babysitter cancels at the last minute. Maybe it’s your anniversary. Maybe you don’t love your candidate enough to blow a whole evening on them. Could be anything. And in order for a poll to be accurate, the pollster has to be able to predict who’s going to show up: this percentage of white voters, that percentage of women. You can’t do that with caucuses, so pollsters rarely bother to try.

Thirteen states have held caucuses so far this year, and only three of them have been polled. Iowa always gets surveyed because it’s first in the cycle; Nevada got surveyed a few times; Utah had that one poll that turned out to be totally wrong – and that was it. The other ten states – Alaska, Colorado, Hawaii, Idaho, Kansas, Maine, Minnesota, Nebraska, Washington, and Wyoming – FiveThirtyEight didn’t even try to project, because they had no polls to go on.

And Bernie won all ten of those states.

Weird, right?

So there’s your answer, if you’re wondering why the polls keep saying Hillary while the results keep saying Bernie. The polls themselves have actually been right, for the most part, and their errors have been perfectly balanced – it’s just that Bernie’s victories have been coming in those caucus states, which nobody bothers to poll.

Which only leaves one more question:

Why does Bernie do so much better in caucuses?

It’s a pretty stark difference, actually. Bernie has won 11 of the 13 caucuses so far, while Hillary has won 21 of the 28 primaries. Something’s clearly going on.

There are a variety of possible reasons. First off, almost all the caucuses have taken place in states with very small minority populations, where Bernie has an advantage anyway. You also have to be really energized to spend an entire evening at a caucus, and we know that Bernie supporters are generally more energized than Hillary supporters. (You also have to have the whole evening free, so conservatives, here’s your chance to joke about Bernie voters all being unemployed.)

But it may not be a Bernie thing at all. I don’t know why, but for whatever reason – and this goes all the way back to 2008 – Hillary Clinton just sucks at caucuses.

Want to know why Barack Obama has been president for the last eight years? Here’s why. In 2008, the Democratic Party held 38 presidential primaries, and Hillary won 20 of them. She won pivotal early primaries in New Hampshire and Florida; she won the big races in California, New Jersey, New York, Ohio, Pennsylvania and Texas. Obama won his home-state primary in Illinois, but his second-biggest victory (in terms of population) was North Carolina. Hillary got all the big prizes. And she got more votes overall too: according to USElectionAtlas.com, Clinton won a total of 18,055,516 primary votes nationwide, against only 17,628,560 for Obama.

So why did Hillary Clinton lose?

Because there were also 14 caucuses, and she won…one of them.

| Primaries won by Clinton | Primaries won by Obama | Caucuses won by Clinton | Caucuses won by Obama |
|---|---|---|---|
| Arizona | Alabama | Nevada | Alaska |
| Arkansas | Connecticut | | Colorado |
| California | District of Columbia | | Hawaii |
| Florida | Delaware | | Idaho |
| Indiana | Georgia | | Iowa |
| Kentucky | Illinois | | Kansas |
| Massachusetts | Louisiana | | Maine |
| Michigan | Maryland | | Minnesota |
| New Hampshire | Mississippi | | Nebraska |
| New Jersey | Missouri | | North Dakota |
| New Mexico | Montana | | Texas |
| New York | North Carolina | | Washington |
| Ohio | Oregon | | Wyoming |
| Oklahoma | South Carolina | | |
| Pennsylvania | Utah | | |
| Rhode Island | Vermont | | |
| South Dakota | Virginia | | |
| Tennessee | Wisconsin | | |
| Texas | | | |
| West Virginia | | | |

Texas is the textbook case here. (I’ll let you make your own joke about Texas and textbooks.) The state held a primary on March 4, 2008, which Clinton won by a 51-47 margin – but the state also apportioned some of its delegates at precinct-level caucuses, and Obama won those by a margin of 56-44.

Why is Hillary Clinton bad at caucuses? I have no idea. But it probably cost her the 2008 nomination – and it’s making the 2016 race a lot closer than it otherwise might be.
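To put numbers on that gap, here is a small sketch that turns the win counts quoted above into win rates. The counts come straight from the text (Clinton's two 2016 caucus wins are the two of 13 that Sanders did not win), and the 2016 figures only cover contests held through May 7. The `win_rate` helper and the dictionary layout are hypothetical conveniences.

```python
def win_rate(wins, contests):
    """Percentage of contests won, rounded to one decimal place."""
    return round(100 * wins / contests, 1)


# Clinton's record as (wins, contests held), taken from the counts in the text.
CLINTON_RECORD = {
    ("2008", "primaries"): (20, 38),
    ("2008", "caucuses"): (1, 14),
    ("2016", "primaries"): (21, 28),  # through May 7 only
    ("2016", "caucuses"): (2, 13),    # through May 7 only
}

for (year, kind), (wins, contests) in CLINTON_RECORD.items():
    print(f"{year} {kind}: {wins}/{contests} = {win_rate(wins, contests)}%")
```

On those counts, Clinton wins somewhere between half and three quarters of primaries but fewer than one in six caucuses, in both years.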

——————————————

So!

After all that, here’s what we know:

The polls did get Michigan and Indiana wrong, but otherwise they’ve correctly called the winner of every other state. Primaries aren’t easy to predict, so the pollsters are having a pretty decent year. Though it might sometimes appear otherwise, the polls are not skewed in favor of either candidate.

Most of Bernie’s victories have come in caucus states, which pollsters typically don’t survey. Why Bernie does better in caucus states is still an open question, but caucuses were Hillary’s albatross in 2008 as well. (The only exception seems to be Nevada, which Hillary won in both 2008 and 2016. I don’t know who runs her Nevada campaign, but that person deserves a big raise.)

And we know enough about this primary to make an educated guess about how the rest of the cycle will go. Tom Jensen says there are really only three things you need to know: “Bernie Sanders does a lot better in caucuses than he does in primaries…he does a lot better in open contests where independents are allowed to vote…and he does a lot better in states that are heavily white.”

There are ten contests left, and only one of them is a caucus – so if you’re still holding out hope for Bernie, you need to pray he figures out primaries quick.