Sunday, January 15, 2017

BTRTN 2016 Election Day Forecasts: How We Did and What Went Wrong

We’ll start with the big picture and then work down to the nuts and bolts.  Of course, we were wrong on the most important call of all.  Donald Trump overcame the odds (more on those odds later) and won the 2016 Presidential election by a 306-232 electoral vote margin over Hillary Clinton.  And we also were wrong in the Senate, which we said the Dems would retake by a 50/50 margin (and the assumption of a Clinton victory to break the tie).  We did better with respect to the House – much better, in fact.  We said that the Dems would pick up five seats and they picked up six.  (Most forecasters had them picking up 10+ seats.) But we did relatively poorly with the Governors, calling only 8 of 12 races correctly, as we were victimized by the deadly combination of a number of tight races and little polling.

All of this sounds pretty bad.  We were hardly alone in this, as many kind readers have pointed out.  But we were off where it mattered most.

When you look at the overall record (the chart below), it does not seem quite as bad.  We forecast a total of 537 separate races, and managed to get 519 of them right, a 96.6% hit rate.  This is actually a bit better than the overall track record we had compiled from 2008 to 2014, when we were right on 96.3% of the 1,530 races in the timespan.

| Race | Right | Wrong | % Right |
| --- | --- | --- | --- |
| President* | 51 | 5 | 91% |
| Senate | 32 | 2 | 94% |
| House | 428 | 7 | 98% |
| Governors | 8 | 4 | 67% |
| TOTAL | 519 | 18 | 97% |
* The 56 Presidential races consisted of the 50 states, the District of Columbia, 3 districts in Nebraska and 2 in Maine
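The percentages in the scorecard follow directly from the right/wrong counts. As a quick check, here is a minimal Python sketch that recomputes them; the counts are copied from the scorecard, while the names `scorecard` and `hit_rate` are ours, introduced only for illustration.

```python
# Right/wrong counts copied from the scorecard above.
scorecard = {
    "President": (51, 5),
    "Senate": (32, 2),
    "House": (428, 7),
    "Governors": (8, 4),
}


def hit_rate(right: int, wrong: int) -> float:
    """Share of races called correctly, as a percentage."""
    return 100.0 * right / (right + wrong)


# Totals across all four categories.
total_right = sum(right for right, _ in scorecard.values())
total_wrong = sum(wrong for _, wrong in scorecard.values())

for name, (right, wrong) in scorecard.items():
    print(f"{name:<10} {right:>4} {wrong:>3} {hit_rate(right, wrong):5.1f}%")
print(f"{'TOTAL':<10} {total_right:>4} {total_wrong:>3} "
      f"{hit_rate(total_right, total_wrong):5.1f}%")
```

The totals come out to 519 right and 18 wrong across 537 races, which is the 96.6% quoted above.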

Now this seemingly laudable scorecard obviously reflects the single party dominance of most of our states and congressional districts, resulting in many, many barely contested races.  A better measure of how we did is to look at the results in the races that were reasonably close (“in play”).  Only 69 out of the 537 races were decided by a margin of 10 points or less (that is, 55%/45% or closer) and among those, we were right 74% of the time, which we consider to be quite good, certainly well above the 50% that a coin toss would have afforded. (Just for the record, there were no surprises among the 468 races that were won by greater than 10 points – they all went as expected, every last one of them.)

| Close Races | Right | Wrong | % Right |
| --- | --- | --- | --- |
| President* | 11 | 5 | 69% |
| Senate | 7 | 2 | 78% |
| House | 28 | 7 | 80% |
| Governors | 5 | 4 | 56% |
| TOTAL | 51 | 18 | 74% |
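To put the coin-toss comparison in numbers, here is a rough sketch of how likely a pure coin flipper would be to call at least 51 of 69 close races correctly. It assumes, unrealistically, that every race is an independent 50/50 call (real races are correlated, as the Presidential states showed), so treat it as an illustration rather than a formal significance test. The function name `prob_at_least` is ours.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Binomial tail: chance of at least k successes in n independent tries."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Close-race record from the table above: 51 right out of 69.
close_right, close_total = 51, 69

p_coin = prob_at_least(close_right, close_total)
print(f"Hit rate in close races: {close_right / close_total:.1%}")
print(f"Chance a coin flipper gets {close_right}+ of {close_total}: {p_coin:.6f}")
```

Under that simplifying assumption the probability is well under one in a thousand, which is the sense in which 74% is "well above" a coin toss.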

Of course, being wrong on those five Presidential state races was, and should be, the headline.  As everyone knows, if Hillary had won three of those races – the three she lost by less than a percentage point, Pennsylvania, Michigan and Wisconsin – she would be preparing her Inauguration speech right now.

Let’s look at each set of results.

President

The big forecasting/polling debate after the election was not about “right versus wrong” but rather about “probabilities.”  Some aggregators were quite clear that Trump had a material chance to win – Nate Silver put the odds of a Clinton win at 71%/29%, and he took great pains to explain that those kinds of odds meant there was certainly a credible path to a Trump win.  For sports fans, the Chicago Cubs faced roughly those same odds of winning the World Series after Game 5, when they were down 3 games to 2.  They, of course, overcame those odds by winning the last two games of the Series.  The Cleveland Cavaliers faced an identical situation in the NBA Finals, and also won the NBA Championship after being down 3-2 to Golden State.  From that perspective, Trump’s win was only as unlikely as those of the Cubs and the Cavs.
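For readers who want to check the sports analogy, here is a small sketch. It assumes (our simplification, not a figure from any forecaster) that every remaining game is an even coin flip, and the function name `series_win_prob` is ours.

```python
from fractions import Fraction


def series_win_prob(need: int, opp_need: int,
                    p: Fraction = Fraction(1, 2)) -> Fraction:
    """Chance of winning a series when we still need `need` wins and the
    opponent still needs `opp_need` wins, with per-game win probability p."""
    if need == 0:
        return Fraction(1)
    if opp_need == 0:
        return Fraction(0)
    return (p * series_win_prob(need - 1, opp_need, p)
            + (1 - p) * series_win_prob(need, opp_need - 1, p))


# Down 3 games to 2 in a best-of-seven: we need 2 wins, the opponent needs 1.
down_3_2 = series_win_prob(need=2, opp_need=1)
print(float(down_3_2))  # 0.25
```

A 25% chance is in the same neighborhood as the 29% Nate Silver gave Trump, which is the point of the comparison.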

Other aggregators were far more definitive than Nate.  The New York Times Upshot had the Clinton win odds at 85%/15%.  The Huffington Post had the odds at a whopping 98% for a Clinton win.  We at BTRTN do not calculate probabilities, believing that they are confusing and subject to misinterpretation.  But when readers asked me privately just before Election Day what I thought the odds were, I answered “80/20.”  The words I used in our final forecast, however, made it seem as if our view was HuffPo-esque in claiming that Clinton had virtually no chance of losing, and that was a mistake.

At the national level, of course, the polls were correct.  Hillary “won.”  We at BTRTN (and I use us as a proxy for all pollsters and aggregators) came pretty darn close to being dead on, as the chart below demonstrates, missing by a mere 0.4 percentage points.  Frankly, you can’t expect to do much better than that.  (Few remember that the polls showed Obama winning by less than a point in 2012, and he actually beat Romney nationally by 4 points; but no one kicked up a fuss then for the “miss”.)

| President | BTRTN Forecast | Actual Results | Diff |
| --- | --- | --- | --- |
| National | 2.5% | 2.1% | -0.4% |

The problem, of course, is that presidential elections, alone among all of our national elections, are not decided by the national popular vote.  I find it ironic that ALL of the 537 races I called on Election Day were decided by the popular vote.  Every office, that is, except the Presidency, which is decided instead by 56 individual state and district races (the 50 states, the District of Columbia, three districts in Nebraska and two in Maine) whose results in turn dictate the outcome of the anachronistic Electoral College.

Thus forecasters must get deeply into state-by-state polling, which is fraught with peril relative to the national polls.   There is simply less polling at the state level, and it is conducted by a varied set of players (including local news organizations and partisan pollsters) who may or may not know completely what they are doing when it comes to defining a proper universe, sampling, projecting likely turnout among voter segments and all the other nuances of polling.  All we aggregators can do is weed out the worst of them (subjectively), and then hold our noses and hope that when we add all the polls up and average them out (by whatever method we use), we have eliminated most of the noise.
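To make "add all the polls up and average them out" concrete, here is one simple way an aggregator might do it. This is a generic sketch and not BTRTN's actual method: it weights each poll by sample size and drops polls older than a cutoff. The polls, the `Poll` class, and the 14-day cutoff are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Poll:
    pollster: str
    end_date: date      # last day the poll was in the field
    sample_size: int
    clinton: float      # percent
    trump: float        # percent


def average_margin(polls: list[Poll], election_day: date,
                   max_age_days: int = 14) -> float:
    """Sample-size-weighted Clinton-minus-Trump margin over recent polls."""
    recent = [p for p in polls
              if 0 <= (election_day - p.end_date).days <= max_age_days]
    if not recent:
        raise ValueError("no polls inside the time window")
    total_n = sum(p.sample_size for p in recent)
    return sum((p.clinton - p.trump) * p.sample_size for p in recent) / total_n


# Hypothetical state polls (made-up numbers, not real 2016 data).
polls = [
    Poll("Pollster A", date(2016, 11, 2), 800, 46.0, 42.0),
    Poll("Pollster B", date(2016, 11, 5), 1200, 45.0, 44.0),
    Poll("Pollster C", date(2016, 10, 10), 600, 50.0, 40.0),  # too old, dropped
]

margin = average_margin(polls, election_day=date(2016, 11, 8))
print(f"Clinton {margin:+.1f}")
```

Note how much the answer depends on which polls make the cut: with little late polling in a state, a stale poll is either dropped, leaving very little data, or kept, dragging the average toward an outdated picture of the race.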

We were wrong on five states:  Michigan, Wisconsin, Pennsylvania, Florida and North Carolina.  Trump won the first three of those states by margins of less than 1 point; he won Florida by 1.2 points; and North Carolina by 3.6 points.  It has been widely documented that if a mere 38,873 voters (out of 13 million) in Michigan, Wisconsin and Pennsylvania had gone for Clinton instead of Trump, she would have squeaked by.  But no, in America we do not practice “one person, one vote” and thus Clinton’s 2.9 million national vote “victory” was thwarted.

Of all of the outrageous statements – and outright lies – that Donald Trump has made, perhaps the most egregious (though not the most offensive) has been to claim that his victory was a “landslide.”  Nothing could be further from the truth.  Nate Silver did the research and found that Trump’s win was the 44th largest (out of 56) Electoral College win ever.  (Said another way, it was the 12th closest race ever.)  And, of course, this was the fifth time the winner of the popular vote did not become President; the fact that Trump lost the popular vote makes the landslide claim utterly ludicrous.

The big question:  why were the state polls off and where did the forecasts go wrong?  Big questions often have as their answer, “all of the above.”  In that spirit, we offer three general answers:  1) methodology issues, 2) infrequency of swing state polling down the stretch, and 3) the fluidity of the race itself, embodied by many October surprises down the stretch.  Let’s take each in turn.

Methodology.  I do not have a definitive answer on this point, but I’m sure there had to be some methodology issues with respect to the polling.  The American Association for Public Opinion Research is doing its own rigorous post-mortem and will have its report in May.  Doubtless some of the issues they are looking into will include:

  • Whether proper sampling methods were undertaken; the problems with identifying “likely voters”
  • The difficulty in projecting turnout by sub-segments (you may recall that in 2012, Mitt Romney’s pollsters told him he was going to win because they were convinced turnout would favor him and adjusted their polling results accordingly)
  • Whether, in particular, white working-class voters were underrepresented in polls
  • The so-called “Shy Trumper” theory, in which people were embarrassed to admit to pollsters that they would vote for Trump.  (This theory is related to the 1982 California Governor race that African-American LA Mayor Tom Bradley lost after the polls showed him ahead – the theory was that people did not want to admit they were actually going to vote against an African-American.)
  • And so on…

Again, many state polls are conducted by less than world-class pollsters; these types of issues vex even the best in breed, and are certainly likely to be exacerbated by lesser pollsters.

Infrequency of Polling in Swing States.  I mentioned that there was not really enough polling – good or bad – at the state level, particularly in those waning days.  Minnesota – where we had Clinton winning by 9 points, and she actually won by only 2 – had its last poll with a field date ending on October 30.  The notorious Wisconsin “polling problem” was a similar situation, with no polling after November 2.  And even the states that were polled close to the end had field dates that ended on November 4th, 5th or 6th.

The Fluidity of the Race Itself.  I think one can fairly conclude that this race was simply too fluid to properly track down the wire absent hour-by-hour tracking.  Consider all of the major “events” (inclusive of October surprises) that affected the election from June 8th (the end of the primary season) on -- you can see from the chart below the clear impact each had on the polls, with Clinton’s lead gyrating between “enormous” and “getting too close for comfort.”  (And this excludes the steady stream of John Podesta emails that were being released, courtesy of WikiLeaks and, apparently, Vladimir Putin, over this same timeframe.)

| Period | Clinton | Trump | Spread |
| --- | --- | --- | --- |
| 6/8 to 7/5: Post-Primary Period | 44 | 38 | 6 |
| 7/6 to 7/21: Post-Comey Announcement (7/5) | 45 | 41 | 3 |
| 7/28 to 9/9: Post-Convention: Khan Family Flap | 47 | 41 | 6 |
| 9/10 to 9/26: Post-Deplorables/Pneumonia Flap (9/9, 9/11) | 46 | 43 | 3 |
| 9/26 to 10/7: Post Debate #1 (9/26) | 48 | 43 | 5 |
| 10/7 to 10/28: Post Trump Sex Talk Tape (10/7) | 49 | 42 | 7 |
| 10/29 to 11/3: Post First Comey Letter (10/28) | 48 | 45 | 3 |
| 11/3 to 11/7: Final Days, Including Second Comey Letter (11/6) | 47 | 44 | 3 |

For Hillary Clinton, it is plain to see that she had one major enemy in her quest – James Comey, the FBI Director, who thwarted her at every turn.  Clinton won the convention battle (Trump sabotaged his own convention with his inane fight with the Khan family, while Clinton shone amidst an array of A-List supporters including her husband and the Obamas); she won each of the three debates handily; and she largely handled each of Trump’s gaffes well, by standing out of the way and letting the news cycle give them their play.  She had only two self-inflicted wounds, the “basket of deplorables” reference to Trump supporters, which was closely followed by the pneumonia “cover-up.”

But the first Comey letter on October 28 was devastating.  Her lead, buoyed by the Trump Sex Talk Tape incident, had climbed to over 7 points in aggregated polls and to double digits in some.  The first Comey letter drove her lead under five points as we headed into November, and it then trended down to three points, where it appeared to stabilize.  And then came the second Comey letter, released on November 6.

On the face of it, that letter was good news. Comey exonerated her, reinforcing his July 5 conclusion, which was an incredible exercise of damning with faint exoneration.  We cannot know the impact of the second letter; it was simply too late to be properly accounted for in pre-Election polling.  But for all of those people who were on the fence, or perhaps even weakly supporting her, this reopened the matter yet again, for one more discussion about the whole email fiasco in the final hours before the election.  Essentially, the last conversation topic about the election – for the last 12 days – was the Clinton emails.

No polls – barring hour-by-hour tracking the likes of which we have not seen -- could possibly measure that final impact.  And given her downward momentum heading into Election Day, it now seems reasonably unsurprising that the race got tighter still in those last hours, particularly in those key swing states.

So our theory is that whatever weaknesses the polls had in methodology were exacerbated by two factors:  late-breaking events that came too late to measure, and a dearth of polling in a fluid and, for Hillary Clinton, downward-spiraling race.

Note that we are only addressing why the polls were off in this analysis…not why Hillary lost.  That is the subject of another article we are writing now for publication in a few days.

Senate

We did quite well here, calling 32 out of 34 races correctly, including 7 out of 9 relatively close races.  We missed Wisconsin and Pennsylvania, where clearly Trump’s coattails extended to incumbent Ron Johnson, who beat challenger Russ Feingold by 3 points (we had Feingold by +2), and incumbent Pat Toomey, who held off challenger Katie McGinty by 2 points (we had McGinty by +2).  Winning those two races gave the GOP control of the Senate at 52-48.  This was actually our best showing in the Senate ever, tied with 2012 when we were correct on 31 out of 33 races.  But again, in the big picture – Senate control – we came out wrong.

| Senate | Right | Wrong | % Right |
| --- | --- | --- | --- |
| Races 10+ point margin | 25 | 0 | 100% |
| Races < 10-point margin | 7 | 2 | 78% |
| TOTAL | 32 | 2 | 94% |

| Close Races | Predicted Margin | Actual Margin | Right/Wrong |
| --- | --- | --- | --- |
| Colorado | D + 10 | D + 4 | Right |
| Wisconsin | D + 2 | R + 3 | Wrong |
| Pennsylvania | D + 2 | R + 2 | Wrong |
| Nevada | D + 2 | D + 2 | Right |
| New Hampshire | D + 0 | D + 0 | Right |
| Missouri | R + 0 | R + 3 | Right |
| North Carolina | R + 1 | R + 6 | Right |
| Florida | R + 7 | R + 8 | Right |

House

We outdid the other aggregators in the House.  Most saw the Dems picking up seats in the low double digits.  Using our exclusive regression equation, we just about nailed it, predicting the Dems would pick up +5 and they actually gained +6.  We also called 80% of the “close” races correctly.


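| House | Right | Wrong | % Right |
| --- | --- | --- | --- |
| Races 10+ point margin | 400 | 0 | 100% |
| Races < 10-point margin | 28 | 7 | 80% |
| TOTAL | 428 | 7 | 98% |

| House Composition (Currently 188 D/247 R) | Predicted | Actual | Diff. |
| --- | --- | --- | --- |
| Democrats | 193 | 194 | 1 |
| Republicans | 242 | 241 | -1 |
| Net Dem Gain | D + 5 | D + 6 | 1 |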

Governors

The Governor races proved to be a more difficult lot.  In the close races, we did little better than flipping a coin.  We did much better in 2014, when we correctly called 31 out of 36, and we hope we can at least match that standard in the crucial 2018 elections, when 36 state houses will be on the ballot.

| Governors | Right | Wrong | % Right |
| --- | --- | --- | --- |
| Races 10+ point margin | 3 | 0 | 100% |
| Races < 10-point margin | 5 | 4 | 56% |
| TOTAL | 8 | 4 | 67% |

| Close Races | Predicted Margin | Actual Margin | Right/Wrong |
| --- | --- | --- | --- |
| New Hampshire | D + 11 | R + 2 | Wrong |
| Oregon | D + 10 | D + 7 | Right |
| Washington | D + 8 | D + 9 | Right |
| Indiana | D + 4 | R + 6 | Wrong |
| Missouri | D + 2 | R + 6 | Wrong |
| N. Carolina | D + 2 | D + 0 | Right |
| Montana | D + 1 | D + 4 | Right |
| Vermont | R + 2 | R + 9 | Right |
| W. Virginia | R + 6 | D + 7 | Wrong |

Bloodied but unbowed, we’ll be back in 2018 with more fearless predictions!  And I can predict one thing now with complete confidence: polls will be back in 2018 and 2020 and will be widely followed – hopefully with improved techniques, greater frequency, and improved understanding.


