The midterm Senate race is fast approaching, and so is the speculation about its outcome. Previously, Lawrence W. Robinson, Professor of Operations Management at Cornell University’s Johnson Graduate School of Management, used @RISK to statistically predict the Senate races, using data from the stats-centered news site FiveThirtyEight. FiveThirtyEight was founded by statistician and political analyst Nate Silver, who, in his forecasts earlier in the year, initially summed up the probabilities of either Democrats or Republicans winning all their races.
Robinson took this analysis a step further by adding Monte Carlo simulation to the mix. While Silver warned in previous articles that to assume races are uncorrelated is “dubious,” and that Monte Carlo simulations require variables to be uncorrelated, Robinson demonstrated that it is in fact very possible to include correlation in Monte Carlo analyses.
He started by creating a “lower bound” (zero correlation) and an “upper bound” (total correlation) in his model, and showed that Democrats’ chances of retaining control fell somewhere between 41% and 50%.
The FiveThirtyEight Approach
Fast-forward a few months, and FiveThirtyEight’s models have gotten considerably more complex and data-rich, and their interactive forecasts are updated almost daily. As of this writing, the model predicts that Democrats have a 42% chance of retaining the Senate next year.
Unlike their earlier forecasts, “they’ve also included a correlation, of a type, in their model,” says Robinson. “They do not explicitly use a correlation coefficient, as I did. Instead, they change the distribution of the candidate’s lead.” Robinson explains that Silver and FiveThirtyEight introduce correlation through an additional random variable representing what they’ve labelled “national error,” which they generate and add to the mean margin of victory of every candidate.
This national error “could be a sex scandal, or some underlying and largely uncaptured sentiment in the nation,” Robinson explains. “For example, in the 2012 presidential race, it might have been Hurricane Sandy, and how presidential Obama looked in response.”
In the FiveThirtyEight forecast model, if the national error (whatever it represents) turns out to be +3 for the Republicans, they shift the mean margin of victory three points towards each and every Republican. “Unfortunately, nowhere in their post do they specify the probability distribution for his new ‘national error’ random variable,” Robinson says. “Thus it is not possible to know how correlated the individual races are with one another.”
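To see how a shared shift of this kind makes races move together, here is a minimal Python sketch. It is not FiveThirtyEight's model: since the distribution of the national error is not published, the sketch simply assumes it is normal, and the names `simulate_margins`, `race_sd`, and `national_sd`, as well as all the numbers, are hypothetical.

```python
import numpy as np

def simulate_margins(mean_margins, race_sd, national_sd, n_sims, seed=0):
    """Simulate margins of victory with one shared national shift per election.

    ASSUMPTION: both the national error and the race-level noise are normal.
    The source does not say what distribution FiveThirtyEight uses.
    """
    rng = np.random.default_rng(seed)
    mean_margins = np.asarray(mean_margins, dtype=float)
    # One national error per simulated election, added to every race.
    national = rng.normal(0.0, national_sd, size=(n_sims, 1))
    # Independent race-specific noise.
    local = rng.normal(0.0, race_sd, size=(n_sims, mean_margins.size))
    return mean_margins + national + local

# Hypothetical mean margins (in points) for three races.
margins = simulate_margins([2.0, -1.0, 0.5], race_sd=4.0,
                           national_sd=3.0, n_sims=200_000)

# Correlation between races that the shared shift produces.
implied_corr = np.corrcoef(margins, rowvar=False)
print(implied_corr.round(2))
```

Under these assumptions the correlation between any two margins is national_sd² / (national_sd² + race_sd²), so the size of the national error relative to the race-level noise determines how correlated the races are. That is why the correlation cannot be recovered without knowing the national error's distribution.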
@RISK Presents an Alternative Method
Because FiveThirtyEight’s methods are not entirely clear, Robinson wanted to devise a way to arrive at these forecasts using his own statistical methods, and to use a correlation that is explicitly defined. Instead of just using FiveThirtyEight’s “Leader’s chance of winning,” which was only given to the nearest percent, Robinson started with the mean and (estimated) standard deviation of the margin of victory, and calculated the probability of winning by assuming that the margin of victory on Election Night was normally distributed. “Although Silver says he assumes that the victory margin is leptokurtic [has fat tails] for finding the probability of winning, he never specifies its probability distribution,” says Robinson. “I found that the standard assumption that the margin was normally distributed better matched his reported analysis.”
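The normal-margin calculation described above can be sketched in a few lines of Python. This is an illustration rather than Robinson's spreadsheet: the function name `win_probability` and the sample numbers are hypothetical.

```python
from math import erf, sqrt

def win_probability(mean_margin, sd_margin):
    """P(margin > 0) when the margin is Normal(mean_margin, sd_margin)."""
    z = mean_margin / sd_margin
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Hypothetical race: the leader is ahead by 3 points, with a standard
# deviation of 6 points on the Election Night margin.
print(round(win_probability(3.0, 6.0), 3))
```

Working from the mean and standard deviation gives a probability with more precision than a published figure rounded to the nearest percent.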
Robinson then built a Monte Carlo model in Excel using @RISK, treating the outcome of each race as a Bernoulli (0/1, win/lose) random variable. He then introduced a correlation matrix that captured the correlation between every pair of races, and ran 27 different simulations (each one simulating 400,000 elections) for correlations ranging between 0% and 100%. His results closely match those of FiveThirtyEight, showing the probability that Democrats will retain control of the Senate as a function of the correlation among races. “Now we can say that, as long as the correlation is between 20% and 97%, the probability that the Democrats will retain control will be between 40% and 42%,” says Robinson. “The advantage of this approach,” Robinson says, “is that we specify the correlation precisely, and that we conduct robustness analysis to see how the results change with the correlation.”
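For readers without @RISK, the idea of sweeping a correlation parameter can be approximated in Python. The sketch below swaps in a different technique from the one @RISK applies: each race gets a latent standard normal variable, every pair of latent variables shares the same correlation `rho`, and a race is won when its latent variable falls below a threshold set by the win probability. Note that `rho` here is the correlation of the latent variables, not of the 0/1 outcomes themselves. The function name `prob_at_least`, the win probabilities, and the seat count are all hypothetical.

```python
import numpy as np
from statistics import NormalDist

def prob_at_least(win_probs, seats_needed, rho, n_sims=100_000, seed=0):
    """Estimate P(at least seats_needed wins) with equally correlated races.

    ASSUMPTION: a one-factor latent-normal model stands in for the
    correlation matrix used in the @RISK workbook. Each win probability
    must lie strictly between 0 and 1.
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(win_probs, dtype=float)
    # A race is won when its latent normal falls below this threshold,
    # which reproduces the race's marginal win probability.
    thresholds = np.array([NormalDist().inv_cdf(x) for x in p])
    common = rng.standard_normal((n_sims, 1))      # shared factor
    idio = rng.standard_normal((n_sims, p.size))   # race-specific factor
    latent = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * idio
    wins = latent < thresholds
    return float(np.mean(wins.sum(axis=1) >= seats_needed))

# Hypothetical win probabilities for five contested races.
hypothetical_probs = [0.8, 0.6, 0.5, 0.4, 0.3]

# Robustness sweep: how does the answer move as the correlation changes?
for rho in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(rho, round(prob_at_least(hypothetical_probs, 3, rho), 3))
```

At `rho = 1.0` every race is driven by the same draw, so the chance of winning at least three of these five races collapses to the third-highest win probability, 0.5. Lower values of `rho` let the races diverge, which is the kind of sensitivity the correlation sweep is meant to expose.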
Interested in playing political prognosticator? Check out our models here and run the @RISK simulation yourself.