election-forecasts-2020

This file contains links to the data behind our 2020 General Election Forecast.

Presidential Forecast

presidential_national_toplines_2020.csv contains the final national topline on each day. It includes the following columns:

Column | Description
--- | ---
cycle | The election cycle (2020)
branch | The kind of race this forecast pertains to (presidential)
model | The model type (polls-plus is the only model we are running for the 2020 presidential race)
modeldate | Date of the model run
candidate_inc | Name of the incumbent
candidate_chal | Name of the challenger
candidate_3rd | Name of the third-party candidate
ecwin_inc | Chance that the incumbent will win a majority of the electoral votes
ecwin_chal | Chance that the challenger will win a majority of the electoral votes
ecwin_3rd | Chance that the third-party candidate will win a majority of the electoral votes
ec_nomajority | Chance that no candidate will win a majority of the electoral votes
popwin_inc | Chance that the incumbent will win the popular vote
popwin_chal | Chance that the challenger will win the popular vote
popwin_3rd | Chance that a third-party candidate will win the popular vote
ev_inc, ev_inc_lo, ev_inc_hi | Forecasted number of Electoral College votes for the incumbent, including the upper and lower bounds of an 80% confidence interval
ev_chal, ev_chal_lo, ev_chal_hi | Forecasted number of Electoral College votes for the challenger, including the upper and lower bounds of an 80% confidence interval
ev_3rd, ev_3rd_lo, ev_3rd_hi | Forecasted number of Electoral College votes for the third-party candidate, including the upper and lower bounds of an 80% confidence interval
national_voteshare_inc, national_voteshare_inc_lo, national_voteshare_inc_hi | Forecasted national vote share for the incumbent, including the upper and lower bounds of an 80% confidence interval
national_voteshare_chal, national_voteshare_chal_lo, national_voteshare_chal_hi | Forecasted national vote share for the challenger, including the upper and lower bounds of an 80% confidence interval
national_voteshare_3rd, national_voteshare_3rd_lo, national_voteshare_3rd_hi | Forecasted national vote share for the third-party candidate, including the upper and lower bounds of an 80% confidence interval
nat_voteshare_other, nat_voteshare_other_lo, nat_voteshare_other_hi | Forecasted national vote share for other candidates, including the upper and lower bounds of an 80% confidence interval
national_turnout, national_turnout_lo, national_turnout_hi | Forecasted national voter turnout based on past turnout, estimates of population growth, polls about whether voters are more or less enthusiastic about the election than usual and other factors in each state. Includes the upper and lower bounds of an 80% confidence interval. Turnout estimates are only available on model runs after Sept. 5, 2020.
timestamp | Date and time the simulations were run
simulations | Number of simulations run
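
As a rough illustration of how the daily toplines might be read, the sketch below picks the row with the latest modeldate and collects the Electoral College chances. The sample rows are made-up placeholders, not forecast values, and the `%m/%d/%Y` date format and the helper name `latest_topline` are assumptions; adjust them to match the actual file.

```python
import csv
import io
from datetime import datetime

# Made-up placeholder rows, NOT real forecast values.
# Only the column names are taken from the table above.
SAMPLE = """modeldate,candidate_inc,candidate_chal,ecwin_inc,ecwin_chal,ec_nomajority
11/1/2020,Incumbent,Challenger,0.40,0.59,0.01
11/2/2020,Incumbent,Challenger,0.38,0.61,0.01
"""


def latest_topline(csv_file, date_format="%m/%d/%Y"):
    """Return the row with the most recent modeldate.

    date_format is an assumption about how modeldate is written.
    """
    rows = list(csv.DictReader(csv_file))
    return max(rows, key=lambda r: datetime.strptime(r["modeldate"], date_format))


row = latest_topline(io.StringIO(SAMPLE))
chances = {
    row["candidate_inc"]: float(row["ecwin_inc"]),
    row["candidate_chal"]: float(row["ecwin_chal"]),
    "no majority": float(row["ec_nomajority"]),
}
print(row["modeldate"], chances)
```

To run this against the real file, pass `open("presidential_national_toplines_2020.csv", newline="")` in place of the `io.StringIO` object.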

presidential_state_toplines_2020.csv contains the final state-level toplines on each day. This sheet contains the following additional columns:

Column | Description
--- | ---
state | Name of the state
tipping | Tipping-point chance, the chance the state will deliver the decisive vote in the Electoral College
vpi | Voter power index, the relative likelihood that an individual voter in the state will determine the Electoral College winner
winstate_inc | Chance the incumbent will win the state
winstate_chal | Chance the challenger will win the state
winstate_3rd | Chance the third-party candidate will win the state
voteshare_inc, voteshare_inc_lo, voteshare_inc_hi | Forecasted vote share for the incumbent, including the upper and lower bounds of an 80% confidence interval
voteshare_chal, voteshare_chal_lo, voteshare_chal_hi | Forecasted vote share for the challenger, including the upper and lower bounds of an 80% confidence interval
voteshare_3rd, voteshare_3rd_lo, voteshare_3rd_hi | Forecasted vote share for the third-party candidate, including the upper and lower bounds of an 80% confidence interval
voteshare_other, voteshare_other_lo, voteshare_other_hi | Forecasted vote share for other candidates, including the upper and lower bounds of an 80% confidence interval
margin, margin_lo, margin_hi | Forecasted margin for the incumbent, including the upper and lower bounds of an 80% confidence interval
win_EC_if_win_state_inc | Chance that the incumbent will win the Electoral College if they win this state
win_EC_if_win_state_chal | Chance that the challenger will win the Electoral College if they win this state
win_state_if_win_EC_inc | Chance that the incumbent will win this state if they win the Electoral College
win_state_if_win_EC_chal | Chance that the challenger will win this state if they win the Electoral College
state_turnout, state_turnout_hi, state_turnout_lo | Forecasted state-level voter turnout based on past turnout, estimates of population growth, polls about whether voters are more or less enthusiastic about the election than usual and other factors in each state. Includes the upper and lower bounds of an 80% confidence interval. Turnout estimates are only available on model runs after Sept. 5, 2020.
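
One way the state-level file might be used is to rank states by tipping for a single model run. The following is a minimal sketch under stated assumptions: the state names and numbers are placeholders rather than forecast values, the helper `top_tipping_states` is hypothetical, and modeldate is matched as a plain string.

```python
import csv
import io

# Made-up placeholder rows, NOT real forecast values.
SAMPLE = """modeldate,state,tipping,winstate_inc,winstate_chal
11/2/2020,State A,0.30,0.45,0.55
11/2/2020,State B,0.05,0.90,0.10
11/2/2020,State C,0.15,0.20,0.80
11/1/2020,State A,0.28,0.46,0.54
"""


def top_tipping_states(csv_file, modeldate, n=2):
    """Return the n (state, tipping) pairs with the highest tipping chance on modeldate."""
    rows = [r for r in csv.DictReader(csv_file) if r["modeldate"] == modeldate]
    rows.sort(key=lambda r: float(r["tipping"]), reverse=True)
    return [(r["state"], float(r["tipping"])) for r in rows[:n]]


top = top_tipping_states(io.StringIO(SAMPLE), "11/2/2020")
print(top)
```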

presidential_polls_2020.csv contains an entry for each poll, along with how much the model adjusts each poll under the house and trendline adjustments. Additional poll and polling average data can be found in our polls dataset. This sheet contains the following additional columns:

Column | Description
--- | ---
candidate_name | The candidate for this answer choice
startdate | The first day interviews were conducted for this poll
enddate | The last day interviews were conducted for this poll
pollster | The name of the pollster
samplesize | The size of the sample
population | Whether the population interviewed was adults, registered voters, or likely voters
weight | A relative weight that describes how much this poll factors into the forecast relative to other polls
influence | A relative weight that describes how much this poll factors into today's forecast (similar to "weight", but also takes into account how old the poll is)
pct | Voteshare for this candidate in this poll
house_adjusted_pct | Voteshare in this poll after applying the house adjustment
trend_and_house_adjusted_pct | Voteshare in this poll after applying both house and trendline adjustments
tracking | Whether or not the poll sample overlaps with other polls in our database
poll_id | Unique identifier for a poll
question_id | Unique identifier for a question
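
To show how the weight and adjusted-percentage columns relate, here is a simple weighted mean per candidate. This is only an illustration and not the model's actual averaging procedure, which this file does not describe; the candidate names, pollsters and numbers are placeholders, and `weighted_average` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Made-up placeholder rows, NOT real polls.
SAMPLE = """candidate_name,pollster,weight,pct,trend_and_house_adjusted_pct
Candidate X,Pollster 1,1.0,50.0,49.0
Candidate X,Pollster 2,3.0,46.0,47.0
Candidate Y,Pollster 1,1.0,44.0,45.0
Candidate Y,Pollster 2,3.0,48.0,47.0
"""


def weighted_average(csv_file, value_col="trend_and_house_adjusted_pct",
                     weight_col="weight"):
    """Weighted mean of value_col per candidate_name, weighted by weight_col."""
    totals = defaultdict(float)
    weights = defaultdict(float)
    for r in csv.DictReader(csv_file):
        w = float(r[weight_col])
        totals[r["candidate_name"]] += w * float(r[value_col])
        weights[r["candidate_name"]] += w
    # Skip candidates whose weights sum to zero to avoid dividing by zero.
    return {name: totals[name] / weights[name] for name in totals if weights[name] > 0}


averages = weighted_average(io.StringIO(SAMPLE))
print(averages)
```

Passing `weight_col="influence"` would weight by the age-aware column instead.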

presidential_poll_averages_2020.csv contains the polling averages for each day. Additional poll and polling average data can be found in our polls dataset. This sheet contains the following additional columns:

Column | Description
--- | ---
pct_estimate | Polling average for the candidate listed in candidate_name on modeldate
pct_trend_adjusted | Trendline-adjusted polling average for the candidate listed in candidate_name on modeldate

presidential_ev_probabilities_2020.csv contains the forecasted chances of every possible Electoral College outcome. This sheet contains the following additional columns:

Column | Description
--- | ---
evprob_inc | Chance that the incumbent wins total_ev electoral votes
evprob_chal | Chance that the challenger wins total_ev electoral votes
evprob_3rd | Chance that the third-party candidate wins total_ev electoral votes
total_ev | Number of electoral votes in question
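
Because each row gives the chance of one exact electoral-vote total, the chance of reaching a threshold can be obtained by summing rows. The sketch below sums evprob_inc over total_ev of at least 270, a majority of the 538 electoral votes. It assumes all rows passed in belong to a single model run; the four sample rows are placeholders, not forecast values, and `chance_of_at_least_ev` is a hypothetical helper.

```python
import csv
import io

# Made-up placeholder rows, NOT real forecast values.
SAMPLE = """total_ev,evprob_inc,evprob_chal
268,0.25,0.25
269,0.25,0.25
270,0.25,0.25
271,0.25,0.25
"""


def chance_of_at_least_ev(rows, prob_col, threshold=270):
    """Sum prob_col over rows whose total_ev is >= threshold.

    Assumes rows come from one model run, so each total_ev appears once.
    """
    return sum(float(r[prob_col]) for r in rows if int(r["total_ev"]) >= threshold)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
inc_majority = chance_of_at_least_ev(rows, "evprob_inc")
print(inc_majority)
```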

presidential_scenario_analysis_2020.csv contains the forecasted chances of various possible election outcome scenarios. This sheet contains the following additional columns:

Column | Description
--- | ---
scenario_id | A unique identifier for each scenario
probability | The forecasted chance that the scenario will happen
scenario_description | A description of the scenario in question

economic_index.csv contains economic indicators that serve as inputs to the forecast. For more information on these indicators, see this post. The economic indexes were collected from the Federal Reserve Bank of St. Louis and the stock price data from Yahoo Finance. This sheet contains the following additional columns:

Column | Description
--- | ---
indicator | Name of the economic indicator
category | What that indicator helps measure
current_zscore | Number of standard deviations from the previous 2-year average for the current value of the indicator
projected_zscore | Number of standard deviations from the previous 2-year average for the projected value of the indicator on Election Day
projected_hi | Upper bound of an 80% confidence interval for projected_zscore
projected_lo | Lower bound of an 80% confidence interval for projected_zscore
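
For readers unfamiliar with z-scores, the general calculation is the distance of a value from the mean of a reference window, divided by the standard deviation of that window. The sketch below shows that generic formula only. How the window is built and whether a population or sample standard deviation is used are not stated here, so the use of `pstdev` is an assumption, and the numbers are arbitrary.

```python
from statistics import mean, pstdev


def zscore(value, history):
    """Standard deviations between value and the mean of history.

    Uses the population standard deviation (an assumption; the source
    does not say which estimator is used).
    """
    sigma = pstdev(history)
    if sigma == 0:
        raise ValueError("history has zero variance; z-score is undefined")
    return (value - mean(history)) / sigma


# Arbitrary illustrative numbers: mean 5, population standard deviation 2.
history = [2, 4, 4, 4, 5, 5, 7, 9]
print(zscore(9, history))
```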

forecast_steps.csv contains every intermediate step in calculating the chance of winning from the polling average in a particular state. This sheet contains the following additional columns:

Column | Description
--- | ---
step_no | A value from 1 to 10, where 1 is the starting point (Polling average) and 10 is the final step (Chance of winning)
value_inc, value_chal, value_3rd | The value of that step for the incumbent, challenger, and third-party candidate
weight | The weight of the component when blending with either a regression or economic fundamentals
step_description | A description of each step in the process of calculating the chance of winning

ec_vs_popvote.csv contains the probability that each candidate will win the Electoral College conditional on the popular vote outcome. This sheet contains the following additional columns:

Column | Description
--- | ---
lower_bin_text, upper_bin_text | A range of popular vote outcomes
total_ev_inc, ev_inc_lo, ev_inc_hi | Forecasted number of Electoral College votes for the incumbent conditional on the popular vote outcome falling between lower_bin_text and upper_bin_text, including the upper and lower bounds of an 80% confidence interval
total_ev_chal, ev_chal_lo, ev_chal_hi | Forecasted number of Electoral College votes for the challenger conditional on the popular vote outcome falling between lower_bin_text and upper_bin_text, including the upper and lower bounds of an 80% confidence interval
ecwin_inc, ecwin_chal, ecwin_3rd, ecwin_nomajority | Chance that the incumbent, challenger, third-party candidate or nobody will win a majority of electoral votes, conditional on the popular vote outcome falling between lower_bin_text and upper_bin_text
count | Number of simulations in which this outcome is present

Congressional Forecasts

senate_national_toplines_2020.csv contains the final national Senate topline on each day. This sheet contains the following additional columns:

Column | Description
--- | ---
branch | Kind of race this forecast pertains to (senate)
expression | Model type (lite, classic, or deluxe)
forecastdate | Date the model was run
chamber_Dparty, chamber_Rparty | Chance that each party (D or R) wins control of the Senate
mean_seats_Dparty, mean_seats_Rparty | Average forecasted number of seats that each party (D or R) holds in the Senate
median_seats_Dparty, median_seats_Rparty | Median forecasted number of seats that each party (D or R) holds in the Senate
p90_seats_Dparty, p90_seats_Rparty, p10_seats_Dparty, p10_seats_Rparty | 90th and 10th percentile for the number of seats for each party (D or R)
total_national_turnout, p90_total_national_turnout, p10_total_national_turnout | Average, 90th percentile, and 10th percentile of national turnout in states with Senate races
popvote_margin, p90_popvote_margin, p10_popvote_margin | Average, 90th percentile, and 10th percentile of popular vote margin (with positive being more Democratic and negative more Republican) in Senate races

house_national_toplines_2020.csv contains the final national House topline on each day. This sheet contains the following additional columns:

Column | Description
--- | ---
statesmajority_Dparty, statesmajority_Rparty, statesmajority_noparty | Forecasted chances that each party, or no party, controls a majority of state delegations in the House
delegations_Dparty, delegations_Rparty, delegations_nomajority | How many state delegations each party is expected to control in the House

senate_state_toplines_2020.csv and house_district_toplines_2020.csv contain the final state-level Senate toplines and district-level House toplines on each day. These sheets contain the following additional columns:

Column | Description
--- | ---
seat | Senate seat corresponding to this row, in the format XX-S#, where XX is the state postal code and # is the class of the seat being contested
name_D1, name_D2, name_D3, name_D4, name_R1, name_R2, name_R3, name_R4 | Name of the top four Democrats (D) and Republicans (R) in contention for the seat. Blanks indicate that there are no Democrats or Republicans other than those listed in contention for the seat.
name_I1 | Name of the top candidate on the ballot that is neither a Democrat nor a Republican
name_O1 | Placeholder for model chances for all candidates other than those named in the previous columns
winner_XX, where XX is one of D1, D2, D3, D4, R1, R2, R3, R4, I1, O1 | Chance that the correspondingly named candidate wins the seat
winner_Dparty, winner_Rparty | Chance that the corresponding party, regardless of candidate, wins the seat
tipping | Chance that this seat is the tipping point for control of the Senate
vpi | Voter power index: the relative likelihood that an individual vote in the state will determine control of the Senate chamber
mean_predicted_turnout, p90_simmed_turnout_gross, p10_simmed_turnout_gross | Average, 90th percentile, and 10th percentile of state turnout in this Senate race
voteshare_mean_XX, where XX is one of D1, D2, D3, D4, R1, R2, R3, R4, I1, O1 | Average voteshare for the correspondingly named candidate
p90_voteshare_simmed_XX, p10_voteshare_simmed_XX, where XX is one of D1, D2, D3, D4, R1, R2, R3, R4, I1, O1 | 90th and 10th percentile for voteshare for the correspondingly named candidate
pvi_538 | Partisan voter index for the state, as calculated by 538
vep | Total voting eligible population in the state
mean_netpartymargin, p90_netpartymargin, p10_netpartymargin | Mean, 90th, and 10th percentiles of the margin between Democrats and Republicans, where positive numbers are more Democratic and negative numbers are more Republican
won_runoff_XX, lost_runoff_XX, where XX is one of D1, D2, D3, D4, R1, R2, R3, R4, I1, O1 | Where applicable, chance that the correspondingly named candidate wins in a runoff for the seat
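
Since the candidate columns follow a name_XX / winner_XX pattern, they can be walked programmatically rather than listed by hand. The sketch below finds the most likely winner in one row. It assumes a row is a dict as produced by `csv.DictReader` and that unused slots are blank or otherwise non-numeric; the row contents are placeholders and `most_likely_winner` is a hypothetical helper.

```python
# Candidate slot suffixes, as listed in the table above.
SUFFIXES = ["D1", "D2", "D3", "D4", "R1", "R2", "R3", "R4", "I1", "O1"]


def most_likely_winner(row):
    """Return (suffix, name, chance) for the slot with the highest winner_XX value.

    Slots whose winner_XX value is missing or not a number are skipped.
    Returns None if no slot has a usable value.
    """
    best = None
    for suffix in SUFFIXES:
        try:
            chance = float(row.get(f"winner_{suffix}", ""))
        except (TypeError, ValueError):
            continue  # blank or non-numeric slot
        if best is None or chance > best[2]:
            best = (suffix, row.get(f"name_{suffix}", ""), chance)
    return best


# Made-up placeholder row, NOT a real forecast.
row = {
    "name_D1": "Candidate A", "winner_D1": "0.62",
    "name_D2": "", "winner_D2": "",
    "name_R1": "Candidate B", "winner_R1": "0.37",
    "name_O1": "Others", "winner_O1": "0.01",
}
print(most_likely_winner(row))
```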

senate_seat_distribution.csv and house_seat_distribution.csv contain the probability of each distribution of seats for each day's forecast run. These sheets contain the following additional columns:

Column | Description
--- | ---
seatsheld | Number of seats held by each party
seatprob_Dparty | Probability that Democrats will hold seatsheld number of seats
seatprob_Rparty | Probability that Republicans will hold seatsheld number of seats
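
A seat distribution like this can be summarized by its expected value or by the chance of reaching a given number of seats. The following sketch does both under the assumption that the rows passed in come from a single forecast run and model type. The sample numbers are placeholders, not forecast values, and both helper names are hypothetical.

```python
import csv
import io

# Made-up placeholder rows, NOT real forecast values.
SAMPLE = """seatsheld,seatprob_Dparty,seatprob_Rparty
49,0.25,0.25
50,0.25,0.25
51,0.25,0.25
52,0.25,0.25
"""


def expected_seats(rows, prob_col="seatprob_Dparty"):
    """Probability-weighted mean of seatsheld."""
    return sum(int(r["seatsheld"]) * float(r[prob_col]) for r in rows)


def chance_of_at_least_seats(rows, threshold, prob_col="seatprob_Dparty"):
    """Sum of prob_col over rows with seatsheld >= threshold."""
    return sum(float(r[prob_col]) for r in rows if int(r["seatsheld"]) >= threshold)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(expected_seats(rows), chance_of_at_least_seats(rows, 51))
```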

senate_seat_distribution.csv also contains the following additional columns:

Column | Description
--- | ---
chamber_Dparty | Probability that each party will hold 50 seats and Democrats will control the Senate
chamber_Rparty | Probability that each party will hold 50 seats and Republicans will control the Senate

senate_fundamentals.csv and house_fundamentals.csv contain the fundamentals for each Senate and House race. These sheets contain the following additional columns:

Column | Description
--- | ---
component_no | Number from 1 to 11 corresponding with each component_name
component_name | One of the following values: Incumbency, District partisanship, Incumbent's margin in last election, Generic ballot, Fundraising, Incumbent's voting record in Congress, Challenger experience, Scandals, Top-two primary margin, Number of candidates, Total
component_impact | The impact each component has on the chances of winning
component_narrative | Narrative explanation of that component for a particular race
genre | Number of Democrats and Republicans running in a race. For example, DR means that there is one Democrat and one Republican in the race, while DDR means there are two Democrats and one Republican in the race.
candidateA, candidateB | Full names of the first and second candidates in the race
shortnmA, shortnmB | Last names of the first and second candidates in the race

senate_steps.csv and house_steps.csv contain intermediate calculation steps performed in calculating the chance of winning from the polling average in a particular state.

Column | Description
--- | ---
displaystep | A value from 1 to 4 indicating the order in which each calculation step is performed
description | A description of each step
margin | Forecasted margin in this step for candidateA - candidateB
lite_weight, classic_weight, deluxe_weight | Relative share of each forecast that is derived from each component during the calculation of this step

joint_probabilities.csv contains the probabilities of each possible combination of Democratic or Republican control of the Senate, House and Presidency.

Column | Description
--- | ---
expression | _lite, _classic or _deluxe
DsenateDhouseDpotus, DsenateRhouseDpotus, RsenateDhouseDpotus, RsenateRhouseDpotus, DsenateDhouseRpotus, DsenateRhouseRpotus, RsenateDhouseRpotus, RsenateRhouseRpotus | Probability of each possible outcome for Democratic or Republican control of the Senate, House and Presidency
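
The eight outcome columns encode the controlling party of each body in their names, so a marginal chance (for example, Democratic control of the Senate regardless of the other two) can be recovered by summing the matching columns. The sketch below parses the names with a regular expression. The row values are placeholders, not forecast values, and `marginal` is a hypothetical helper.

```python
import re

# Matches names like DsenateRhouseDpotus and captures the three party letters.
OUTCOME = re.compile(r"^([DR])senate([DR])house([DR])potus$")
POSITION = {"senate": 0, "house": 1, "potus": 2}


def marginal(row, office, party):
    """Sum the joint probabilities in which `party` controls `office`.

    Columns that do not match the outcome naming pattern (such as
    expression) are ignored.
    """
    index = POSITION[office]
    total = 0.0
    for column, value in row.items():
        match = OUTCOME.match(column.strip())
        if match and match.groups()[index] == party:
            total += float(value)
    return total


# Made-up placeholder row, NOT a real forecast: every outcome equally likely.
row = {"expression": "_deluxe"}
for s in "DR":
    for h in "DR":
        for p in "DR":
            row[f"{s}senate{h}house{p}potus"] = "0.125"

print(marginal(row, "senate", "D"), row["DsenateDhouseDpotus"])
```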

Files

presidential_national_toplines_2020.csv
presidential_state_toplines_2020.csv
presidential_polls_2020.csv
presidential_poll_averages_2020.csv
presidential_ev_probabilities_2020.csv
presidential_scenario_analysis_2020.csv
presidential_forecast_steps.csv
economic_index.csv
electoral_college_vs_popvote.csv
senate_national_toplines_2020.csv
senate_state_toplines_2020.csv
house_national_toplines_2020.csv
house_district_toplines_2020.csv
senate_fundamentals.csv
house_fundamentals.csv
senate_seat_distribution.csv
house_seat_distribution.csv
senate_steps.csv
house_steps.csv
joint_probabilities.csv