The ELection Violence Intelligence System (ELVIS) is an early warning approach to predicting the risk of election violence for every major national election across the globe. ELVIS provides regularly updated forecasts for each election in the coming month, alongside an annual risk table and an archived risk table that dates back to 2000. We utilize an ensemble machine learning ecosystem to produce (1) predictions, (2) predictor importance measurements, (3) prediction/outcome relationship visualizations, and (4) evaluations of predictive accuracy.
ELVIS was developed at the One Earth Future Foundation under the research department with the intent of providing both substantive and actionable information about the risk of election violence for upcoming elections.
As with our CoupCast project, we use original data alongside ensemble machine learning to forecast the risk of election violence. As a result, ELVIS contains three major components:
Data: We continue to apply the coding rules of the National Elections Across Democracy and Autocracy (NELDA) dataset to code relevant national elections each year. NELDA's coverage of election events ends in 2012, so we have extended the data through the end of 2019. We code general election violence and government harassment/violence using the same rules as NELDA, and continue to code these outcomes for each new election event. Our predictor variables combine environmental, economic, social, and political measures found in our Rulers, Elections, and Irregular Governance (REIGN) dataset.
Modeling: We combine several methods to produce more accurate predictions of election violence. First, we train our algorithm with a rolling origin cross-validation procedure that respects temporal sequence, which allows us to avoid training on future data. Second, we use an ensemble of three classification algorithms: a greedy optimization procedure applied on top of (1) a random forest, (2) a logistic regression, and (3) a neural network.
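The two modeling steps above can be illustrated with a minimal sketch. This is not the ELVIS code: the function names, the toy probabilities, the Brier score used as the selection loss, and the forward-selection-with-replacement form of the greedy procedure are all assumptions made for illustration.

```python
def rolling_origin_splits(years, min_train_years=3):
    """Yield (train_idx, test_idx) pairs in which every training row
    comes from a year strictly before the test year."""
    unique_years = sorted(set(years))
    for test_year in unique_years[min_train_years:]:
        train_idx = [i for i, y in enumerate(years) if y < test_year]
        test_idx = [i for i, y in enumerate(years) if y == test_year]
        yield train_idx, test_idx


def brier_score(y_true, probs):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, y_true)) / len(y_true)


def greedy_ensemble_weights(model_probs, y_true, n_rounds=20):
    """Greedy forward selection with replacement (an assumed variant).

    In each round, add the model whose inclusion gives the averaged
    ensemble the lowest Brier score. A model's final weight is the share
    of rounds in which it was picked.
    """
    counts = {name: 0 for name in model_probs}
    running_sum = [0.0] * len(y_true)
    for round_no in range(1, n_rounds + 1):
        best_name, best_loss = None, float("inf")
        for name, probs in model_probs.items():
            candidate = [(s + p) / round_no for s, p in zip(running_sum, probs)]
            loss = brier_score(y_true, candidate)
            if loss < best_loss:
                best_name, best_loss = name, loss
        counts[best_name] += 1
        running_sum = [s + p for s, p in zip(running_sum, model_probs[best_name])]
    return {name: count / n_rounds for name, count in counts.items()}


# Toy election years (hypothetical): each split trains only on earlier years.
election_years = [2000, 2000, 2001, 2002, 2003, 2003]
splits = list(rolling_origin_splits(election_years, min_train_years=2))

# Toy held-out probabilities from three hypothetical base classifiers.
outcomes = [0, 1, 0, 1]
held_out = {
    "random_forest": [0.2, 0.9, 0.1, 0.8],
    "logistic_regression": [0.4, 0.6, 0.5, 0.5],
    "neural_network": [0.5, 0.5, 0.5, 0.5],
}
weights = greedy_ensemble_weights(held_out, outcomes)
```

In practice the held-out probabilities fed to the greedy step would come from the rolling origin test folds, so that the ensemble weights are also chosen without looking at future data.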
You can download our monthly updated training data for ELVIS here. (UPDATED APRIL 5TH 2019)
If you wish to use the ELVIS data for your own projects, we recommend the following citation:
If you are citing ELVIS, we also encourage citing the following data:
Hyde, Susan D., and Nikolay Marinov. 2012. Which Elections Can Be Lost? Political Analysis 20(2): 191-201.
Bell, Curtis. 2016. The Rulers, Elections, and Irregular Governance Dataset (REIGN). Broomfield, CO: OEF Research. Available at oefresearch.org
For any questions concerning ELVIS, our modeling, or data, please contact Clayton Besaw (cbesaw@oneearthfuture.org)
For each year, we supply both an annual risk table that is updated at the beginning of the year (2019 ANNUAL RISK TABLE) and a monthly updated risk table for remaining elections (REMAINING 2019 NATIONAL ELECTIONS). Below we provide a brief description of each variable displayed, but feel free to contact Clayton Besaw (cbesaw@oneearthfuture.org) with any additional questions.
Country: Country in which the election event is taking place.
Election Date: Date on which the election event is set to take place, in month/day/year format. This information is likely to change, as election dates are often volatile or delayed.
Percentile: The percentile ranking for each election event based on the complete distribution of election violence forecasts.
Risk Change since January: The change in the probability of election-related violence between the latest monthly update and the baseline forecast made at the beginning of the year (monthly risk table only).
Outcome: Dichotomous classification of whether an election event was peaceful or experienced election-related violence before/during/after the event. Updated as election events are completed (annual risk table only).
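The Percentile and Risk Change since January columns described above are simple arithmetic on the forecast probabilities. The sketch below is a hedged illustration only: the function names are hypothetical, and the "share of forecasts at or below this value" definition of a percentile is an assumption, since the exact convention used in the tables is not stated.

```python
def percentile_rank(value, distribution):
    """Percent of forecasts in `distribution` that are at or below `value`
    (assumed convention)."""
    at_or_below = sum(1 for v in distribution if v <= value)
    return 100.0 * at_or_below / len(distribution)


def risk_change(current_prob, january_prob):
    """Change in forecast probability relative to the January baseline."""
    return current_prob - january_prob


# Toy forecast probabilities (hypothetical values).
all_forecasts = [0.1, 0.2, 0.3, 0.4]
rank = percentile_rank(0.3, all_forecasts)   # 3 of 4 forecasts are <= 0.3
change = risk_change(0.35, 0.20)             # positive means risk has risen
```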
This archive provides the dates of over two thousand unique national election events and records whether election-related violence occurred before/during/after each event. You can use the search box to look up specific countries, and the columns can be used to sort each variable according to its measurement type. If you wish to have access to the full training data, which includes this information, you can download it here.
Country: Country in which the election event is taking place.
Election Date: Date on which the election event is set to take place, in month/day/year format. This information is likely to change, as election dates are often volatile or delayed.
Outcome: Dichotomous classification of whether an election event was peaceful or experienced election-related violence before/during/after the event. Updated as election events are completed.
Because our system tries to classify the likelihood of election violence occurring, we use a measure of accuracy called the area under the curve (AUC) score. For a useful classifier, AUC scores range from .5 (random guessing) to 1 (perfect accuracy), and can be interpreted using the following heuristic.
Using rolling origin cross-validation on historical data (1975-2017), we obtained an average AUC score of 0.88, corresponding to an average of 77 percent of elections correctly classified. The figures to the right display the change in AUC scores for out-of-sample data through 2017. As expected, the inclusion of more data over time has resulted in impressive AUC scores (0.91 in 2016 and 0.91 in 2017) for recent election years. Overall, our model has a good track record on historical data, which suggests an actionable level of accuracy for each new slate of national elections.
As of October 26th, 2018, our model has achieved an AUC score of 0.83 corresponding to an accuracy of 80 percent.
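The AUC and accuracy figures reported above can be computed from forecast probabilities and observed outcomes. The following is a minimal sketch rather than the ELVIS evaluation code: the function names and toy data are hypothetical, and the 0.5 classification threshold used for accuracy is an assumption. AUC is computed here as the probability that a randomly chosen violent election receives a higher forecast than a randomly chosen peaceful one, with ties counted as one half.

```python
def auc_score(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives
    (violent elections, coded 1) and negatives (peaceful, coded 0)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(y_true, scores, threshold=0.5):
    """Share of elections correctly classified at the given threshold."""
    correct = sum(1 for y, s in zip(y_true, scores) if (s >= threshold) == (y == 1))
    return correct / len(y_true)


# Toy outcomes and forecasts (hypothetical values).
observed = [0, 0, 1, 1]
forecasts = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_score(observed, forecasts)   # 3 of 4 positive/negative pairs ordered correctly
toy_acc = accuracy(observed, forecasts)    # 3 of 4 elections classified correctly
```

Because AUC depends only on how forecasts are ranked, it does not change with the classification threshold, whereas accuracy does.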
The bar plots to the right display the variable importance for each predictor included in our model. Variable importance is an estimate of how much each individual variable contributes to predictive accuracy. This is achieved by running multiple permutations of our forecasting models in which each variable, in turn, is replaced with random noise. If accuracy decreases when a variable is replaced, then that variable is deemed more important for predictive accuracy. Our greedy ensemble classifier allows us to utilize the idiosyncratic nature of individual classifiers to build an overall measure of predictor importance. Each predictor is briefly described below:
GDP per Capita: Measurement of GDP per Capita for the country-year of the election event.
Population: Logged measure of population for the country-year of the election event.
Infant Mortality Rate: Logged measure of infant mortality rate for the country-year of the election event.
Coup Risk: Estimated risk of a military coup in the month of the election. Taken from REIGN.
Quality of Democracy: Score ranging from -10 (low quality/authoritarian) to 10 (high quality democracy) for the country-year of the election event. Taken from Polity IV.
Economic Growth: Percent growth/decline in GDP for the country-year of the election event.
Relative Precipitation: Estimated relative level of rainfall (Standardized Precipitation Index, SPI) in the month of the election. Base data taken from NOAA’s PRECipitation REConstruction over Land (PREC/L) data.
Regime Tenure: Number of months a regime has been in power as of the month of the election. Taken from REIGN.
Political Competition: Level of political competition for the country-year of the election event. Taken from Polity IV.
Executive Constraints: Level of constraints on the political executive for the country-year of the election event. Taken from Polity IV.
Time Since Last Election: Number of months since the last successful election, as of the month of the election. Taken from REIGN.
Election in Next Six Months: Dichotomous measure of whether another election is expected within six months of an election event. Taken from REIGN.
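The variable importance procedure described before the predictor list can be sketched as follows. This is an illustration under stated assumptions, not the ELVIS implementation: the "random noise" replacement is approximated here by shuffling a predictor's observed values across elections (a common variant), accuracy at a 0.5 threshold is used as the metric, and the function names, toy model, and toy data are hypothetical.

```python
import random


def accuracy(y_true, probs, threshold=0.5):
    """Share of cases correctly classified at the given threshold."""
    correct = sum(1 for y, p in zip(y_true, probs) if (p >= threshold) == (y == 1))
    return correct / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average drop in accuracy when each column is shuffled in turn.

    `predict` maps one row (a list of predictor values) to a probability.
    A larger drop means the model relies more on that predictor.
    """
    rng = random.Random(seed)
    baseline = accuracy(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the data with only this one column scrambled.
            X_perm = [list(row[:col]) + [v] + list(row[col + 1:])
                      for row, v in zip(X, shuffled)]
            permuted = accuracy(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy model (hypothetical) that uses only the first predictor.
def toy_predict(row):
    return 0.9 if row[0] > 0.5 else 0.1


X_toy = [[0, 5], [0, 3], [0, 8], [0, 1], [1, 2], [1, 7], [1, 4], [1, 6]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]
toy_importances = permutation_importance(toy_predict, X_toy, y_toy)
```

In this toy case the second predictor is never used by the model, so scrambling it cannot change any prediction and its importance is exactly zero.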
To further provide actionable information about ELVIS, we utilize partial dependence plots to visualize the relationship between the values of our predictors and the expected risk of election-related violence. Partial dependence plots are similar in concept to marginal effects. Essentially, these calculations tell us the expected relationship between any one predictor variable and the expected classification of election-related violence while averaging over the observed values of all other predictors. For brief descriptions of each predictor, please see our variable importance page. For a brief guide on interpretation, see the description below.
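The calculation behind a partial dependence curve can be sketched in a few lines. This is a minimal illustration, not the ELVIS plotting code: the function name, the toy model, and the toy data are hypothetical. For each value on a grid, the chosen predictor is set to that value in every row, the model is asked for a prediction, and the predictions are averaged over all rows.

```python
def partial_dependence(predict, X, col, grid):
    """Average prediction at each grid value of column `col`,
    holding every other predictor at its observed values."""
    curve = []
    for value in grid:
        preds = []
        for row in X:
            modified = list(row)
            modified[col] = value   # overwrite only the predictor of interest
            preds.append(predict(modified))
        curve.append(sum(preds) / len(preds))
    return curve


# Toy model (hypothetical): the output rises by 2 per unit of the first predictor.
def toy_model(row):
    return 2 * row[0] + row[1]


X_toy = [[0, 1], [0, 3]]
curve = partial_dependence(toy_model, X_toy, col=0, grid=[0, 1, 2])
```

Plotting `curve` against the grid values gives the partial dependence plot for that predictor; a rising curve indicates that higher values of the predictor are associated with higher predicted risk.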