Mebane: A Layman's Guide to Statistical Election Forensics

May 27, 2010, 8:14 p.m.

Election forensics is a name coined to describe a nascent field of social science intended to develop statistical methods that can be used to verify whether election results are accurate. Concern with an election's soundness is not new. In an analytical sense one can trace attention to the fairness of elections back to Condorcet, writing at the end of the eighteenth century in France. Saltman (2006) traces the development of technology to try to ensure the integrity of elections in the United States back to the nineteenth century. During the 1960s computing technology began to be used in elections, followed in short order by statistical error detection procedures such as California's requirement that one percent of ballots be recounted manually. ((Saltman 2006: 162-167)) After the 2000 presidential election in the United States, concerns about technical and administrative failures in elections increased dramatically in light of the role such failures played in George W. Bush's taking office. ((Wand, Shotts, Sekhon, Mebane, Herron, and Brady 2001; Mebane 2004)) By 2004 the idea of consequential failures had evolved into beliefs about widespread election fraud. ((Miller 2005)) American computer scientists raised alarms about the unreliability of the computerized voting systems that were being deployed in increasing numbers around the country. ((Rubin 2006)) In this environment election forensics emerged as a collection of methods intended not only to detect election inaccuracies but also to diagnose attempted election fraud. Having methods to detect fraud, some hoped, might help to deter fraud. ((Alvarez, Hall, and Hyde 2008))

Fear of fraudulent elections is of course not limited to U.S. elections. With the spread of democratization late in the twentieth century, elections occurred in more and more countries. As Bjornlund (2004) describes, an increasing number of governmental and nongovernmental organizations became involved in election monitoring. Such monitoring is based primarily on election observation, often supported by methods such as parallel vote tabulation (PVT). ((Estok, Nevitte, and Cowan 2002)) But election monitoring is usually focused more on the conditions under which elections are conducted - on whether they are “free and fair” - than on whether they are accurate.

The emphasis on election accuracy is what is new in the idea of election forensics. The question is whether the preferences of the electorate have been correctly translated into the election outcome.

Translating preferences into election outcomes is not a simple matter. What we learn from Condorcet and subsequently from the field of social choice theory is that mappings from preferences to collective choices are very complicated. ((Saari 2001)) A third candidate on the ballot can change the relative outcome between two other candidates. Voters may act strategically, voting for a candidate they prefer less than another candidate to try to defeat yet another candidate they like even less. ((Cox 1994)) Campaigns can change voters' beliefs about what is likely to happen and therefore cause voters to act differently. Voting is not at all simple. ((Riker 1982))

The complexity of election returns motivates the method proposed by Myagkov, Ordeshook, and Shakin (2009). Drawing especially on data from recent Russian and Ukrainian elections, they focus on deviations from a unimodal distribution of voter turnout. When deviations are found, they try to interpret what happened using whatever scraps of information are available.
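
The turnout-distribution idea can be illustrated with a simple kernel density sketch. The code below is not Myagkov, Ordeshook, and Shakin's actual procedure (they work with much richer comparisons across elections); it is a minimal illustration, assuming only an array of precinct-level turnout shares, of how one might count the modes of an estimated turnout density - a second mode near 100 percent turnout being the kind of anomaly they examine.

```python
import numpy as np

def count_turnout_modes(turnout, bandwidth=0.03, grid=512):
    """Count local maxima of a Gaussian kernel density estimate of
    precinct-level turnout shares (values in [0, 1])."""
    x = np.linspace(0.0, 1.0, grid)
    # Evaluate an (unnormalized) Gaussian kernel density on the grid.
    diffs = (x[None, :] - np.asarray(turnout)[:, None]) / bandwidth
    density = np.exp(-0.5 * diffs ** 2).sum(axis=0)
    # An interior grid point is a mode if it exceeds both neighbors.
    interior = density[1:-1]
    modes = (interior > density[:-2]) & (interior > density[2:])
    return int(modes.sum())
```

On turnout data forming a single cluster the function returns 1; adding a second tight cluster near 95 percent turnout produces a second mode. The bandwidth controls how much smoothing is applied before modes are counted, so in practice the choice of bandwidth matters.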

Other methods look at the digits in vote counts. Beber and Scacco (2008) propose that the least significant digits of vote counts should be uniformly distributed if they are produced by natural processes but not if they are faked. But many kinds of problems with election counts, such as machine failures, may themselves be natural processes. While the method is important, the range of cases it can potentially cover seems limited.
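
The last-digit idea reduces to a standard goodness-of-fit calculation. The sketch below - an illustration of the general approach, not Beber and Scacco's exact implementation - assumes only a collection of raw vote counts and compares the observed last-digit frequencies to the uniform expectation with a Pearson chi-square statistic.

```python
import numpy as np

# Critical value for chi-square with 9 degrees of freedom at the 5% level.
CHI2_9_CRIT_05 = 16.919

def last_digit_chi2(counts):
    """Pearson chi-square statistic testing whether the last digits of
    a set of vote counts are uniformly distributed on 0-9."""
    digits = np.asarray(counts, dtype=int) % 10
    observed = np.bincount(digits, minlength=10)
    expected = len(digits) / 10.0
    return float(((observed - expected) ** 2 / expected).sum())
```

A statistic above the critical value suggests the last digits are not uniform - which, on the Beber and Scacco logic, is a warning sign that the counts may have been fabricated by hand.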

Another method based on vote counts' digits looks at the second significant digits of low-level vote counts, such as precinct vote counts, and asks whether the digits follow the pattern specified by Benford's Law. ((Mebane 2006)) This method has been used to study Russian elections, ((Mebane and Kalinin 2009, 2010)) an Iranian election, ((Mebane 2010b)) elections in Mexico, ((Mebane 2010b)) and other elections. In Mebane (2010b), a covariate (the proportion of invalid ballots) was crucial in diagnosing an important kind of fraud in the 2009 Iranian presidential election. In general it appears that even though an unconditional analysis of vote counts' second digits can sometimes be informative, in order to diagnose what happened in an election it can be essential to associate the digits with an appropriate covariate - even the apparent margin of victory in the race. The frontier here is determining what effects complications such as strategic voting and gerrymandering, and alternative voting rules such as plurality voting or proportional representation, can have on the digit distribution. All seem to affect the distribution of vote counts' second digits systematically, and a question is how the resulting patterns differ from those induced by various kinds of election fraud. ((Mebane 2010b)) The differences are clear in some cases, not so clear in others. Digit-based methods may be especially important in circumstances where other, richer sources of information are not available.
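
The unconditional version of the second-digit test can likewise be sketched as a goodness-of-fit calculation. Under Benford's Law the second significant digit d occurs with probability equal to the sum over first digits j of log10(1 + 1/(10j + d)). The code below illustrates that comparison; Mebane's published analyses also use other statistics and, as noted above, condition on covariates, so this is only the simplest form of the idea.

```python
import numpy as np

def benford_second_digit_probs():
    """Benford's Law probabilities for the second significant digit 0-9."""
    d = np.arange(10)
    j = np.arange(1, 10)[:, None]   # possible first digits 1-9
    return np.log10(1.0 + 1.0 / (10 * j + d)).sum(axis=0)

def second_digit_chi2(counts):
    """Pearson chi-square statistic comparing the second significant
    digits of vote counts (counts of 10 or more) to the Benford
    expectation.  Compare to the chi-square critical value with 9
    degrees of freedom, 16.919 at the 5% level."""
    kept = [int(c) for c in counts if c >= 10]
    second = np.array([int(str(c)[1]) for c in kept])
    observed = np.bincount(second, minlength=10)
    expected = len(kept) * benford_second_digit_probs()
    return float(((observed - expected) ** 2 / expected).sum())
```

The Benford probabilities decline from about 0.120 for a second digit of 0 to about 0.085 for a second digit of 9, giving an expected second-digit mean of roughly 4.187.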

The question of accuracy goes well beyond concerns with whether each cast ballot is counted in the way the voter intended, but obviously faithfulness and correctness in counting is essential. One area of election forensics focuses on so-called post-election audits, in which a random sample of ballots is manually tabulated and the tallies are compared to official election outcomes. ((Norden, Burstein, Hall, and Chen 2007)) These methods are motivated by skepticism about the reliability of electronic machine counts. For credibility, these audits require that votes be recorded with a voter-verified paper audit trail and maintained with a sound chain of custody. All the ballots produced for the election must be accounted for. Feasible procedures have been developed to conduct such audits so that the idea of confidence in the vote count is tied directly to the idea of a conventional statistical hypothesis test. ((Hall, Miratrix, Stark, Briones, Ginnold, Oakley, Peaden, Pellerin, Stanionis, and Webber 2009)) Unlike PVT, these methods are intended to check the original ballot tabulation and not primarily how tabulations are being reported. The administrative requirements to support such procedures are considerable.
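
The statistical core of the simplest audit design is a hypergeometric calculation: how likely is a random sample of audit units to contain at least one corrupted unit? The sketch below illustrates that calculation; the risk-limiting procedures cited above are considerably more sophisticated (they account for margins and escalate sampling as needed), so the function names and the fixed-size design here are illustrative assumptions only.

```python
from math import comb

def detection_probability(total, corrupted, sampled):
    """Chance that a simple random sample of `sampled` audit units
    contains at least one of `corrupted` bad units out of `total`."""
    if sampled > total - corrupted:
        return 1.0   # too few clean units to fill the sample
    return 1.0 - comb(total - corrupted, sampled) / comb(total, sampled)

def sample_size_for_confidence(total, corrupted, confidence=0.95):
    """Smallest sample size whose detection probability reaches
    `confidence`, found by direct search."""
    for n in range(total + 1):
        if detection_probability(total, corrupted, n) >= confidence:
            return n
    return total
```

For example, with 1000 precincts of which 50 are assumed corrupted, a sample of roughly 57 precincts is needed for 95 percent confidence of catching at least one - which conveys why the administrative burden of serious auditing is considerable.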

The investigation of technological flaws in the machinery of voting is another aspect of election forensics. A review in California found defects in all the electronic voting systems that were examined, even those used to count optically scanned paper ballots, prompting the Secretary of State to withdraw approval of several voting systems. ((Bowen 2007)) Investigations of this kind highlight limitations of any statistical analysis. In the case of the U.S. presidential election of 2000, considerable effort was devoted to demonstrating the pivotal role of ballot formats, but a similar effort to blame ballot format for about 18,000 undervotes recorded in the 2006 U.S. House election in Sarasota, Florida, attracted a skeptical response that claimed that the role of electronic hardware failures was being understated. ((Frisina, Herron, Honaker, and Lewis 2008; Mebane 2008)) Only suitable physical testing of all the equipment used in the election could resolve the question, but the testing that was done did not do this and so failed to be convincing. ((GAO 2008))

Allegations of problems induced by ballot format move some way from concern with pure counting to focus on things that interfere with voters' efforts to act on their intentions. The analysis in Wand et al. (2001) ((Wand, Shotts, Sekhon, Mebane, Herron, and Brady 2001)) was based on this line of thinking. A technique introduced there uses a robustly estimated regression model for vote counts to identify outliers: places that do not relate to a set of covariates in the same way as most of the other places do. ((See also: Mebane and Sekhon 2004)) In one case the covariates were functions of previous election results and of demographic variables, and the places were counties. The question was how unusual the result was in Palm Beach county. A limitation of the outlier detection methodology is revealed in a comparison between Iran and the United States: when the covariates are functions of previous election results, there are many outliers whenever candidates who get small numbers of votes are included. ((Mebane 2010b)) This reflects the fact that voters' intentions comprise both preferences and considerations of strategy. While voters' preferences may be very similar in successive elections, the strategic situation can be very different. Classic strategic voting produces substantial changes in the proportion of votes losing candidates receive.
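
The outlier logic can be sketched in a deliberately simplified form. Wand et al. actually estimate a robust overdispersed multinomial model; the toy version below substitutes ordinary least squares plus a median-absolute-deviation scale estimate, flagging units whose residuals are extreme relative to the bulk of the data. The threshold and covariate setup are illustrative assumptions, not the published specification.

```python
import numpy as np

def flag_outliers(X, y, threshold=4.0):
    """Fit a linear regression of an outcome (e.g., a candidate's vote
    share) on covariates (e.g., previous election results), then flag
    units whose residuals are far from the median residual relative to
    a robust scale estimate (the median absolute deviation)."""
    X = np.column_stack([np.ones(len(y)), X])   # add an intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    med = np.median(residuals)
    mad = np.median(np.abs(residuals - med))
    scale = 1.4826 * mad   # makes MAD consistent for normal errors
    return np.abs(residuals - med) > threshold * scale
```

Because the median and MAD ignore extreme values, a single aberrant county (a Palm Beach, say) stands out against the fit that the other counties jointly determine, rather than dragging the fit toward itself.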

Election forensics is an active area of research. The methods reviewed here are not an exhaustive list. Some important aspects of the general idea of election fraud differ from the notion of accuracy I have emphasized and may go beyond the scope of what a statistical analysis can reveal. ((Lehoucq 2003)) For example, statistical methods may have little to say about the situation where candidates are forcibly denied access to the ballot. But there is every prospect that soon we will have methods that provide objective standards for saying in many circumstances whether election results are accurate.

Walter R. Mebane, Jr., is Professor of Political Science and Professor of Statistics at the University of Michigan, Ann Arbor. He is working on a book manuscript entitled Election Forensics. Download a PDF of this article and its references here.
