Voting systems throughout the United States are vulnerable to corruption in a variety of ways, and the federal government has an obligation to protect the integrity of the electoral process. At a recent meeting of the National Academies of Sciences, Engineering and Medicine’s Committee on the Future of Voting, the Department of Homeland Security’s Robert Kolasky put it bluntly: “It’s not a fair fight to pit Orange County (California) against the Russians.”
While the intelligence community has not confirmed that the hackers working on behalf of the Russian government to undermine the 2016 election were successful at tampering with actual vote tallies, they certainly succeeded at shaking our confidence in the electoral process, which over time could undermine faith in democracy.
The management of statewide eligible voter lists is a particularly challenging but crucial responsibility. On the one hand, data entry errors, duplicate records and “live” records for deceased voters invite voter fraud and inaccuracies in voting data. On the other hand, overly broad purging of voter lists can result in the exclusion of eligible voters from the rolls.
Two problems with voter list maintenance
Validation of voter eligibility is typically done through “matching” of individuals on voter registration lists with other databases using unique combinations of traits of eligible individuals (birthdays and names, etc.). This process is error-prone in two ways. First, data may not be entered identically for individuals across databases (misspelled names, missing data, etc.), so that individuals fail to get matched and are excluded (false negatives). Second, the computer algorithms used to identify and match records may be imprecise, such that they match the wrong people (false positives) or fail to match records that belong to the same person (false negatives).
Both assumptions behind matching, that the databases contain correct data and that the algorithm correctly identifies individual matches, have proven unreliable. For example, research has shown that the surprisingly high probability that two people in a group of a given size share the same birthday can largely account for inflated estimates of double registrations and double voting. That is, an algorithm that matches on birthday, and possibly last name, is a poor method for identifying voters, because many people share those traits.
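The birthday effect described above is easy to check with a few lines of arithmetic. The following Python sketch is an illustration only, not the method used in the research cited: it assumes birthdays are spread evenly over 365 days and ignores the year of birth, and the function names are hypothetical.

```python
def prob_shared_birthday(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday.

    Assumes (for illustration) that birthdays are uniformly distributed
    over `days` possible values and that people are independent.
    """
    p_all_distinct = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def expected_matching_pairs(n: int, days: int = 365) -> float:
    """Expected number of pairs, among n people, that share a birthday."""
    return n * (n - 1) / 2 / days


if __name__ == "__main__":
    for group_size in (10, 23, 50, 100):
        print(group_size, round(prob_shared_birthday(group_size), 3))
```

With only 23 people the chance of at least one shared birthday is already just over 50%, so in a list of thousands of voters with a common last name, coincidental matches are expected rather than suspicious.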
But even that poor algorithm assumes that the underlying data is accurate, and often it is not. Even databases containing precisely individualized identifiers, like Social Security numbers, include enough error to be inappropriate for matching. Indeed, the Social Security Administration accidentally declares over 10,000 people dead every year, and attempts to match voter lists using the last four SSN digits have produced error rates above 20%, such that the SSA Inspector General has warned against using them for this purpose.
Sloppy matching algorithms that do not attempt to correct for such data inaccuracies are prone to exclude high numbers of eligible voters. For example, the Crosscheck system, developed by the president’s Electoral “Integrity” Commission Chair Kris Kobach, has produced error rates as high as 17% in Chesterfield County, VA, prompting officials there to abandon the software.
Two solutions that improve voter list management
The solution to these problems is thus twofold: improving the quality of matching algorithms in order to create precise identifiers and overcome data inaccuracies, and reducing the probability of ineligible voters or inaccurate data getting on the voter list to begin with.
Recent advances in algorithmic design have shown that using multiple matching criteria, with data recoded to account for common data entry inaccuracies, can yield matches that are 99% accurate. For example, Stephen Ansolabehere and Eitan D. Hersh have demonstrated that matching on combinations of three of the four fields Address (A), Date of Birth (D), Gender (G) and Name (N), or ADGN, is extremely effective (and helps explain how Facebook knows everything about you).
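To make the idea of multi-field matching concrete, here is a minimal Python sketch. It is not the Ansolabehere and Hersh implementation: the `Record` fields, the normalization rules and the "at least three of four fields agree" threshold are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical record layout; real voter files differ by state."""
    address: str
    dob: str      # date of birth as an ISO string, e.g. "1970-01-31"
    gender: str
    name: str


def normalize(text: str) -> str:
    """Recode a field to absorb common data entry differences:
    lowercase it, drop punctuation, and collapse extra whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(cleaned.split())


def agreeing_fields(a: Record, b: Record) -> int:
    """Count how many of the four ADGN fields agree after normalization.
    Empty fields never count as agreement."""
    pairs = [
        (a.address, b.address),
        (a.dob, b.dob),
        (a.gender, b.gender),
        (a.name, b.name),
    ]
    return sum(
        1 for x, y in pairs
        if normalize(x) and normalize(x) == normalize(y)
    )


def is_match(a: Record, b: Record, min_fields: int = 3) -> bool:
    """Treat two records as the same person only if at least
    `min_fields` of the four fields agree (an assumed threshold)."""
    return agreeing_fields(a, b) >= min_fields


if __name__ == "__main__":
    voter = Record("12 Main St.", "1970-01-31", "F", "Ann O'Neil")
    dmv = Record("12 main st", "1970-01-31", "f", "Ann ONeil")
    stranger = Record("99 Oak Ave", "1970-01-31", "F", "Bob Smith")
    print(is_match(voter, dmv))       # same person despite formatting noise
    print(is_match(voter, stranger))  # shares only birth date and gender
```

Requiring several fields to agree guards against the coincidental birthday matches described earlier, while the normalization step keeps trivial formatting differences from causing missed matches.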
For securing and maintaining precise voter list data from the start, the implementation of automatic voter registration, or AVR, is proving increasingly effective and popular. By automatically registering all eligible adults (unless they decline) when they update their own data through government agencies, and by transferring that data electronically on a regular basis, the process “boosts registration rates, cleans up the rolls, makes voting more convenient, and reduces the potential for voter fraud, all while lowering costs,” according to the Brennan Center for Justice, which advocates AVR.
Ten states and the District of Columbia have already approved AVR, and 32 states have introduced AVR proposals this year. It is a bipartisan solution, with states as different as California and Alaska having already adopted the practice.
Rhode Island’s Democratic Secretary of State, Nellie Gorbea, has stated, “Having clean voter lists is critical to preserving the integrity of our elections, which is why I made enacting Automatic Voter Registration a priority.”
Republican Governor of Illinois Bruce Rauner, on signing his state’s AVR law, explained that “This is good bipartisan legislation and it addresses the fundamental fact that the right to vote is foundational for the rights of Americans in our democracy.”
Given the seriousness of the threat, and the fact that such effective solutions for voter list management have already been developed, Congress should ensure that states have the capacity to implement these policies, which are among the most important infrastructure investments that we can make.