
The Unexplained Exit Poll Discrepancy- an impossible coincidence

by Open-Publishing - Saturday 13 November 2004

Elections-Elected USA

Steven F. Freeman, PhD
University of Pennsylvania
November 10, 2004

stfreema@sas.upenn.edu


Most Americans who listened to radio or surfed the Internet on Election Day this year sat down to watch election night coverage expecting that John Kerry had been elected President. Exit polls showed him ahead in nearly every battleground state, in many cases by sizable margins. As usually happens in close elections, undecided voters broke heavily toward the challenger, and the Democratic Party, possibly better organized and more committed than ever in its history, generated extraordinary turnout.

But then in key state after key state, counts were showing very different numbers than the polls predicted; and the differentials were all in the same direction. The first shaded column in Table 1 shows the differential between the major candidates’ predicted (exit poll) percentages of the vote; the next shaded column shows the differential between their tallied percentages of the vote. The final shaded column reveals the "shift." In ten of the eleven consensus battleground states1, the tallied margin differs from the predicted margin, and in every one, the shift favors Bush.


1. These eleven states are classified as battleground states based on being on at least 2 of 3 prominent lists: Zogby’s, MSNBC’s, and the Washington Post’s.

Author’s note: Given the timeliness of the subject matter, I have released this paper despite not having the opportunity to use normal academic safeguards. If you have any questions or comments, please write. Likewise, if you publish or post it to web pages, electronic bulletin boards, or other electronic archives, please let me know.

I have tried to be as rigorous as possible in my data collection, review, and analysis and I believe it compares favorably to the vast majority of commentary currently in the public domain. To hold it to an academic standard of rigor, however, requires extensive peer review; this work has barely begun to be challenged by - and improved from - this peer review process.

The media has largely ignored this discrepancy (although the Blogosphere has been abuzz), suggesting that the polls were flawed, that the discrepancy was within normal sampling error, or that it was a statistical anomaly. In this paper, I examine the likelihood of each of these assumptions: validity of exit polls, sampling error, and the possibility of statistical anomaly.

Source of the Data

All of the 2004 exit poll data that I use here is unofficial, not meant to be released directly to the public.3 It comes from exit polls conducted for the National Election Pool, a consortium of the major television networks and the Associated Press, by two respected polling firms, Edison Media Research and Mitofsky International, whose founder Warren Mitofsky is credited with having invented the exit poll.

The pollsters have taken great pains to argue that the polls were not designed to verify election results4, but rather to help subscribers explain voting patterns and to serve as one piece of data that networks could use to "call" states. The data I use for this analysis was available apparently only because a computer glitch allowed "uncalibrated" data (not yet "corrected" to conform to announced vote tallies) to remain on the CNN website until approximately 1:30 AM election night.5 At that time, CNN substituted data "corrected" to conform to reported tallies. I have attempted to obtain the raw exit poll data from AP, Edison Media Research, Mitofsky International, and the NY Times, but have as yet received no response.6

2 Material for this chart comes from Jonathan Simon, a former exit poll analyst, who collected and tabulated data from the CNN website before the data changed. An explanation of how the columns were derived is presented in the "Data and Statistical Analysis" section of the paper.
3 Those who purchased the information signed agreements prohibiting the release of the data. (Martin Plissner, "In Defense of Exit Polls: You just don’t know how to use them" Slate Thursday, Nov. 4, 2004)
4 Ibid. (Plissner)





On Exit Polls
Caveats aside, the data appears to be good.7 In general, we have every reason to believe that exit polls are accurate survey instruments. Exit polls are surveys taken of representative respondents from the overall voting population. Although exit polls have not been academically studied, both the logic behind them and experience suggest that we can use these surveys to predict overall results with very high degrees of certainty. It’s easy to get a statistically valid representative sample; and there is no problem with figuring out who is going to actually vote — or how they will vote.

Thom Hartmann of CommonDreams relates that in his native Germany,

... people fill in hand-marked ballots, which are hand-counted by civil servants, watched over by volunteer representatives of the political parties. ... even though it takes a week or more to count the vote ... the German people know the election results the night the polls close because the news media’s exit polls, for two generations, have never been more than a tenth of a percent off.8


Dick Morris, Republican consultant and Fox News regular, concurs:

Exit polls are almost never wrong ...So reliable are the surveys that actually tap voters as they leave the polling places that they are used as guides to the relative honesty of elections in Third World countries. When I worked on Vicente Fox’s campaign in Mexico, for example, I was so fearful that the governing PRI would steal the election that I had the campaign commission two U.S. firms to conduct exit polls to be released immediately after the polls closed to foreclose the possibility of finagling with the returns.9


5 Richard Morin, "New Woes Surface in Use of Estimates" Washington Post, Thursday, November 4, 2004; Page A29
6 I’m not suggesting conspiracy here - I would hate to even imagine the volume of calls and emails they have had to manage in the past week - just (defensively) noting that the data that I am using is the best available.
7 Quoting Jonathan Simon, a former political survey and exit poll analyst, "his methodology was, as the night wore on, to mix in actual tabulation data with the initial pure exit poll data in such a way that by the time the full vote count was in, the 'exit poll' would conform very closely to the 'actual' vote"... (Internet correspondence Nov 6, 2004). He notes that the data may have already been adjusted to match counts, but were probably still pure. If they already had been adjusted, it means that the pure poll numbers favored Kerry to an even greater extent.
8 Thom Hartmann, "The Ultimate Felony Against Democracy" CommonDreams.org Thursday, November 4 2004
9 Dick Morris, "Those faulty exit polls were sabotage" The Hill Nov. 4, 2004 http://www.thehill.com/morris/110404.aspx


Last fall, international foundations sponsored an exit poll in the former Soviet Republic of Georgia during a parliamentary election. On Election Day, the pollsters projected a victory for the main opposition party. When the sitting government counted the votes, however, it announced that its own slate of candidates had won. Supporters of the opposition stormed the Parliament, and the president, Eduard A. Shevardnadze, resigned under pressure from the United States and Russia.10

Students at BYU have been conducting Utah exit polls since 1982.11 They write:

... results are very precise; In the 2003 Salt Lake County mayoral race, the KBYU/Utah Colleges Exit Poll predicted 53.8 percent of the vote for Rocky Anderson and 46.2 percent for Frank Pignanelli. In the actual vote, Anderson carried 54 percent of the vote to Pignanelli’s 46 percent.

True to their word, predictions in this year’s contests were quite accurate. In the Utah presidential election, for example, they predicted Bush 70.8%, Kerry 26.5%. The actual was Bush 71.1%, Kerry 26.4%. Consistently accurate exit poll predictions from student volunteers, including in this presidential election, give us good reason to presume valid data from the world’s most professional exit polling enterprise.


Data and Statistical Analysis

Three critical Battleground states
The conventional wisdom going into the election was that three critical states would likely determine who would win the Presidential election — Ohio, Pennsylvania, and Florida. Sure enough, they did, with Bush winning two of three and ascending to electoral victory. In each of these states, however, exit polls differed significantly from recorded tallies.

Data (Ohio). CNN reported the exit poll as illustrated in Figure 1. Combining the male and female vote, weighted for their percentage of the electorate (47% male), Kerry’s predicted share of the total Ohio vote was 52.1%12. Doing the same for Florida and Pennsylvania, and adding in final tallies (NY Times, Sunday evening), we derive Table 2.
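The weighting step described above can be sketched as follows. This is a minimal illustration, not the author's worksheet: the 47% male weight comes from the text, but the two per-group shares are placeholder values standing in for the Figure 1 numbers (which are not reproduced here), and the function name `combined_share` is invented for the example.

```python
def combined_share(share_male, share_female, male_fraction):
    """Weight each group's candidate share by its fraction of the electorate."""
    if not 0.0 <= male_fraction <= 1.0:
        raise ValueError("male_fraction must be between 0 and 1")
    return male_fraction * share_male + (1.0 - male_fraction) * share_female


# ASSUMPTION: 51.0 and 53.0 are hypothetical placeholder shares, chosen only
# so the result lands near the 52.1% quoted in the text. The 0.47 male
# fraction is the one figure taken from the paper.
kerry_ohio = combined_share(share_male=51.0, share_female=53.0, male_fraction=0.47)
print(f"Combined Kerry share: {kerry_ohio:.2f}%")
```

Because the published group shares are whole numbers, the combined figure inherits some rounding error, which is the limitation footnote 12 discusses.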

10 Martin Plissner, Exit Polls to Protect the Vote, New York Times 10/17/04
11 http://exitpoll.byu.edu/2004results.asp. As far as I have been able to determine, this was the only other exit poll conducted on the 2004 presidential election. I had thought that Zogby also had an exit poll, but haven’t been able to verify this; they may have been using the same National Election Pool data, when they declared Kerry the winner in Ohio on Election Night. See William Douglas, "Early exit polls come under fire" The Mercury News (11/3/2004)


Figure 1. CNN graphic with apparently "uncorrected" exit poll data



12 Among the limitations of the CNN exit poll data is the lack of significant digits. Rounding errors mean that exit poll numbers for individual state analyses could be off by up to 0.5 percentage points. An error that large is unlikely because each number combines two groups, male and female, and it’s unlikely that both were rounded by very much. Regardless, the strength of the finding is such that even if all numbers had been rounded the full 0.5 in an unfavorable direction, the basic finding would still hold.
13 Earlier exit polls, including one released by Slate at 7:28 EST, 28 minutes after the Florida polls closed, showed Kerry leading 50% to 49%.





Statistical significance, which means that the discrepancy is such that it is unlikely to occur by chance, depends on three factors: the size of the discrepancy, the sample size, and the level of significance (just how unlikely does it have to be?). For analysis purposes, we could choose any measure: Bush’s differential, Kerry’s differential or the differential between them; it all works out the same. Based on the analysis that will follow, I’m going to examine Kerry’s percentage of the vote.

Figure 2. Statistical prediction of Kerry’s true percentage of the vote in Ohio

Figure 2 depicts a normal distribution curve14, a probability density showing the relative likelihood, given this poll result, of the actual percentage of the vote Kerry would be expected to receive in the state. The black lines below the curve indicate the poll’s statistical margin of error, the corresponding zones of 95 and 99 percent confidence. In this case, given that the exit poll indicated Kerry received 52.1% of the vote, we are 95 percent sure that the true percentage he received was between 49.8% and 54.4%. And because half of the 1 in 20 cases that fall outside the interval would be high rather than low, we’re 97.5 percent sure that the true percentage he received was at least 49.8%. We are 99.5% sure that the true percentage he received was at least 49.2%. It turns out that the likelihood that he would have received only 48.5% of the vote is less than one in one thousand (.0008).
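The reasoning in this paragraph can be sketched with a normal approximation. This section does not state the Ohio sample size, so the sketch below back-calculates a standard error from the reported 95% interval (49.8% to 54.4%); that back-calculation, and the function name `tail_analysis`, are assumptions made for illustration. Because the published bounds are rounded, the outputs should agree with the paper's figures only approximately.

```python
from statistics import NormalDist


def tail_analysis(poll_pct, ci95_low, ci95_high, tallied_pct):
    """One-sided bounds and tail probability under a normal approximation."""
    std_normal = NormalDist()
    # ASSUMPTION: recover the standard error from the rounded 95% interval,
    # since the sample size is not given in this section.
    half_width = (ci95_high - ci95_low) / 2.0
    se = half_width / std_normal.inv_cdf(0.975)
    poll_dist = NormalDist(mu=poll_pct, sigma=se)
    return {
        "standard_error": se,
        # Half of the 5% outside the interval lies below it, hence 97.5%.
        "lower_bound_97_5": poll_pct - std_normal.inv_cdf(0.975) * se,
        "lower_bound_99_5": poll_pct - std_normal.inv_cdf(0.995) * se,
        # Chance of a true share at or below the tallied share.
        "prob_at_or_below_tally": poll_dist.cdf(tallied_pct),
    }


ohio = tail_analysis(poll_pct=52.1, ci95_low=49.8, ci95_high=54.4, tallied_pct=48.5)
for name, value in ohio.items():
    print(f"{name}: {value:.4f}")
```

With these rounded inputs the tail probability comes out at roughly one in a thousand, the same order of magnitude as the figure quoted in the text; an exact match would require the underlying sample size.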

14 This analysis assumes a simple random sample. Again, the strength of the finding is such that any modification of this assumption would not change the basic finding, but it might be somewhat stronger or slightly weaker depending on exactly how the exit polling was done. If the pollsters broke states into strata (e.g., separating counties into two or more groups by income, age, racial composition, etc.) and then randomly sampled within each stratum, then the variances would be reduced and an even stronger case can be made. If, on the other hand, states were broken into clusters (e.g., precincts) and then clusters (precincts) were randomly selected (sampling individuals within those selected precincts), the variances would increase. Much survey sampling uses a combination of clusters and strata, and I do not know how this sample was conducted.
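How clustering would widen the variance can be illustrated with Kish's design-effect approximation, deff = 1 + (m - 1) * rho, where m is the average number of respondents per cluster and rho is the intraclass correlation. This is a swapped-in textbook approximation, not something the paper uses, and every input below is hypothetical: the paper does not report the cluster size, the correlation, or the sampling design.

```python
import math
from statistics import NormalDist


def design_effect(avg_cluster_size, intraclass_corr):
    """Kish's approximation for cluster sampling: deff = 1 + (m - 1) * rho."""
    return 1.0 + (avg_cluster_size - 1.0) * intraclass_corr


def adjusted_tail_prob(poll_pct, tallied_pct, srs_se, deff):
    """Tail probability after inflating the simple-random-sample SE by sqrt(deff)."""
    se = srs_se * math.sqrt(deff)
    return NormalDist(mu=poll_pct, sigma=se).cdf(tallied_pct)


# ASSUMPTION: 40 respondents per precinct and rho = 0.01 are made-up values.
deff = design_effect(avg_cluster_size=40, intraclass_corr=0.01)

# ASSUMPTION: srs_se of 1.17 points is back-calculated from the rounded
# 95% interval quoted for Ohio (half-width 2.3 divided by 1.96).
p_srs = adjusted_tail_prob(52.1, 48.5, srs_se=1.17, deff=1.0)
p_clustered = adjusted_tail_prob(52.1, 48.5, srs_se=1.17, deff=deff)
print(f"deff={deff:.2f}  p(simple random)={p_srs:.4f}  p(clustered)={p_clustered:.4f}")
```

A design effect above 1 makes the observed gap less improbable, and stratification would push in the opposite direction, which is the trade-off the footnote describes.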


Conducting the same analysis for Florida, we find that Kerry’s 47.1% of the vote is likewise outside the 99% confidence interval. The likelihood of his receiving only 47.1%, given that the exit polls indicated 49.7%, is less than three in one thousand (.0028). Kerry’s count is also outside the 99% confidence interval in the third critical battleground state, Pennsylvania. Although he did carry the state, the likelihood of his receiving only 50.8% given that the exit polls indicated 54.1% is less than two in one thousand (.0018).

The likelihood of any two of these statistical anomalies occurring together is on the order of one-in-a-million. The odds against all three occurring together are 250 million to one. As much as we can say in social science that something is impossible, it is impossible that the discrepancies between predicted and actual vote counts in the three critical battleground states of the 2004 election could have been due to chance or random error.
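The arithmetic behind these joint odds can be checked in a few lines. The three single-state probabilities are the ones quoted above; multiplying them treats the three state samples as statistically independent, which is the assumption implicit in the paragraph and is labeled as such in the code.

```python
from itertools import combinations
from math import prod

# Single-state tail probabilities quoted in the text.
p_values = {"Ohio": 0.0008, "Florida": 0.0028, "Pennsylvania": 0.0018}


def joint_probability(probs):
    """Probability that all events occur, ASSUMING they are independent."""
    return prod(probs)


for a, b in combinations(p_values, 2):
    joint = joint_probability([p_values[a], p_values[b]])
    print(f"{a} + {b}: about 1 in {1 / joint:,.0f}")

all_three = joint_probability(p_values.values())
print(f"All three: about 1 in {1 / all_three:,.0f}")
```

Each pair comes out between roughly one in 200,000 and one in 700,000, and all three together at roughly one in 248 million, consistent with the rounded figures in the text.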

Official Explanations
The New York Times tells us that they obtained a report issued by the pollsters that debunked the possibility that the exit polls are right and the vote count is wrong15, but does not explain beyond that declaration how the possibility was "debunked." In fact, no evidence at all is presented of skewed survey data or any problems at all with the polls except that "uncorrected" data was released to the public. Slate reports that Mitofsky and Lenski insist that the polls were perfectly fine.16 17

15 Jim Rutenberg, "Report Says Problems Led to Skewed Surveying Data" New York Times, Nov. 5, 2004
16 Martin Plissner, "In Defense of Exit Polls: You just don’t know how to use them" Slate Thursday, Nov. 4, 2004
17 Jack Shafer, "The official excuses for the bad exit poll numbers don’t cut it." Slate Friday, Nov. 5, 2004, 9:23 PM PT




One of the few people close to the pollsters to offer an explanation early on was Martin Plissner, former executive political director of CBS News (and self-described close friend of the pollsters), who identifies three problems with the polls:

The pollsters who work outside the polling stations often have problems with officials who want to limit access to voters. Unless the interviews have sampled the entire day’s voters, the results can be demographically and hence politically skewed. Finally, it is of course a poll, not a set of actual recorded votes like those in the precinct samples collected after the polls close.18

Regarding the first problem, voters contacted in such precincts can be weighted. Jack Shafer of Slate observes:

... exit pollsters always encounter overzealous election officials enforcing electioneering laws. Can we really believe that a significant number of the 1,480 exit poll precincts in 50 states and the District of Columbia that Edison/Mitofsky surveyed on Election Day were so affected? And in sufficient numbers to bend state-by-state exit polls in Kerry’s favor?19


Regarding time of day variation, this paper does not refer to mid-day reports, but rather end of day data, which happened to still be available at midnight. But even if there were an early voter bias, is there any reason to believe that early votes would be skewed Democrat? Stereotypically, Republicans are early risers.

Regarding the last-ditch argument that it’s just a poll: true (of course), but, as I have documented, the evidence and logic on exit polls suggest that we have every reason to believe they are accurate within statistical limits.

Under-representation?
Other explanations put forth by the Washington Post charge that samples may have included too many women, too few Westerners, not enough Republicans, etc. Regarding the first part of this critique, Morris writes:

The very first thing a pollster does is weight or quota for gender. Once the female vote reaches 52 percent of the sample, one either refuses additional female respondents or weights down the ones one subsequently counted. This is, dear Watson, elementary.

18 Martin Plissner, "In Defense of Exit Polls: You just don’t know how to use them" Slate Thursday, Nov. 4, 2004
19 Jack Shafer, "The official excuses for the bad exit poll numbers don’t cut it." Slate Friday, Nov. 5, 2004, 9:23 PM PT





Moreover, the issue of male/female ratio is irrelevant. CNN and others released data presenting male and female preferences separately, thus automatically weighting sex appropriately.

Other potential imbalances are part of normal sampling error. A random sample would result in the poll precision and confidence intervals that I reported. Under such conditions, Republicans, westerners, etc., are equally (un)likely to be over- or under-represented. Imprecise representation is incorporated within the margin of error. (That’s why we have the concepts of probability densities, margin of error, etc. If you could choose a perfectly representative sample, you could predict outcomes precisely.) In theory, techniques to ensure sample representativeness20 make the exit polls even more accurate than my analysis indicated, thus making the observed discrepancies even more unlikely.

Bush voter unwillingness to participate and other "explanations"
Most recently, Senior Gallup Poll Editor David W. Moore reports that Mitofsky and Lenski say that,

Kerry voters apparently were much more willing to participate in the exit poll than were Bush voters. The interviewers at each of the sample voting locations are instructed to choose voters as they exit from the voting booth - every third, or fifth, or eighth voter - some sequence of this sort that is determined ahead of time. Many voters simply refuse to participate in the poll. If the refusers are disproportionately for one candidate or another, then the poll will be biased....21


OK, true enough. If Republicans disproportionately refuse to participate, that could explain exit poll error. But do we have any reason to suspect that?

It is conceivable that Kerry voters were much more willing to participate in the exit poll than were Bush voters, but although it’s not difficult to imagine why some Bush voters might not want to participate, it’s also not difficult to imagine why some Kerry voters might not want to participate either.
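To make the size of the hypothesized effect concrete, the sketch below computes how much more willing one group would have to be to respond in order to turn a given true share into a given polled share. It is a simplified two-candidate model (shares sum to 1, third parties ignored) with invented function names; the 48.5% tallied and 52.1% polled Ohio figures are the ones used earlier in the paper, and treating the tally as the true share is an assumption made purely for the illustration.

```python
def observed_share(true_share, response_rate_a, response_rate_b):
    """Polled share for candidate A when the two groups respond at different rates."""
    a = true_share * response_rate_a
    b = (1.0 - true_share) * response_rate_b
    return a / (a + b)


def required_response_ratio(true_share, polled_share):
    """Ratio of A-voter to B-voter response rates needed to produce polled_share."""
    odds_polled = polled_share / (1.0 - polled_share)
    odds_true = true_share / (1.0 - true_share)
    return odds_polled / odds_true


# ASSUMPTION: two-candidate simplification, with the tally taken as the truth.
ratio = required_response_ratio(true_share=0.485, polled_share=0.521)
print(f"Kerry voters would need to respond about {ratio:.3f}x as often as Bush voters")

# Round trip: feeding that ratio back in reproduces the polled share.
check = observed_share(0.485, response_rate_a=0.5 * ratio, response_rate_b=0.5)
print(f"Implied polled share: {check:.3f}")
```

Under these simplifications the required gap is a response rate roughly 15% higher among Kerry voters. Whether such a gap existed is exactly the kind of claim that would need independent evidence.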

20 Pollsters normally either choose precincts so as to ensure representative samples, or weight respondents by key demographic categories. The Utah Colleges Exit Poll website gives a fairly good basic explanation of polling practices:
http://exitpoll.byu.edu/about/survey_sampling_faq.asp and
http://exitpoll.byu.edu/about/sample_design.asp.
21 David W. Moore, Senior Gallup Poll Editor, "Conspiracies Galore" Gallup News Service: November 9, 2004.





The problem with this "explanation" or even one that would have considerably more "face validity" (which means that it makes sense on the face of it), such as the possibility that absentee/early voters disproportionately favored Bush22, is that it is not an explanation, but rather a hypothesis. It’s apparent that "Kerry voters were much more willing to participate in the exit poll than Bush voters" only given several questionable assumptions. An explanation would require independent evidence.23

The Role of the Exit Poll
The pollsters have made clear that the purpose of their poll was not to verify the integrity of the election. They were hired by the AP-Network consortium to provide supporting data for election coverage. Nevertheless, verifying elections is important not only in Mexico, Venezuela, and Georgia (the former Soviet republic, not the US state). Whatever the original purpose of this particular exit poll, it could be used to help verify election integrity if it were released.24

In this case, concerns about this exit poll-count discrepancy are compounded by concerns about voting technologies, both new (especially electronic voting machines without paper trails) and old (punch card ballots still widely in use). Allegations about miscount and worse have been popping up on the Internet since the election like daffodils on a suburban lawn in April. In at least two cases, vote count errors have been acknowledged and corrected.25 Additional sources of concern include mistabulation through "spoilage," (as we saw in Florida in 2000, large numbers of votes can be lost due to imperfections in the voting process), overuse of provisional ballots, and limited access by observers to some vote tallies.26

22 To the best of my knowledge, the pollsters have not offered absentee/early voters as an "explanation," presumably because they were able to predict any disproportionate support based on previous elections.
23 I could imagine various types of supportive evidence. One possibility would be verifying sampled results versus actual voting patterns in random sample precincts where counts are unimpeachable.
24 I do not know the details of the contractual arrangement, so I do not know who actually "owns" this data.
25 "Glitch gave Bush extra votes in Ohio" cnn.com 11/05/04.
http://www.cnn.com/2004/ALLPOLITICS/11/05/voting.problems.ap/
26 Erica Solvig, "Warren’s [Warren County, Ohio] vote tally walled off" Cincinnati Enquirer Friday, November 5, 2004




Summary and Implications
My purpose in this paper, however, has not been to allege election theft, let alone explain it. Rather, I have tried to demonstrate that exit poll data is fundamentally sound, that the deviations between exit poll predictions and vote tallies in the three critical battleground states could not have occurred strictly by chance or random error, and that no solid explanations have yet been provided to explain the discrepancy. In short, I have tried to justify the discrepancy as a legitimate issue that warrants public attention.

The unexplained discrepancy leaves us with two broad categories of explanations: the polls were flawed or the count is off. The most important investigations concern verification of the tallies and allegations of fraud on one side; and examination of the exit poll’s methodology and findings on the other. Some useful statistical analyses would compare the "shift" in battleground states vs. non-battleground states, and in states, counties and precincts where safeguards are strong vs. those where they are suspect. Obviously, if the polling consortium would release their data, that would allow us to do more definitive analyses.

Given that neither the pollsters nor their media clients have provided solid explanations to the public, suspicion of fraud, or among the less accusatory, "mistabulation," is running rampant and unchecked. That so many people suspect misplay undermines not only the legitimacy of the President, but faith in the foundations of the democracy.

Systematic fraud or mistabulation is a premature conclusion, but the election’s unexplained exit poll discrepancies make it an unavoidable hypothesis, one that is the responsibility of the media, academia, polling agencies, and the public to investigate.


END -----------------------

Dr. Freeman is on the faculty of the University of Pennsylvania; his areas of expertise include resilience, innovation, and research methods. He obtained his Ph.D. from the Massachusetts Institute of Technology. Contact him at stfreema@sas.upenn.edu.

v00k

http://www.buzzflash.com/alerts/04/11/ale04090.html

Forum posts

  • Bravo, Dr. Freeman. Is there a table of data available on the statistical vs. tabulated differences in the non-battleground states? I understand that similar discrepancies were not seen there, but I have not seen the actual data.

    Thanks again for your detailed analysis. In the absence of explanations to the contrary, it seems disingenuous to cast aside the possibility of mistabulation and fraud the way the mainstream media has done thus far.

  • Exit polls have been overpredicting Democratic voting percentages at the Presidential level for the last several elections. He also doesn’t deal with the problem of early voting and absentee voting, which would not be picked up by exit polling on the morning of the election. As many as one third of voters voted this way, and the tendency for absentees is to tilt slightly Republican.