By Areeba Shah
Forecasts built on 2016 pre-election polls put Hillary Clinton’s probability of winning the presidency at about 90%. When she lost, 40% of Americans lost faith in elections, according to NPR.
Yet the 2016 polls were among the most accurate in history, said Scott Keeter, a senior survey adviser at Pew Research Center. As 2020 approaches, pollsters are changing how they weight data, including respondents’ education level, to produce more accurate results.
Voter engagement also remains higher than in previous cycles, helping fuel more precise numbers. Close to 156 million people could vote in the upcoming elections, up from the 139 million who cast ballots in 2016, according to the Democratic voter-targeting firm Catalist.
How polling works today
Four years ago, telephone surveys offered the most effective tool in measuring attitudes of voters, but online data-gathering systems have grown more accurate, Keeter said.
Pew relies on its American Trends Panel for political polling, which draws a random sample of all national addresses and mails residents a letter asking them to join its online surveys. Pew adds participants to a large national sample that pollsters reference in future studies.
Pew’s second method is online opt-in surveys, which consist of people who choose to take part in the interviews. While these polls are cheaper to conduct because people volunteer to participate (often hoping to win a gift card), they tend not to represent the country as accurately.
“We don’t really know where the people have come from, and the best we can do is to try to statistically adjust the data after they’ve been collected to make them look like the country,” Keeter said.
The data include each individual’s age, gender, race, education level and residence to ensure the sample represents the population at large. By comparing its findings with census data on the same questions, Pew checks the accuracy of its samples.
If the results don’t match, Pew adjusts its data through a process called weighting, which ensures the sample more accurately reflects the population of the entire country.
For example, if pollsters in a south Florida town mostly interview 18-year-old high schoolers, they will reduce the weight given to those young respondents so the sample more precisely matches the town’s demographics.
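As a rough illustration of that adjustment (the age categories and population shares below are invented for the example, not Pew’s actual scheme), post-stratification weighting can be sketched in a few lines:

```python
# Minimal sketch of post-stratification weighting (hypothetical numbers).
# Each respondent gets a weight = population share / sample share for
# their demographic group, so over-sampled groups count less.

from collections import Counter

respondents = ["18-29"] * 70 + ["30-64"] * 20 + ["65+"] * 10  # skewed sample
population_shares = {"18-29": 0.20, "30-64": 0.55, "65+": 0.25}  # census-style targets

sample_counts = Counter(respondents)
n = len(respondents)

weights = {
    group: population_shares[group] / (sample_counts[group] / n)
    for group in population_shares
}

# Over-represented young respondents are down-weighted; older ones are up-weighted.
for group, weight in sorted(weights.items()):
    print(f"{group}: weight {weight:.2f}")
```

Here the 18-to-29 group makes up 70% of the sample but only 20% of the target population, so each of its responses counts for well under one full answer.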
What’s happening with 2020 election polls
Heading into this year’s election cycle, organizations like the Siena College Research Institute, an affiliate of Siena College in New York State, are preparing to conduct high-quality polling by focusing on what the electorate looks like and measuring who turns out to vote.
In partnership with The New York Times, they are publishing live poll results on Americans’ views on political figures and key issues in six battleground states – Arizona, Michigan, Wisconsin, Pennsylvania, North Carolina and Florida.
The Siena College Poll starts with a list of registered voters’ landline and cell phone numbers. Participants are asked about their previous voting behavior to determine if they will show up to vote, said Meghan Crawford, the poll’s data management director.
In its latest battleground polls, the Siena College Poll interviewed 3,766 registered voters from Oct. 13 to Oct. 26 and found that between 44% and 52% of respondents in each of the six states opposed the impeachment and removal of President Donald Trump.
“You don’t have to talk to the entire population,” she said. “When you want to do medical tests on somebody, you don’t have to take all of their blood. You only need to take a sample of their blood to get an understanding of it.”
Once collected, the data are run through a program that adjusts them and checks whether the results represent the target population. Whether the sample includes registered voters, likely voters or the general U.S. population, the data are weighted to confirm that they represent a particular area.
Most pollsters adjust their data based on participants’ past voting behavior.
Kaiser Family Foundation, a health policy research organization, also uses probability-based sampling to reach everyone equally. Then, they develop a sample and use online data collection to gather information.
“We do a random sample of people within that fixed population, and we can know what’s representative of the larger population,” said Ashley Kirzinger, associate director at Kaiser Family Foundation. “It’s important to have a very strong sampling frame. We weight the data to make sure it’s representative of the U.S.”
Why the 2016 polls failed
While national polls accurately predicted Hillary Clinton’s 2.1-percentage-point popular-vote victory, individual state polls included errors because education wasn’t taken into account, Crawford said.
Americans often conflate individual polls (such as Quinnipiac’s) with poll aggregators (such as RealClearPolitics), which average several poll results from different sources. It was aggregators, not individual polls, that produced the estimates of a 70% to 99% chance that Clinton would win, she said.
Poll aggregating sometimes fails because the people averaging the polls are not pollsters themselves. They combine results from opt-in online polls and phone surveys alike, Keeter said.
“Here’s where the devil comes in,” he said. “Not all polls that are being aggregated are of equal quality.”
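To make the unequal-quality problem concrete, here is a hypothetical sketch (the poll numbers and quality scores are invented, not any aggregator’s actual method) comparing a naive average with one that gives more influence to higher-quality surveys:

```python
# Hypothetical sketch: averaging polls of unequal quality.
# A naive average treats every poll the same; a quality-weighted
# average gives more influence to higher-quality surveys.

polls = [
    # (candidate support %, quality score 0-1) -- invented numbers
    (48.0, 0.9),   # live-interview phone poll
    (44.0, 0.4),   # opt-in online poll
    (47.0, 0.8),   # probability-based online panel
]

naive_avg = sum(support for support, _ in polls) / len(polls)
weighted_avg = (
    sum(support * quality for support, quality in polls)
    / sum(quality for _, quality in polls)
)

print(f"naive average:    {naive_avg:.1f}%")
print(f"quality-weighted: {weighted_avg:.1f}%")
```

In this toy case the low-quality opt-in poll drags the naive average down; discounting it moves the estimate back toward the higher-quality surveys.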
Some aggregators estimate a candidate’s chance of winning, which misinforms the public.
“Our colleagues did experimental work showing that people who were exposed to aggregator probabilities were actually discouraged from voting because they felt the election was already set even though the underlying data actually did not support that view,” he said.
Aggregators that let people choose their own mix of ingredients, such as which factors to consider in a survey, are effective, Keeter said. Some state-level polls flipped for Trump because they didn’t weight by education, thereby underestimating his support.
What to look for when consuming polling
Consider the source, its methodology and how the data were weighted. Pollsters now include each respondent’s level of education when weighting data, something many failed to do in the 2016 cycle.
Another problem: question wording can push people toward a response. For example, if a self-identified independent took a questionnaire favorable to Republicans and was asked at the end how they would vote, they would be more likely to say Republican.
“As a consumer of polls, you need to make sure that you can see the questions that were asked in the poll because anyone who does a poll and is not willing to share it, should be subject to some suspicion,” he said.
What contributes to inaccuracies
Some Americans think because they haven’t been contacted, surveys don’t represent the entire country. But with 330 million people in the U.S., there’s a low probability that any given person will be called, Kirzinger said.
The margin of error also creates issues. It quantifies the uncertainty that comes from pollsters speaking with a sample of the public rather than the whole population, which leaves room for inaccuracies. The easiest way to reduce this error is to increase the sample size, but that’s not the only factor.
“The issue is that the errors in polls come from places other than the size of the sample,” Keeter added. “They come from choices that are made about how sampling was done and if everybody in the population had a chance to be included.”
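The sampling component of that error has a standard textbook formula for a simple random sample. A short sketch (assuming 95% confidence and ignoring the design effects and non-sampling errors Keeter describes):

```python
# Sketch of the standard margin-of-error formula for a simple random
# sample at 95% confidence: MOE = 1.96 * sqrt(p * (1 - p) / n).
# This captures only sampling error, not coverage or weighting errors.

import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the 95% confidence interval, in percentage points."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

# Error shrinks with the square root of n, so quadrupling the sample
# only halves the margin of error.
print(f"n=1000: ±{margin_of_error(0.5, 1000):.1f} pts")
print(f"n=4000: ±{margin_of_error(0.5, 4000):.1f} pts")
```

This is why pollsters can’t simply buy their way to accuracy with bigger samples: a fourfold increase in cost only halves the sampling error, and the other error sources remain untouched.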
Other factors to consider
Age, gender and race also forecast how people vote, Keeter said. Before the 2000s, age was a relatively weak predictor, but today each age group votes differently.
Generation Z, composed of people born in the mid-to-late 1990s and after, leans more Democratic. The older half of Generation X, those typically born between the 1960s and the mid-1980s, and most baby boomers remain likely to vote Republican.
In 2016, 55% of all Millennials identified as Democrats while 33% identified as Republicans. By comparison, 49% in Generation X and 46% of Boomers leaned Democratic, according to Pew.
While it’s harder to predict how white people will vote, African Americans, Hispanics and Asian Americans tend to vote Democratic, he added. Other groups, like “shy voters” who don’t publicly support Trump and often skew polls, are likely to vote Republican. Women remain more likely than men to support Democrats.
Polls also overrepresent people who are interested in politics: those who participate in surveys tend to turn out at higher rates than those who don’t, Keeter said.
Countries with more robust electoral participation than the U.S. see turnout rates of 80% or 90%, where measuring the likely outcome isn’t as challenging. But here, where a typical presidential election struggles to reach 55% or 60% turnout, forecasting the winner is more complicated.
For example, Crawford said people who say they won’t vote can also turn up on Election Day.
“The problem that polls have in serving as a predictor is that they’re attempting to model a population that doesn’t even exist,” Keeter said. “They’re trying to describe what the group of people who show up on election day or voting absentee are going to look like prior to that actually happening.”
This requires getting the preferences right and asking people to put themselves in a frame of mind that accurately predicts if they will show up, he said.
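One way to see what modeling “a population that doesn’t exist” means in practice is to weight each respondent by an estimated probability that they will actually vote. This sketch uses invented respondents and an invented scoring rule, not any pollster’s real model:

```python
# Hypothetical sketch of a simple likely-voter model. Respondents and
# the scoring rule are invented; real pollsters use far richer models.

respondents = [
    # (supports candidate A, self-rated likelihood 0-10, voted in 2016)
    (True, 10, True),
    (False, 9, True),
    (True, 5, False),
    (False, 2, False),
]

def turnout_probability(likelihood: int, voted_before: bool) -> float:
    """Crude estimate: self-rating scaled to 0-1, plus a bonus for past turnout."""
    bonus = 0.2 if voted_before else 0.0
    return min(1.0, likelihood / 10 + bonus)

weighted_support = sum(
    turnout_probability(rating, history)
    for supports_a, rating, history in respondents
    if supports_a
)
weighted_total = sum(
    turnout_probability(rating, history)
    for _, rating, history in respondents
)

print(f"likely-voter support for A: {100 * weighted_support / weighted_total:.0f}%")
```

The model’s output depends entirely on how well the turnout scores anticipate real behavior, which is exactly the frame-of-mind problem Keeter describes.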
Even as survey researchers and pollsters adopt new methodologies to more accurately measure Americans’ attitudes, polling provides only a snapshot of voters’ views and remains far from 100% accurate.