
Lack of data on diverse electorate tests pollsters, politicians

That dearth of data — and cultural understanding — on increasingly diverse communities has led some pols to make naive mistakes

A voter is seen at a voting machine at the Liberty Baptist Church polling place on Election Day in Atlanta, Ga., on November 3, 2020. (Tom Williams/CQ Roll Call file photo)

The most diverse electorate in the country’s history headed to the polls in 2020, but pollsters and party officials aren’t quite sure how to tap that potential source of political support.

They lack good data about these increasingly diverse communities, and the cultural understanding to go with it, leading politicians to make naive mistakes, like inviting Muslims to meals that conflict with their religion.

“I can’t tell you here in Texas how many of my Muslim neighbors and friends get invitations to come to a barbecue where they’re serving pork, and how many things where I get to come to a barbecue where they’re serving beef,” said Dheeraj Chand, a founding partner of the strategic consulting firm Siege Analytics, who is Hindu. “It’s well meant, but a little off-putting.”

An industry of firms that provide voter data to pollsters, political campaigns and others has been trying to find the best way to parse specific parts of voters’ identities such as religion, race and ethnicity.

In 2020, voter participation across all racial and ethnic groups increased, according to data recently released by the Census Bureau. Non-Hispanic white Americans still voted at the highest rate — 70.9 percent, a 4-point increase from 2016. However, turnout among Black, Asian American and Hispanic voters increased as well — by 3, 10 and 6 points, respectively.

Overall, the electorate became 2 percent less white, 1 percent more Asian and 1 percent more Hispanic, according to the Census Bureau data.

The increased turnout of diverse communities in 2020 may force politicians and pollsters to embrace a more diverse electorate in their modeling — which influences politicking down the line, said Republican pollster Justin Wallin. Pollsters and politicians won’t know until voters return to the polls whether the higher turnout was a fluke or a sign of a permanently more diverse electorate, he said.

“I’m certainly casting a much larger net on my likely turnout universes, and I suspect most pollsters are, because they don’t know when that’s going to drop off,” Wallin said. “The last thing you want to do is get caught with some massive turnout that you don’t anticipate and suddenly lose a race. It has happened so often.”

He pointed out that political engagement with Hispanic and other voters could drastically alter the political landscape in states like California, where they could become the majority voting bloc with higher rates of voter registration and turnout.

Overall registration and turnout among nonwhite voters continue to lag behind those of white Americans, something a network of advocates, both partisan and nonpartisan, is trying to change. The advocates are also trying to get better data about voters in those diverse groups.

But that’s not easy. Most states do not provide race or ethnicity data on their voters, census information doesn’t include individual data and, on top of all that, both race and ethnicity are self-defined.

“The challenge with ethnicity and race is it’s pretty muddy. We live in America, and you kind of choose. … It’s more complex than it looks on the surface, and in some cases, it’s almost a philosophical choice,” Wallin said.

‘Striking’ AAPI turnout

Asian American communities across the country can experience the problem most acutely, according to experts and activists.

Asian Americans can trace their roots back to multiple religions, four dozen countries, hundreds of ethnicities and hundreds more languages. That diversity reflects a small, densely packed portion of the electorate — but one growing quickly. Yet, experts say, it hasn’t gotten the attention it deserves from party advocates.

Census Bureau data found that about 116,000 Asian Americans voted last fall in Georgia, several times higher than Joe Biden’s 11,000-vote victory margin in the state. That’s a significant increase from the 71,000 Asian Americans who voted in Georgia’s 2016 presidential election.

Tom Bonier, CEO of Democratic data firm TargetSmart, called the surge in Asian American turnout “striking, remarkable and impactful” and critical in cementing Biden’s victory in Georgia.

That spike came despite paltry investment in Asian American and Pacific Islander turnout, according to Varun Nikore, president of the AAPI Victory Fund. On a call with reporters last month, he said less than 1 percent of the funds for voter turnout went to Asian American communities.

Problematic data can be a factor driving those decisions, Nikore said. Sometimes data brokers may lack the time or ability to break out important nuances like national origin, he said. There are also other factors contributing to what he called “AAPI voter invisibility.”

Common efforts to categorize voters, such as by last name, can understate the diversity of the electorate. A Filipino voter with a Spanish surname may get labeled as Hispanic in a data file. Additionally, an Indian American with a Catholic background and English name may be coded as white, while an Indian American with a Portuguese name may be coded as Hispanic, Nikore said.

“There’s a lot going on there, where the existing undercount coupled with lack of understanding of the magnitude of diversity of our community, leads to automated processes being incorrect,” Nikore said.

Chand said data also frequently lacks important social cues that could help in reaching out to voters. An Asian American voter who immigrated from Tibet may not be open to a visit from an ethnic Han Chinese campaign canvasser, for instance.

Pollsters learned some of these nuances as they got better data about Hispanic communities, Wallin said. A generation ago, pollsters started fielding surveys in both English and Spanish. As they did, they learned subtleties — like having Cuban Spanish speakers call Cuban communities.

Now they may have to make judgments about where certain voter communities are large enough to justify the expense and effort to field other languages.

It takes time for political campaigns to adjust to changes in the electorate, Wallin said. Republicans over the past decade have slowly built up margins among Hispanic communities, not enough to win the demographic group on its own, but enough to win an election overall.

“Campaigns are like businesses, they’re iterative. You have various business lines, you have various targeted groups, and you try and build up the share of vote amongst each one of them,” he said. “Once [campaigns] start realizing that there’s opportunity there and they can win elections there, you’ll see that more and more, and I’ve seen a lot of that over the past seven, eight years.”

Data problems

Mike Greenfield, CEO of Change Research, described “a fair amount of miscalibration” when the industry tried to forecast who would turn out in 2020.

He pointed to Florida, where Cuban American voters cast ballots overwhelmingly for Donald Trump, but other Hispanic communities did not.

Determining what drove those changes will be difficult, he said. Trying to measure those political movements with a standard 600- to 1,000-respondent statewide survey won’t tell a pollster much, he said.

“You’re not going to be able to tell meaningfully the differences that you need,” he said.

Measuring a smaller portion of the electorate takes time, money and focus, as well as smart choices.

Wallin, for instance, believes many Democratic-aligned firms oversample Spanish-speaking households, resulting in a distorted view of the Hispanic electorate.

Paul Westcott, executive vice president for voting data firm L2, said the industry previously relied on last-name matching, which uses the last name as a likely marker of race or ethnicity. On top of other problems, that method doesn’t account for a voter who took a spouse’s last name, he said.

L2 now uses a more complicated method — taking into account a person’s first, middle and last name, as well as data provided by state voter files and commercial databases. A handful of states collect ethnicity data as part of their voter files, but not all release it, or release it every year, Westcott said. His business can rely on past data releases to fill in the gaps, but “having the data directly from the states is sometimes tricky.”

“You know the voter files are great sources for name, address, vote history, in some cases party ID, but beyond that, we’ve had to look to other sources and other modeling techniques to be able to determine things like ethnicity,” Westcott said.

“It’s the way that we have to go in this age of data.”
