A former Google data scientist used an odd metric to predict the unemployment rate weeks before official stats: searches for internet porn

Economist and former Googler Seth Stephens-Davidowitz explains that scientists are rethinking what constitutes "data" — it's not just government surveys.

It's data reimagined.

It takes the Bureau of Labor Statistics three weeks to collate its survey data on unemployment and release that information to the public.

Three weeks might seem like a long time, but hey, it's the way things work. Even Princeton economist Alan Krueger tried and failed to speed up the process when he served as President Obama's chairman of the Council of Economic Advisers in 2011.

In his new book, "Everybody Lies," Seth Stephens-Davidowitz presents this problem as a sort of case study in how the definition of "data" is constantly being reimagined. Sure, survey responses are useful, but what if there were some other, albeit unofficial, way to track people's employment status?

To make a long story short, there is another way. It's the rate of Google searches for pornography.

Let's back up to explain how this works. Stephens-Davidowitz writes that a former Google engineer created a system that used people's flu-related searches to create a picture of the current flu rate — well before the Centers for Disease Control and Prevention released official data.

While Stephens-Davidowitz was a data scientist at Google, he and Google's chief economist, Hal Varian, used a similar process to paint a picture of the national economic landscape.

By that time, Google engineers had also created a service called Google Correlate, which allows researchers to see what Google searches most correlate with the dataset they're studying. For example, Stephens-Davidowitz and Varian found that when housing prices are rising, Americans tend to search for things like "80/20 mortgage."
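The basic idea of ranking search terms by how closely they track a dataset can be illustrated with a minimal sketch. This is not Google Correlate's actual implementation, which the book excerpt does not describe; it simply assumes a plain Pearson correlation as the measure of similarity. The function names, the term labels ("term_a" and so on), and all of the numbers are hypothetical and made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def rank_terms(target, search_series):
    """Return (term, correlation) pairs, highest correlation first."""
    scored = [(term, pearson(target, series))
              for term, series in search_series.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up numbers for illustration only; not real unemployment or search data.
target = [5.0, 6.0, 8.0, 9.5, 9.0, 8.5]      # hypothetical dataset under study
search_series = {
    "term_a": [10, 12, 16, 19, 18, 17],      # rises and falls with the target
    "term_b": [30, 28, 25, 20, 21, 23],      # moves in the opposite direction
    "term_c": [7, 7, 8, 7, 8, 7],            # mostly unrelated wobble
}

for term, r in rank_terms(target, search_series):
    print(f"{term}: {r:+.2f}")
```

A high correlation in a sketch like this only says that two series moved together over the period examined; it says nothing about why.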

Stephens-Davidowitz wanted to know if Google Correlate could also help predict the unemployment rate, so he fed the United States unemployment rate from 2004 through 2011 into the system.

The search term most closely linked to unemployment? "Slutload." (It's a pornographic website.)

Stephens-Davidowitz explains: "This may seem strange at first blush, but unemployed people presumably have a lot of time on their hands. Many are stuck at home, alone and bored." In fact, another one of the highly correlated searches was "Spider Solitaire."

Stephens-Davidowitz is quick to note that these searches aren't necessarily the best way to predict the unemployment rate — but they can certainly be part of the prediction model.

The broader takeaway here is that traditional means of collecting data (i.e., surveys) aren't necessarily the most efficient or accurate. For one thing, as the title of Stephens-Davidowitz's book suggests, people aren't always truthful when they respond to survey questions.

As Stephens-Davidowitz explains later in the book, that's especially true when you're looking for data on, say, the rate of homosexuality. About five percent of pornography searches by men are for same-sex pornography — which is about twice the percentage of American men who indicate on Facebook that they're interested in men.

Stephens-Davidowitz writes: "Frequently, the value of Big Data is not its size; it's that it can offer you new kinds of information to study — information that had never previously been collected."
