“Data is the new oil” is one of those deceptively simple mantras for the modern world. Whether in The New York Times, The Economist, or WIRED, the wildcatting nature of oil exploration, plus the extractive exploitation of a trapped asset, seems like an apt metaphor for the boom in monetized data.
Antonio García Martínez (@antoniogm) is an Ideas contributor for WIRED. Previously he worked on Facebook’s early monetization team, where he headed its targeting efforts. His 2016 memoir, Chaos Monkeys, was a New York Times best seller and NPR Best Book of the Year.
The metaphor has even assumed political implications. Newly installed California governor Gavin Newsom recently proposed an ambitious “data dividend” plan, whereby companies like Facebook or Google would pay their users a fraction of the revenue derived from the users’ data. Facebook cofounder Chris Hughes laid out a similar idea in a Guardian op-ed, and compared it to the Alaskan Permanent Fund, which doles out annual payments to Alaskans based on the state’s petroleum revenue. As in Alaska, the average Google or Facebook user is conceived as standing on a vast substratum of personal data whose extraction they’re entitled to profit from.
But data isn’t the new oil, in almost any metaphorical sense, and it’s supremely unhelpful to perpetuate the analogy. Oil is literally a liquid, fungible, and transportable commodity. The global market is designed to take a barrel of oil from the Ghawar oil field in Saudi Arabia and, as frictionlessly as possible, turn it into a heated apartment in Boston or a moving commuter bus in New York. With data, by contrast, the abstract bits are functionally static.
Let’s consider this Coen brothers–esque thought experiment: Thanks to some eccentric, long-lost uncle, I inherit the Phoenix Beacon, a 50,000-ton, Panamanian-registered crude oil tanker. The thing is filled to the brim with petroleum and the captain awaits my orders. To realize my newfound wealth, I’d call refinery offloading ports and oil-futures brokers in Chicago. After much drama, I’d monetize my inheritance at the prevailing price for West Texas Intermediate light, sweet crude, multiplied by the number of barrels in the ship (minus lots of fees).
Now, let’s consider a different inheritance: Amazon sends a delivery van to my home filled with hard drives containing all its sales and user browsing data for the past year. What do I do with it?
Keep in mind, this trove is worth billions. Accounting rules don’t call (yet) for tech companies to specify their data as a separate asset on the balance sheet, but by any reasonable valuation, Amazon’s purchase data is worth an immense fortune … to Amazon.
That’s because Amazon has built an expansive ecommerce presence, a ruthlessly efficient recommendation and advertising engine, and a mind-bogglingly complex warehouse and fulfillment operation around the data on those hard drives. Ditto Google, Uber, Airbnb, and every other company you’d identify as an “oil field” in this tired metaphor.
Sure, you could maybe sell some of that data—there are companies that would love to know Amazon’s sales data or Google’s search queries or Uber’s routing and pricing history. But here’s the key thing: Those interested outside parties are competitors, and the owners of the data would never in a million years sell it. Uber isn’t selling data to Lyft, Amazon isn’t selling data to Walmart, and Airbnb sure isn’t selling user lists to Hotels.com.
But what about the market in user data we occasionally hear about, such as the sketchy business of network carriers selling your location data to middlemen, who then sell it to disreputable buyers like bounty hunters? Or mobile apps that use your location to provide you some piece of mobile utility, and then sell the latitude and longitude of your phone to other middlemen?
Indeed, there’s a market in user data, focusing on location data. Market-research firm Opimas estimates it will reach $250 million by 2020, according to The New York Times. But that’s a rounding error for the internet giants, roughly as much revenue as Google generates in 16 hours. Nonetheless, the practice is sketchy, murky, and doesn’t benefit the user. Privacy advocates and regulators are right to raise hell about this.
Where does this leave proposals around a “data dividend”? Beyond being implausible, they are problematic for several reasons.
For starters, the dividend will likely be paltry, and nowhere near the $1,600-per-person Alaska oil dividend. The annual revenue per user for Facebook globally is about $25. In the US and Canada, it’s about $130. Don’t spend it all in one place.
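To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It uses only the per-user revenue figures and the Alaska dividend quoted above. The `payout_share` parameter, and the 10 percent and 100 percent values tried below, are illustrative assumptions, since none of the proposals mentioned here names a rate.

```python
# Figures quoted in the text, in US dollars per person per year.
ALASKA_DIVIDEND = 1600
FACEBOOK_REVENUE_PER_USER = {"global": 25, "US and Canada": 130}


def data_dividend(revenue_per_user: float, payout_share: float) -> float:
    """Annual payout if a company hands back `payout_share` of per-user revenue."""
    return revenue_per_user * payout_share


# payout_share is hypothetical: 10% as a modest cut, 100% as an absolute ceiling.
for region, revenue in FACEBOOK_REVENUE_PER_USER.items():
    for share in (0.10, 1.00):
        payout = data_dividend(revenue, share)
        print(
            f"{region}: {share:.0%} payout = ${payout:.2f} "
            f"({payout / ALASKA_DIVIDEND:.1%} of the Alaska dividend)"
        )
```

Even the ceiling case, in which Facebook hands over every dollar of revenue (not profit) it earns from a US or Canadian user, comes to $130, about 8 percent of the Alaska check. A 10 percent cut of the global figure is $2.50 a year.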
That’s even assuming it’s owed to you by Facebook. Many of the high-value ad placements, such as that creepy ad for the product you browsed but didn’t buy on the web somewhere, are driven by data that Facebook doesn’t own. That outside party, be it Zappos or Walgreens, engages in some data-joining acrobatics to tell Facebook whom to show ads to, but the data itself isn’t shared with the social network; advertisers don’t trust Facebook either. Users of the social media giant may think all their photos and messages are being ground into money; in fact, those items are mostly News Feed filler to keep their eyeballs there and target them via other data.
Also, there’s a serious question of how much some of this data is worth. Take thermostat data you pass along to Nest. Your gains would have to be metered somehow. How would Nest even know the per-megabyte price?
Lastly and more foundationally, why would Facebook or Google owe you anything? It’s not like Zuckerberg the Paparazzo snapped a photo of you and then monetized your image. You willfully used a service and generated data that wouldn’t otherwise exist. What you get in return is Facebook itself, for which you’ve not paid a nickel. Ditto Uber, which uses your data to optimize a tricky two-way market in riders and drivers so you have a car nearby when you open your app. Google likewise uses your searches (and resulting clicks) as a training set for its search algorithm. None of these modern marvels is cheap to maintain. You’re not contributing to some limited pool of data on whose resulting revenue you can stake a claim; you’re an infinitesimally small part of a data cooperative whose benefits accrue to the very users that generated it.
The distinction between the service that provides user value and the previously cited bounty hunters who buy trafficked location data becomes clear when considering the two biggest triumphs of privacy legislation: the European Union’s GDPR and the California Consumer Privacy Act (CCPA). Both require data handlers to gain user consent and place various administrative hurdles around the third-party use of data. A well-known app, publisher, or online store like Facebook, The New York Times, or Amazon can easily collect consent. Who doesn’t just click Accept on all the pop-ups to get to the story or product they want? But what if some random company like LocationSmart (implicated in the bounty hunter data leak) needs to find you and collect consent? Best of luck with that.
Partly due to this legislative pressure, partly due to their failure to compete against the data majors, that third-party data ecosystem is already imploding. Acxiom, a market leader in third-party data that dates back to the direct-mail days, sold its marketing-solutions division to an ad agency. The ecosystem will be lucky to survive the coming flurry of regulation, much less grow.
As happened with the CCPA, which was inspired by a wealthy activist’s concerns around Google’s data trove, the Googles and Facebooks will lobby to aim future legislation not at themselves but at the smaller data players. This will succeed mostly in conveniently solidifying those leaders’ positions. That sounds corrupt and self-serving, and it is. But it’s the right thing for users, whose fears around their data being pimped out are legitimate in the case of third-party brokers but much less so in the case of the first-party apps they actually use.
Ultimately, the majors like Google and Facebook will raise the castle walls around their data (and users) and disclaim any knowledge of the data brokers, the “data-as-oil” traders. It’ll be first-party data all around: publishers, apps, and ecommerce companies all huddling around their data and user piles, projecting that data externally in data-safe ways if absolutely necessary, but not otherwise.
No, data isn’t the new oil. And it never will be, because the biggest data repositories don’t want it to be.