The Facebook media cycle took a head-whipping triple turn over the past few weeks. First, in a surprise move by the Trump administration, the Department of Housing and Urban Development sued the company for violations of the Fair Housing Act, alleging it engaged in discriminatory advertising practices for housing ads.
Within days, researchers at Northeastern University published a preview of a paper describing an interesting set of advertising experiments showing that, indeed, Facebook was differentially showing ads for housing and jobs by gender and race. Lastly, in April the Algorithmic Accountability Act—AAA, naturally—was introduced in both the House and Senate, requiring large tech companies to test for evidence of discriminatory bias in the artificial intelligence that runs platforms like Facebook and Google.
This drama comes on the heels of a bevy of other lawsuits alleging that Facebook abetted discriminatory advertising by allowing the targeting of housing ads by race and gender, something the company essentially admitted to doing by vowing to bar that functionality.
Antonio García Martínez (@antoniogm) is a writer and Ideas contributor for WIRED. Previously he worked on Facebook’s early monetization team, where he headed its targeting efforts. His 2016 memoir, Chaos Monkeys, was a New York Times best seller and NPR Best Book of the Year.
The HUD suit and AAA bill are really a step beyond the previous algorithmic bias lawsuits, with potentially far-reaching implications. To understand why requires digging a bit into the semantics and practice of Facebook advertising.
Start with the term targeting. As commonly used, it’s an overly broad category that can mean very different things. Within the industry, targeting refers to the data an advertiser uses to segment the audience, either data supplied by Facebook (say, 25 to 35-year-old Californians) or data that an advertiser brings to Facebook (say, the browsing and buying history of users on an ecommerce site). The key point is that whether the data comes from Facebook or the advertiser, the advertiser’s hands are on the targeting levers, deciding which set of users sees their ad.
But given the complexity of the ad buying and auction process, targeting isn’t the only thing that determines who sees an ad. Within the segment specified by the advertiser, Facebook can skew which type of user eventually views that targeted ad.
That’s typically called optimization, which is what’s at issue in the HUD suit and the AAA. Particularly for broadly targeted ads—say, every US millennial—Facebook itself further narrows the targeted set based on what it knows about the user and its platform. If the ad is about fashion, then it’ll pick users who’ve shown interest in fashion brands. If the initial run of the ad over a broad audience shows that some subsegment—say, people in Texas—engages with it more than others, then it’ll quickly start biasing the showing of the ad to Texans only.
In a world of perfect targeting, where the advertiser has absolute control of the user experience and knows all Facebook user data, no optimization would be necessary. A sophisticated advertiser could simply train the same machine-learning models that Facebook does.
In a world of perfect optimization, where Facebook knows even off-Facebook data like purchases, no targeting would be necessary. Facebook could simply take into account what the advertiser knows about what the user has bought or browsed, and provide as near-perfect an ad experience as possible.
In reality, of course, neither targeting nor optimization is perfect, so both work interdependently: The advertiser doesn’t trust Facebook enough to hand over all of its targeting data, and Facebook doesn’t want to share its optimization data (at least not intentionally) with outsiders.
What happens in the commercial reality of ad buying is the advertiser shows up with a best guess at a target audience, some ad creative, and a bid for a desired user action such as a click or app install. Facebook uses the advertiser’s targeting to whittle down the set of potential targets, and when one such user shows up, estimates how likely they are to click on the ad or download the app.
That compound estimate—the advertiser’s targeting and Facebook’s optimized guess at an interested user among that targeted set—is what ultimately decides which ad shows up in your Facebook or Instagram feed. If the advertiser targeting is hyper-precise—say, users that browsed a specific product online—then Facebook is little more than a messenger showing the same ad to that (relatively) small group of people. If the targeting is very broad, then Facebook exerts considerable control over who sees what, potentially widening its liability in the case of something like the Fair Housing Act.
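To make that division of labor concrete, here is a minimal Python sketch of the two steps: the advertiser's targeting filters the audience, and the platform's estimate ranks what is left. It is an illustration only, not Facebook's actual auction. The `User` and `Ad` classes, the `estimated_action_rate` function, and every number in it are hypothetical stand-ins, and a real auction weighs factors this sketch ignores.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class User:
    age: int
    state: str
    interests: FrozenSet[str]


@dataclass(frozen=True)
class Ad:
    name: str
    topic: str
    bid: float  # what the advertiser offers per desired action (hypothetical dollars)
    targeting: Callable[[User], bool]  # the lever the advertiser controls


def estimated_action_rate(ad: Ad, user: User) -> float:
    # Placeholder for the platform's learned model; both rates are invented.
    return 0.05 if ad.topic in user.interests else 0.01


def pick_ad(ads: List[Ad], user: User) -> Optional[Ad]:
    # Step 1 (targeting): keep only ads whose advertiser-chosen audience includes this user.
    eligible = [ad for ad in ads if ad.targeting(user)]
    if not eligible:
        return None
    # Step 2 (optimization): rank the survivors by bid times the platform's estimate.
    return max(eligible, key=lambda ad: ad.bid * estimated_action_rate(ad, user))


fashion = Ad("fashion", "fashion", 2.0, lambda u: 25 <= u.age <= 35)
housing = Ad("housing", "real estate", 5.0, lambda u: u.age >= 25)

shopper = User(age=30, state="CA", interests=frozenset({"fashion"}))
print(pick_ad([fashion, housing], shopper).name)  # prints "fashion"
```

The housing advertiser bids more, but the platform's estimate tips the choice toward the fashion ad for this user. The broader the targeting filter, the more of the final decision is made in step 2.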
The goal of programmatic advertising—the magical thing that moves hundreds of billions of dollars a year and propelled Google and Facebook to more than a combined $1 trillion in value—has been to systematically aim ads at the most receptive segment within a large population. Antithetically, the proposed regulation calls for serving ads to every segment, irrespective of performance. It is the irresistible force of regulation colliding with the (quasi-)immovable object of trillion-dollar industry practice.
Here’s the clincher: Targeting, the part of the process that advertisers themselves bring to the party, is relatively easy to regulate. If showing housing ads based on a user’s race or ethnicity is illegal, simply prohibit advertisers from targeting by race or ethnicity for those ads. Facebook already automatically detects ads for alcohol or politics, and applies different targeting rules to them. For instance, you can’t advertise alcohol to users under 21 (or of any age in countries like Saudi Arabia) on Facebook, something it vigorously enforces.
But assuring that optimization—the part of the narrowing that Facebook’s algorithms control—shows unbiased housing and jobs ads to every racial and gender group is much harder. For example, Facebook’s algorithm may well decide to aim an ad for high-end real estate at an affluent zip code, which demographically skews white, thus violating the Fair Housing Act. Facebook may not be aware that its optimization algorithm is biased; it didn’t use race explicitly in its algorithm, after all. Nonetheless it has violated the law.
Somewhat counterintuitively, making sure ads are delivered in an unbiased manner means being more aware of user race and gender to guard against bias. The Facebook optimization team could come up with a best guess as to the gender, race, and ethnicity of every user, and make sure that the delivery of ads within certain categories like housing and employment is unbiased.
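As a rough illustration of what such a check might look like, the Python sketch below compares each group's share of actual impressions with its share of the targeted audience. The function name `delivery_skew`, the group labels, and the counts are all hypothetical. The sketch also assumes every user already carries a group label, which is precisely the part that is hard to get.

```python
from collections import Counter
from typing import Dict, Iterable


def delivery_skew(audience: Iterable[str], impressions: Iterable[str]) -> Dict[str, float]:
    """For each group, divide its share of impressions by its share of the audience.

    A ratio of 1.0 means the group saw the ad in proportion to its size;
    above 1.0 means over-delivery, below 1.0 means under-delivery.
    """
    audience_counts = Counter(audience)
    impression_counts = Counter(impressions)
    audience_total = sum(audience_counts.values())
    impression_total = sum(impression_counts.values())
    if impression_total == 0:
        raise ValueError("no impressions to audit")
    return {
        group: (impression_counts[group] / impression_total) / (count / audience_total)
        for group, count in audience_counts.items()
    }


# Hypothetical housing ad: the targeted audience is split evenly between
# two groups, but delivery went 80/20.
audience = ["A"] * 50 + ["B"] * 50
impressions = ["A"] * 80 + ["B"] * 20
skew = delivery_skew(audience, impressions)
print(skew)
```

In this made-up case group A's ratio comes out near 1.6 and group B's near 0.4. What ratio should count as a violation is a legal question the sketch does not answer.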
But that’s harder than it sounds.
Sure, Facebook has so-called multicultural affinity targeting categories already—Hispanic, African American, Asian American—but it’s one thing to score all 200 million-odd US users according to some affinity and offer up the top few percent as a targetable segment. It’s quite another, and much harder, to come up with a better-than-random guess as to whether every user is of this or that race or ethnicity, particularly users for whom Facebook may not have much data.
For Facebook to come up with reliable gender and ethnicity classifications for all US users, it may well have to tap outside sources of data, particularly for users about whom it possesses little information. Consider the clever targeting jujitsu the Northeastern researchers used to create their targeted racial segments. How did outside academics target equal numbers of whites and blacks? Well, they pulled the public voter records from North Carolina, which apparently record the self-reported race and gender of voters, and then uploaded those lists of voters to Facebook to be targeted.
That targeting product is called Custom Audiences, and is commonly used by political campaigns during elections—usually to target supporters of a particular party (although they may well be using it for race as well). Facebook could do something similar, using public or private records from consumer data companies to match to its users and make educated guesses about their races or genders.
How do we like the idea of Facebook mining public records like voter rolls and property records to classify every American by gender, race, and ethnicity? Or even just building more sophisticated internal models, based on user behavior, to do the same labeling? Yeah, me neither. Solving the bias problem may well create a larger privacy mess if the company goes down this path.
More likely than all this convoluted hacking, Facebook may well decide to cancel any optimization for regulated industries such as employment and housing, in addition to restricting targeting by advertisers themselves. That would keep the company from helping any advertisers trying to violate the letter of the Fair Housing Act.
Then again, it would also harm legitimate businesses that attempt to reach often underserved minority markets—say, the Hispanic-oriented mortgage arm of a large national bank like Wells Fargo. Effectively, nobody could target, and Facebook would help nobody optimize: The online advertising efforts of those industries would return to the old days of scattershot advertising.
It’s worth noting that if this regulatory trend becomes well established and more generalized, it could have implications way beyond Facebook. Consider a magazine advertiser who chooses to publicize senior executive positions in male-oriented Esquire but not in female-oriented Marie Claire. Since magazine publishers commonly flaunt their specific demos in sales pitch decks, it’s easy for advertisers to segment audiences. Is that advertiser violating the spirit of the law? I would say so. Should the government enforce the law as it does with Facebook? Again, I would say so.
But how would that even work? Such old-school ad buys happen over email, the phone, and handshakes, and not in some centralized database with searchable performance and delivery stats such as Facebook’s.
In the early days of online advertising, the big pitch to advertisers was how trackable and targetable everything online was, unlike the dated and analog world of print and TV. That same ability to track every ad impression and pair of eyeballs also means it’s much easier to regulate. In the future, those who do the most tracking of all, namely Facebook and Google, may also be subject to the most regulation, whether they like it or not. For once, all their data will be working against them.