‘I Want to Burn Things to the Ground’

Kevin Van Aelst for The Chronicle

The mic is passed and the psychologists rise, one by one, to explain why they’re here. Some reasons are kind of funny ("I’m here because it was better than sitting in my office and swearing"); others are heartfelt ("I’m here because I want to trust science again"); a few come tinged with regret ("I’m here to atone for the sins earlier in my career"). And some betray frustration, even anger: "I’m here," one researcher says, "because I want to burn things to the ground."

The mission statement of the Society for the Improvement of Psychological Science doesn’t mention anything about arson. Instead it states, in more measured tones, that the organization, known as SIPS, is dedicated to rigor, openness, and the "refinement of knowledge." The couple hundred attendees at its fourth annual gathering, held recently in Grand Rapids, Mich., were mostly in their 20s and 30s, plenty of postdocs and assistant profs, along with a sprinkling of senior academics. They listened to presentations with nonthreatening subjects like "Using information-theoretic approaches for model selection" and "Assessing the validity of widely used ideological instruments." There was enthusiastic chatter about a new project, called the Psychological Science Accelerator, that involves multiple laboratories coordinating data collection. All of which sounds serious, scholarly, and completely harmless.

So what’s with the talk of burning things to the ground?

As you’ve no doubt heard by now, social psychology has had a rough few years. The trouble concerns the replication crisis, a somewhat antiseptic phrase that refers to the growing realization that often the papers published in peer-reviewed journals — papers with authoritative abstracts and nifty-looking charts — can’t be reproduced. In other words, they don’t work when scientists try them again. If you wanted to pin down the moment when the replication crisis really began, you might decide it was in 2010, when Daryl Bem, a Cornell psychologist, published a paper in The Journal of Personality and Social Psychology that purported to prove that subjects could predict the future. Or maybe it was in 2012, when researchers failed to replicate a much-heralded 1996 study by John Bargh, a Yale psychologist, that claimed to show that reading about old people made subjects walk more slowly.

And it’s only gotten worse. Some of the field’s most exciting and seemingly rock-solid findings now appear sketchy at best. Entire subfields are viewed with suspicion. It’s likely that many, perhaps most, of the studies published in the past couple of decades are flawed. Just last month the Center for Open Science reported that, of 21 social-behavioral-science studies published in Science and Nature between 2010 and 2015, researchers could successfully replicate only 13 of them. Again, that’s Science and Nature, two of the most prestigious scientific journals around.

If you’re a human interested in reliable information about human behavior, that news is probably distressing. If you’re a psychologist who has built a career on what may turn out to be a mirage, it’s genuinely terrifying. The replication crisis often gets discussed in technical terms: p-values, sample sizes, and so on. But for those who have devoted their lives to psychology, the consequences are not theoretical, and the feelings run deep. In 2016, Susan Fiske, a Princeton psychologist, used the phrase "methodological terrorism" to describe those who dissect questionable research online, bypassing the traditional channels of academic discourse (one researcher at SIPS, who asked not to be identified, wore a T-shirt to the conference emblazoned with the words "This Is What a Methodological Terrorist Looks Like"). Fiske wrote that "unmoderated attacks" were leading psychologists to abandon the field and discouraging students from pursuing it in the first place.

Psychologists like Fiske argue that these data-crunching critics, like many of the attendees at SIPS, paint far too dark a portrait of the field. Yes, there are lousy studies that slip through the peer-review net and, sure, methods can always be improved. Science progresses in fits and starts, with inevitable missteps along the way. But they complain that the tactics of the reformers — or terrorists, take your pick — can be gleefully aggressive, that they’re too eager to, well, burn things to the ground. The handful of researchers who make it their mission to unearth and expose examples of psychology’s failings come in for particular scorn. As one tenured professor I spoke with recently put it, "I think they’re human scum."

James Heathers is a jovial, bearded Australian who loves cats. He is a postdoc at Northeastern University with a Ph.D. in cardiac psychophysiology; when he’s not ranting about subpar research practices on Everything Hertz, the podcast he co-hosts, he’s hunting for connections between emotion and heartbeat variability. He’s been working, along with his fellow data thugs — a term Heathers coined, and one that’s usually (though not always) employed with affection — on something called Sample Parameter Reconstruction via Iterative Techniques, or SPRITE. Basically, SPRITE is a computer program that can be used to see whether survey results, as reported in a paper, appear to have been fabricated. It can do this because results usually follow certain statistical patterns, and people who massage data frequently fail to fake it convincingly. During a SIPS session, Heathers explained SPRITE with typical élan: "Sometimes you push the button and it says, ‘Here’s a forest of lunatic garbage.’ "

The data thugs tend to be portrayed as fringe academics who get their jollies from viciously mocking terrible science. And there is some of that. One evening at the SIPS conference, after sessions had been concluded and beers consumed, a researcher asked Heathers for his opinion of a study that seemed suspicious. When Heathers decided he’d heard enough to render a verdict, he took a few steps back and began to shout.

"Give me a B!"

"B!" the assembled scientists replied.

"Give me a U!"

"U!"

"Give me an L!"

"L!"

"Give me an L!"…

There are more sedate ways of declaring your doubts about a study, but few more memorable than spelling out a profanity cheerleader-style.

That irreverence, however, shouldn’t be mistaken for a casual attitude toward their work. These guys stay busy. Along with SPRITE, they’ve come up with another piece of software, called GRIM, which stands for Granularity-Related Inconsistency of Means, that tests to see whether certain figures reported in a paper are mathematically possible. If they’re not, then the authors either miscalculated or made it up. No one asked these freelancing scientists to create these tools. Journals and authors routinely refuse to cooperate with them. As career moves go, there are savvier ones than telling people who might hire you that they’re full of it.

"There’s no organization, there is no structure, there is no backup, no retirement plan, no money, no incentives, and often there is not even a whole lot of agreement between us," Tim van der Zee, a doctoral student at Leiden University, in the Netherlands, told me. He helped uncover and catalog the stunning multitude of discrepancies in the work of Brian Wansink, a Cornell University psychologist whose speciality is food research. The takedown of Wansink, which led to retractions and a continuing investigation by Cornell, was a group effort that exposed slipshod methods and sloppy math going back years. In one sense, it was the story of what went wrong in a single laboratory. But it also said something about the field. Why did it take a bunch of part-time outsiders to find the truth? Where was Cornell? Where were the editors and reviewers?

And how many other Wansinks are out there?

Van der Zee describes himself as "an utterly irrelevant Ph.D. student at some random university who just happened to have stumbled upon errors and decided to say something about it." That’s the back story of most data thugs, including Jordan Anaya, who participated in the skewering of Wansink and worked to improve GRIM. Anaya is a biologist by training, a self-taught programmer, and the other data thugs say he’s the strongest mathematician among them. He’s also unemployed at the moment. He dropped out of a graduate program at the University of Virginia because he felt that the scientific standard was "not high enough for me to continue wasting any more of my time there." At least that’s what he wrote in the "About" section of his website, Omnes Res ("all things"). When I met him at a mall near his apartment in Baltimore, he told me he was looking for a university job, thinking about his next move. He spends his days playing basketball, working out, and devoting hours and hours to his unpaid gig as a soldier in the replication rebellion. Not long ago he posted a photo of himself on Twitter lying in bed, with the caption, "Me after a successful night of data thugging."

Anaya’s view of psychology, and academic research generally, echoes the combustible language at SIPS. "The system is so broken right now — academia, the tenure system, everything just needs to get burned to the ground," he says. "We need to start from scratch."

Nick Brown doesn’t think that the lack of funding or institutional support for data thuggery is necessarily a hindrance. In fact, Brown, a graduate student at the University of Groningen, in the Netherlands, believes that independence is probably crucial. Once the data thugs are forced to answer to some authority, Brown worries, there will be limits on whom they criticize and how they level that criticism. They will become more deferential and less daring.

As it stands, someone will email Brown about a finding that’s not quite right, he will poke around, share it with his fellow data thugs via a near-daily email chain, and then blog about it. It might be a day or two from tip to post. A tongue-in-cheek headline on one of Brown’s recent posts reads: "This researcher compared two identical numbers. The effect size he obtained will shock you!"

In his Twitter bio, Brown calls himself a "self-appointed data police cadet." He’s being modest: Data Thug-in-Chief would be more accurate. Brown’s been at the center of seemingly every major psychology dust-up for the past several years. He was deeply involved in ferreting out Wansink’s misdeeds and scrutinizing the claims of Amy Cuddy, a Harvard psychologist, regarding the wonder-working power of standing with your hands on your hips. He collaborated with Alan Sokal, the physicist who in the 1990s tricked a postmodern culture journal into printing gibberish, to expose the fuzzy mathematics behind a supposed ratio for human flourishing. He also translated the memoir/confession of Diederik Stapel, perhaps the most brazen fraudster in the history of social psych, from Dutch into English and posted it free on his website.

When I spoke with Brown at the conference, he was wearing a SIPS T-shirt and a kilt. He’s in his 50s and spent a career in the corporate world, working in IT departments, before retiring to pursue his Ph.D. In the course of our conversation, he compared the psychology establishment to the tobacco industry and to a house of cards "where 90 percent of people are doing things wrong."

And yet, Brown argues, much of the field carries on as if all is well. "It’s like global warming," he says. "It’s a huge crisis, but there’s still food at the stores. If you’re an average hack scientist churning out papers, have you noticed a problem getting published? No. So you think, ‘What’s the problem here?’ "

Not everyone thinks it’s a huge crisis, or even a crisis at all. I spoke with several researchers who complain that the real problem is the replication movement itself. They say a field that once seemed ascendant, its latest findings summarized in magazines and turned into best sellers, is now dominated by backbiting and stifled by fear. Psychologists used to talk about their next clever study; now they fret about whether their findings can withstand withering scrutiny. "You have no idea how many people are debating leaving the field because of these thugs," a tenured psychologist, the same one who calls them "human scum," told me. "They’re making people not believe in science. Why would you want to be in a field that’s so mean?"

That psychologist didn’t want me to use his name, because he’s afraid that the data thugs will come after him. ("They’re vindictive little bastards," he says.) He doesn’t object to researchers who are guilty of fraud getting taken to task. They deserve it. But he says psychologists whose methods fail to live up to newly enshrined standards are being turned into punching bags. The thugs "lump people who falsify data and plagiarize someone else’s work with someone who makes small mistakes," he said. "They act like they’re all the spawn of the devil."

I asked him if he thinks about leaving psychology. "Every day," he replied.

Lisa Feldman Barrett also hears from psychologists who consider giving up. In 2015, Barrett wrote an op-ed for The New York Times with the headline "Psychology Is Not in Crisis." Her essay came on the heels of a project, led by Brian Nosek, executive director of the Center for Open Science, that attempted to replicate 100 experiments selected from three respected psychology journals. What that project found was jaw-dropping: Sixty percent of them couldn’t be replicated. However, Barrett, who is a professor of psychology at Northeastern University and president-elect of the Association for Psychological Science, made the case that this was not proof of the sky falling, but rather indicative of "the wonderfully twisty path" that is scientific progress.

I was curious whether now, three years later, she felt the same. She told me that while she’s in favor of "anything that tightens the ship and makes scientists more accountable," she still thinks that the crisis narrative is overblown. "Are there things we can do better? Sure, but that’s always been true," she said. "I don’t think it’s any worse than it used to be. And I don’t think it’s worse than any other science."

As Barrett sees it, some of what the data thugs do "borders on harassment." The prime example is that of Amy Cuddy, whose power-pose study was the basis for a TED talk that’s been viewed more than 48 million times and led to a best-selling book, Presence (Little, Brown & Company, 2015). The 2010 study has failed to replicate, and the first author, Dana Carney, a psychologist at Berkeley, no longer believes in the effect. The power-pose study is held up as an example of psychology at its most frivolous and unreliable. Cuddy, though, has not renounced the research and has likened her treatment to bullying. She recently tweeted: "People who want to destroy often do so with greater passion and energy and time than people who want to build." Some psychologists, including Barrett, see in the ferocity of that criticism an element of sexism. It’s true that the data thugs tend to be, but are not exclusively, male — though if you tick off the names of high-profile social psychologists whose work has been put through the replication wringer, that list has lots of men on it, too. Barrett thinks the tactics of the data thugs aren’t creating an atmosphere for progress in the field. "It’s a hard enough life to be a scientist," she says. "If we want our best and brightest to be scientists, this is not the way to do it."

Richard Nisbett agrees. Nisbett has been a major figure in psychology since the 1970s. He’s co-director of the Culture and Cognition program at the University of Michigan at Ann Arbor, author of books like Mindware: Tools for Smart Thinking (Farrar, Straus, and Giroux, 2015), and a slew of influential studies. Malcolm Gladwell called him "the most influential thinker in my life." Nisbett has been calculating effect sizes since before most of those in the replication movement were born.

And he’s a skeptic of this new generation of skeptics. For starters, Nisbett doesn’t think direct replications are efficient or sensible; instead he favors so-called conceptual replication, which is more or less taking someone else’s interesting result and putting your own spin on it. Too much navel-gazing, according to Nisbett, hampers professional development. "I’m alarmed at younger people wasting time and their careers," he says. He thinks that Nosek’s ballyhooed finding that most psychology experiments didn’t replicate did enormous damage to the reputation of the field, and that its leaders were themselves guilty of methodological problems. And he’s annoyed that it’s led to the belief that social psychology is riddled with errors. How do they know that?, Nisbett asks, dropping in an expletive for emphasis.

Simine Vazire has heard that argument before. Vazire, a professor of psychology at the University of California at Davis, and one of the SIPS organizers, regularly finds herself in meetings where no one shares her sense of urgency about the replication crisis. "They think the status quo is fine, and we can make tweaks," she says. "I’m often the only person in the room who thinks there’s a big problem."

It’s not that the researchers won’t acknowledge the need for improvement. Who’s against progress? But when she pushes them on what that means, the division becomes apparent. They push back on reforms like data transparency (sharing your data freely with other researchers, so they can check your work) or preregistration (saying publicly what you’re trying to discover in your experiment before you try to discover it). That’s not the way it’s normally been done. Psychologists tend to keep their data secret, arguing that it’s proprietary or that revealing it would endanger subjects’ anonymity. But not showing your work makes it easier to fudge what you found. Plus the freedom to alter your hypothesis is what leads to so-called p-hacking, which is shorthand for when a researcher goes searching for patterns in statistical noise.

On her blog, named "sometimes i’m wrong," Vazire posted an oath for scientists that includes pledges to "not suppress evidence against my conclusions" and to "correct my past claims if I learn that they were wrong." There’s also this line: "I will recognize as valuable the work of scientists who aim to correct errors in the scientific record." Vazire has heard from colleagues who think that the replication movement is sucking the joy out of science. She agrees that might be so. But she doesn’t find it a persuasive argument for pretending that everything’s fine. "There’s an assumption that science should be fun and exhilarating, and the things we’re proposing make things less fun," she says. "That’s true. But we probably haven’t been doing enough of eating our vegetables."

Steve Lindsay started eating his vegetables a few years ago. In 2012, Lindsay, a professor of psychology at the University of Victoria and editor of Psychological Science, one of the field’s top journals, heard a presentation on the use of statistics in research. It was a revelation. "I realized I had been running underpowered experiments for my entire career," he says. "I thought ‘Oh! That’s why I get such inconsistent results!’"

He’s now using larger sample sizes, and in his role as editor he’s tried to encourage authors to make their data more easily available. Still, Lindsay talks with psychologists all the time who aren’t eager to embrace the updated rules, and he understands why. "Our literature is packed with unreliable findings," he says. "And I can imagine if you hitched your whole wagon to a concept that doesn’t seem to be a real thing, that could be threatening."

The replication crisis may turn out to be less a footnote in the history of science than a defining era. And it’s a safe assumption that there’s more reckoning to come. "You’re not going to click your fingers and get your revolution," says James Heathers. "It’s going to be weird and slow, and when it’s done, people aren’t going to fully appreciate the fact that it happened."

Like Heathers, Nick Brown sometimes shakes his head at the reluctance among researchers to acknowledge what, to him, seems obvious. To continue to defend a system that’s churned out stacks upon stacks of hopelessly flawed papers, rather than to own up to the truth and try to fix it, seems pointless. "I don’t know whether they genuinely believe they’re doing the right thing or there’s a sort of doubt niggling at the back of their mind, but they don’t want to acknowledge it," Brown says. "Maybe the people who need to make those changes, in that deep, dark moment before they go to sleep, they think to themselves, ‘How are we going to get out of this?’"

Correction (9/12/2018, 6:40 p.m.): Simine Vazire is a professor, not an associate professor, at the University of California at Davis. The text has been corrected accordingly.

Tom Bartlett is a senior writer who covers science and other things. Follow him on Twitter @tebartl.