"The Book of Why" by Pearl and Mackenzie - Statistical Modeling, Causal Inference, and Social Science

By Andrew

Judea Pearl and Dana Mackenzie sent me a copy of their new book, “The book of why: The new science of cause and effect.”

There are some things I don’t like about their book, and I’ll get to that, but I want to start with a central point of theirs with which I agree strongly.

Division of labor

A point that Pearl and Mackenzie make several times, even if not quite in this language, is that there’s a division of labor between qualitative and quantitative modeling.

The models in their book are qualitative, all about the directions of causal arrows. Setting aside any problems I have with such models (I don’t actually think the “do operator” makes sense as a general construct, for reasons we’ve discussed in various places on this blog from time to time), the point is that these are qualitative, on/off statements. They’re “if-then” statements, not “how much” statements.

Statistical inference and machine learning focuses on the quantitative: we model the relationship between measurements and the underlying constructs being measured; we model the relationships between different quantitative variables; we have time-series and spatial models; we model the causal effects of treatments and we model treatment interactions; and we model variation in all these things.

Both the qualitative and the quantitative are necessary, and I agree with Pearl and Mackenzie that typical presentations of statistics, econometrics, etc., can focus way too strongly on the quantitative without thinking at all seriously about the qualitative aspects of the problem. It’s usually all about how to get the answer given the assumptions, and not enough about where the assumptions come from. And even when statisticians write about assumptions, they tend to focus on the most technical and least important ones, for example in regression focusing on the relatively unimportant distribution of the error term rather than the much more important concerns of validity and additivity.

If all you do is set up probability models, without thinking seriously about their connections to reality, then you’ll be missing a lot, and indeed you can make major errors in casual reasoning, as James Heckman, Donald Rubin, Judea Pearl, and many others have pointed out. And indeed Heckman, Rubin, and Pearl have (each in their own way) advocated for substantive models, going beyond data description to latch on to underlying structures of interest.

Pearl and Mackenzie’s book is pretty much all about qualitative models; statistics textbooks such as my own have a bit on qualitative models but focus on the quantitative nuts and bolts. We need both.

Judea Pearl, like Jennifer Hill and Frank Sinatra, are right that “you can’t have one without the other”: If you think you’re working with a purely qualitative model, it turns out that, no, you’re actually making lots of data-based quantitative decisions about which effects and interactions you decide are real and which ones you decide are not there. And if you think you’re working with a purely quantitative model, no, you’re really making lots of assumptions (causal or otherwise) about how your data connect to reality.

The Book of Why

Pearl and Mackenzie’s book is really three books woven together:

1. An exposition of Pearl’s approach to causal inference based on graphs and the do-operator.

2. An intellectual history of this and other statistical approaches to causal inference.

3. A series of examples including some interesting discussions of smoking and cancer, going far beyond what you’ll generally see in a popular book or a textbook on statistics or causal inference.

About the exposition of causal inference, I have little to say. As regular readers of this blog know, I have difficulty understanding the point of Pearl’s writing on causal inference (see, for example, here). Just as Pearl finds it baffling that statisticians keep taking causal problems and talking about them in the language of statistical models, I find it baffling that Pearl and his colleagues keep taking statistical problems and, to my mind, complicating them by wrapping them in a causal structure (see, for example, here).

I’m not saying that I’m right and Pearl is wrong here—lots of thoughtful people find Pearl’s ideas valuable, and I accept that, for many people, Pearl’s approach is a good way—perhaps the best way—to frame causal inference. I’m just saying that I don’t really have anything more to say on the topic.

About the intellectual history of causal inference: this is interesting. I disagree with a lot of what Pearl says, but I guess that’s kinda the point, as Pearl is fighting against the statistics establishment, which I’m part of. For example, there’s this from the promotional material that came with the book:

Using a calculus of cause and effect developed by Pearl and others, scientists now have the ability to answer such questions as whether a drug cured an illness, when discrimination is to blame for disparate outcomes, and how much worse global warming can make a heat wave.

Ummm, I’m pretty sure that scientists could do all these without the help of Pearl! Actually, for that last one, I think the physicists don’t really need statisticians at all.

On page 66 of the book, Pearl and Mackenzie write that statistics “became a model-blind data reduction enterprise.” Hey! What the hell are you talking about?? I’m a statistician, I’ve been doing statistics for 30 years, working in areas ranging from politics to toxicology. “Model-blind data reduction”? That’s just bullshit. We use models all the time. There are some statisticians who avoid models, or at least there used to be—I used to work in the same department with some of them—but that’s really a minority view within the field. Statisticians use models all the time, statistics textbooks are full of models, and so forth.

The book has a lot more examples along these lines, and I’ll append them to the end of this post.

I think the key line in the Pearl and Mackenzie book comes on page 90, where they write, “Linguistic barriers are not surmounted so easily.” My many long and frustrating exchanges with Pearl have made me realize how difficult it is to have a conversation when you can’t even agree on the topic to be discussed!

Although I was bothered by a lot of Pearl and Mackenzie’s offhand remarks, I could well imagine that their book could be valuable to outsiders who want to get a general picture of causal reasoning and its importance. In some sense I’m too close to the details. The big picture, if you set aside disputes about who did what when, is important, no matter what notation or language is used to frame it.

And that brings me to the examples in the book. These are great. I find some of the reasoning hard to follow, and Pearl and Mackenzie’s style is different from mine, but that’s fine. The examples are interesting and they engage the reader—at least, they engage me—and I think they are a big part of what makes the book work.

Mischaracterizations of statistics and statisticians

As noted above, Pearl and Mackenzie have a habit of putting down statisticians in a way that seems to reflect ignorance of our field.

On page 356, they write, “Instead of seeing the difference between populations as a threat to the ‘external validity’ of a study, we now have a methodology for establishing validity in situations that would have appeared hopeless before.” No, they would not have appeared hopeless before, at least not to statisticians who knew about regression models with interactions, poststratification, and multilevel models.

Again, on page 357: “the culture of ‘external validity’ is totally preoccupied with listing and categorizing the threats to validity rather than fighting them.” No. You could start with Jennifer’s 2011 paper, for example.

On page 371, they thank Chris Winship, Steven Morgan, and Felix Elwert “for ushering social science into the age of causation.” Look. I think Winship, Morgan, and Elwert are great. I positively reviewed Morgan and Winship’s book. But even they wouldn’t say “they ushered social science into the age of causation.” Social scientists have been doing good work on causation long before Winship, Morgan, and Elwert came along. By writing the above sentence, Pearl and Mackenzie are just gratuitously insulting all the social scientists who came before these people. It’s kind of like when Kingsley Amis sang the praises of Ian Fleming and Dick Francis: that was his poke in the eye to all the other authors out there.

I don’t know which lines of the book were written by Pearl, which by Mackenzie, and which by both. In any case, I find it unfortunate that they feel the need to keep putting down statisticians and social scientists. If they were accurate in their putdowns, I’d have no problem. But that’s not what’s happening here. Kevin Gray makes a similar point here, from the perspective of a non-academic statistician.

Look. I know about the pluralist’s dilemma. On one hand, Pearl believes that his methods are better than everything that came before. Fine. For him, and for many others, they are the best tools out there for studying causal inference. At the same time, as a pluralist, or a student of scientific history, we realize that there are many ways to bake a cake. It’s challenging to show respect to approaches that you don’t really work for you, and at some point the only way to do it is to step back and realize that real people use these methods to solve real problems. For example, I think making decisions using p-values is a terrible and logically incoherent idea that’s led to lots of scientific disasters; at the same time, many scientists do manage to use p-values as tools for learning. I recognize that. Similarly, I’d recommend that Pearl recognize that the apparatus of statistics, hierarchical regression modeling, interactions, poststratification, machine learning, etc etc., solves real problems in causal inference. Our methods, like Pearl’s, can also mess up—GIGO!—and maybe Pearl’s right that we’d all be better off to switch to his approach. But I don’t think it’s helping when he gives out inaccurate statements about what we do.

P.S. I also noticed a couple of what seem to be technical errors. No big deal, we all make mistakes, and there’s plenty of time to correct them for the second edition.

– Figure 2.3 of the book reproduces Galton’s classic regression of adult children’s heights on the average of parents’ heights. But even though the graph is clearly labeled “mid-parents,” in the caption and the text, Pearl and Mackenzie refer to it as “father’s height.”

Here’s one lesson you can learn from statisticians: Look at the data. (I’m also suspicious of Figure 2.2, as the correlation it shows between fathers’ and sons’ heights looks too large to me. But I could be wrong here; I guess it would be good to know where the data for that graph came from.)

This is not a big deal, but it shows a lack of care. Someone had to write that passage, and there’s no way to get it wrong if you read Galton’s graph carefully. What’s the point of reproducing the graph in your book if you don’t even look at it?

– From the caption of Figure 4.3: “R. A. Fisher with one of his many innovations: a Latin square . . . Such designs are still used in practice, but Fisher would lager argue convincingly that a randomized design is even more effective.”

Huh? A Latin square design is randomized. It’s a restricted randomization. Just read any textbook on experimental design. Or maybe I’m missing something here?

P.S. More from Pearl here.