

Artificial intelligence researchers have made impressive use of Bayesian inference as a means of approximating some human capabilities, which raises the question: do human judgements and decisions adhere to similar rules? Robert Bain delves into the arguments, both for and against


The human brain is made up of 90 billion neurons connected by more than 100 trillion synapses. It has been described as the most complicated thing in the world, but brain scientists say that is wrong: they think it is the most complicated thing in the known universe. Little wonder, then, that scientists have such trouble working out how our brain actually works. Not in a mechanical sense: we know, roughly speaking, how different areas of the brain control different aspects of our bodies and our emotions, and how these distinct regions interact. The questions that are more difficult to answer relate to the complex decision‐making processes each of us experiences: how do we form beliefs, assess evidence, make judgements, and decide on a course of action?


Figuring that out would be a great achievement, in and of itself. But this has practical applications, too, not least for those artificial intelligence (AI) researchers who are looking to transpose the subtlety and adaptability of human thought from biological “wetware” to computing hardware.

In looking to replicate aspects of human cognition, AI researchers have made use of algorithms that learn from data through a process known as Bayesian inference. Based on Bayes’ famous theorem (see box), Bayesian inference is a method of updating beliefs in the light of new evidence, with the strength of those beliefs captured using probabilities. As such, it differs from frequentist inference, which focuses on how frequently we might expect to observe a given set of events under specific conditions.

Bayesian inference is seen as an optimal way of assessing evidence and judging probability in real‐world situations; “optimal” in the sense that it is ideally rational in the way it integrates different sources of information to arrive at an output. In the field of AI, Bayesian inference has been found to be effective at helping machines approximate some human abilities, such as image recognition. But are there grounds for believing that this is how human thought processes work more generally? Do our beliefs, judgements, and decisions follow the rules of Bayesian inference?

For the clearest evidence of Bayesian reasoning in the brain, we must look past the high‐level cognitive processes that govern how we think and assess evidence, and consider the unconscious processes that control perception and movement.

Professor Daniel Wolpert of the University of Cambridge's neuroscience research centre believes we have our Bayesian brains to thank for allowing us to move our bodies gracefully and efficiently – by making reliable, quick‐fire predictions about the result of every movement we make.

Imagine taking a shot at a basketball hoop. In the context of a Bayesian formula, the “belief” would be what our brains already know about the nature of the world (how gravity works, how balls behave, every shot we have ever taken), while the “evidence” is our sensory input about what is going on right now (whether there is any breeze, the distance to the hoop, how tired we are, the defensive abilities of the opposing players). Wolpert, who has conducted a number of studies on how people control their movements, believes that as we go through life our brains gather statistics for different movement tasks, and combine these in a Bayesian fashion with sensory data, together with estimates of the reliability of that data. “We really are Bayesian inference machines,” he says.
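To make this concrete, here is a minimal sketch in Python of the kind of computation Wolpert describes – not his model, just the textbook Bayesian combination of a Gaussian prior with a Gaussian sensory reading, using invented numbers for the basketball example:

```python
# A sketch of Bayesian cue combination with invented numbers: a learned
# (prior) estimate of the distance to the hoop is combined with a noisy
# sensory reading, each weighted by its reliability (inverse variance).
prior_mean, prior_var = 4.0, 1.0    # what past experience says (metres)
sense_mean, sense_var = 4.6, 0.25   # what the senses say right now

w = prior_var / (prior_var + sense_var)       # weight given to the senses
posterior_mean = w * sense_mean + (1 - w) * prior_mean
posterior_var = prior_var * sense_var / (prior_var + sense_var)
print(posterior_mean, posterior_var)  # 4.48 0.2 – pulled towards the more reliable cue
```

The posterior estimate sits closer to whichever source is less noisy, and its variance is smaller than either input's – exactly the “reliability‐weighted” blending of prior knowledge and sensory data that the studies describe.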

To demonstrate this, Wolpert and his team invited research subjects to their lab to undergo tests based on a cutting‐edge neuroscientific technique: tickling.


As every five‐year‐old knows, other people can easily have you in fits of laughter by tickling you, but you cannot tickle yourself. Until now. Wolpert's team used a robot arm to mediate and delay people's self‐tickling movements by fractions of a second – which turned out to be enough to bring the tickly feeling back.

The problem with trying to tickle yourself is that your body is so good at predicting the results of your movement and reacting that it cancels out the effect. But by delaying people's movements, Wolpert was able to mess with the brain's predictions just enough to bring back the element of surprise. This revealed the brain's highly attuned estimates of what will happen when you move your finger in a certain way, which are very similar to what a Bayesian calculation would produce from the same data.1

Bayes’ theorem provides us with the means to calculate the probability of certain events occurring conditional on other events that may occur – expressed as a probability between 0 and 1. Consider the following formula:

P(B|E) = P(E|B) × P(B) / P(E)

This tells us the probability of event B occurring given that event E has happened. This is known as a conditional probability, and it is derived by multiplying the conditional probability of E given B by the probability of event B, and dividing by the probability of event E.

The same basic idea can be applied to beliefs. In this context, P(B|E) is interpreted as the strength of belief B given evidence E, and P(B) is our prior level of belief before we came across evidence E.

Using Bayes’ theorem, we can turn our “prior belief” into a “posterior belief”. When new evidence arises, we can repeat the calculation, with our last posterior belief becoming our next prior. The more evidence we assess, the sharper our judgements should get.
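As an illustration of this updating cycle, here is a minimal sketch in Python using an invented example – deciding whether a coin is fair or biased towards heads. The numbers are assumptions for illustration, not taken from the article:

```python
# A sketch of sequential Bayesian updating with an invented example:
# belief B is "this coin is fair"; each toss is a new piece of evidence E.
p_heads_if_fair = 0.5     # P(E=heads | B)
p_heads_if_biased = 0.8   # P(E=heads | not-B), an assumed bias towards heads

belief = 0.5              # P(B): prior belief that the coin is fair
for toss in ["H", "H", "T", "H", "H"]:
    if toss == "H":
        lik_fair, lik_biased = p_heads_if_fair, p_heads_if_biased
    else:
        lik_fair, lik_biased = 1 - p_heads_if_fair, 1 - p_heads_if_biased
    evidence = lik_fair * belief + lik_biased * (1 - belief)  # P(E)
    belief = lik_fair * belief / evidence  # Bayes: posterior becomes next prior
    print(f"after {toss}: P(fair) = {belief:.3f}")
```

Each pass through the loop is one application of the theorem: the posterior from the last toss serves as the prior for the next, so the belief sharpens as evidence accumulates.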

A study by Erno Téglás et al., published in Science, looked at one‐year‐old babies and how they learn to predict the movement of objects – measured by how long they looked at the objects as the objects moved around in different patterns. The study found that the babies developed “expectations consistent with a Bayesian observer model”, mirroring “rational probabilistic expectations”.2 In other words, the babies learned to expect certain patterns of movement, and their expectations were consistent with a Bayesian analysis of the probability of different patterns occurring.

Other researchers have found indications of Bayesianism in higher‐level cognition. A 2006 study by Tom Griffiths of the University of California, Berkeley, and Josh Tenenbaum of MIT asked participants to predict how long people would live, how much money films would make, and how long politicians would last in office. The only data they were given to work with was the running total so far: current age, money made to date, and years served in office so far. People's predictions, the researchers found, were very close to those derived from Bayesian calculations.3

This suggests the brain not only has mastered Bayes’ theorem, but also has finely‐tuned prior beliefs about these real‐life phenomena, based on an understanding of the different distribution patterns of human life‐spans, box office takings, and political tenure. This is one of a number of studies that provide evidence of probabilistic models underlying the way we learn and think.

However, the route the brain takes to reach such apparently Bayesian conclusions is not always obvious. The authors of the study involving the babies said it was “unclear how exactly the workings of our model correspond to the mechanisms of infant cognition”, but that the close fit between the model and the observed data about the babies’ predictions suggests “at least a qualitative similarity between the two”.2

To achieve such Bayesian reasoning, the brain would have to develop some kind of algorithm, manifested in patterns of neural connections, to represent or approximate Bayes’ theorem. Eric‐Jan Wagenmakers, an experimental psychologist at the University of Amsterdam, says the nature of the brain makes it difficult to imagine something as neat and elegant as Bayes’ theorem being reflected in a form we would recognise. We are dealing, he says, with “an unsupervised network of very stupid individual units that is forced by feedback from its environment to mimic a system that is Bayesian”.

Before we accept the Bayesian brain hypothesis wholeheartedly, there are a number of strong counter‐arguments that must be considered. For starters, it is fairly easy to come up with probability puzzles that should yield to Bayesian methods, but that regularly leave many people flummoxed. For instance, many people will tell you that if you toss a series of coins, getting all heads or all tails is less likely than getting, for instance, tails–tails–heads–tails–heads. It is not, and probability theory shows why: as the coin tosses are independent, every specific sequence of five tosses is equally likely, each occurring with probability (1/2)⁵ = 1/32.
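A quick illustrative check in Python (the sequences are examples, not data from any study):

```python
# Illustrative check: with a fair coin and independent tosses, any
# particular sequence has probability (1/2)**n, regardless of pattern.
for seq in ["HHHHH", "TTHTH"]:
    prob = 0.5 ** len(seq)
    print(seq, prob)  # both print 0.03125
```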

There is also the well‐known Monty Hall problem, which can fool even trained minds. Participants are asked to pick one of three doors – A, B, or C – behind one of which lies a prize. The host of the game then opens one of the non‐winning doors (say, C), after which the contestant is given the choice of whether to stick with their original door (say, A) or switch to the other unopened door (say, B). Long story short: most people think switching will make no difference, when in fact it improves your chances of winning. Mathematician Keith Devlin has shown how you can use Bayes’ formula to work this out, so that the prior probability that the prize is behind door B, 1 in 3, becomes 2 in 3 when door C is opened to reveal no prize. The details are online (bit.ly/28J37bX) and they are fairly straightforward. Surely a Bayesian brain would be perfectly placed to cope with calculations such as these?
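For readers who prefer simulation to algebra, here is a minimal sketch in Python that plays the game many times; the door labels and trial count are illustrative choices, not part of Devlin's derivation:

```python
import random

def play(switch: bool) -> bool:
    """Play one round of Monty Hall; return True if the contestant wins."""
    doors = ["A", "B", "C"]
    prize = random.choice(doors)
    pick = random.choice(doors)
    # The host opens a non-winning door that isn't the contestant's pick.
    opened = random.choice([d for d in doors if d != pick and d != prize])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

trials = 100_000
for switch in (False, True):
    wins = sum(play(switch) for _ in range(trials))
    print(f"switch={switch}: win rate ~ {wins / trials:.3f}")
# Expected output: ~0.333 when sticking, ~0.667 when switching.
```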

Even Sir David Spiegelhalter, professor of the public understanding of risk at the University of Cambridge, admits to mistrusting his intuition when it comes to probability. “The only gut feeling I have about probability is not to trust my gut feelings, and so whenever someone throws a problem at me I need to excuse myself, sit quietly muttering to myself for a while, and finally return with what may or may not be the right answer,” Spiegelhalter writes in guidance for maths students (nrich.maths.org/7326).

Psychologists have uncovered plenty of examples where our brains fail to weigh up probabilities correctly. The work of Nobel prize winner and bestselling author Daniel Kahneman, among others, has thrown light on the strange quirks of how we think and act – yielding countless examples of biases and mental shortcuts that produce questionable decisions. For instance, we are more likely to notice and believe information if it confirms our existing beliefs (confirmation bias). We consistently assign too much weight to the first piece of evidence we encounter on a subject (anchoring). We overestimate the likelihood of events that are easy to remember (the availability heuristic) – which means the more unusual something is, the more likely our brains think it is to happen.

These weird mental habits are a world away from what you would expect of a Bayesian brain. At the very least, they suggest that our “prior beliefs” are hopelessly skewed.

“There's considerable evidence that most people are dismally non‐Bayesian when performing reasoning,” says Robert Matthews of Aston University, Birmingham, and author of Chancing It, about the challenges of probabilistic reasoning. “For example, people typically ignore base‐rate effects and overlook the need to know both false positive and false negative rates when assessing predictive or diagnostic tests.”

In the case of testing for diseases, Bayes’ theorem reveals how even a test that is said to be 99% accurate might be wrong half of the time when telling people they have a rare condition – because its prevalence is so low that even this supposedly high accuracy figure still leads to just as many false positives as true positives. It is a counter‐intuitive result that would elude most people not equipped with a calculator and a working knowledge of Bayes’ theorem (see box below).


As for base rates, Bayes’ theorem tells us to take these into account as “prior beliefs” – a crucial step that our mortal brains routinely overlook when judging probabilities on the fly.

All in all, that is quite a bit of evidence in favour of the argument that our brains are non‐Bayesian. But do not forget that we are dealing with the most complicated thing in the known universe, and these fascinating quirks and imperfections do not give a complete picture of how we think.

Eric Mandelbaum, a philosopher and cognitive scientist at the City University of New York's Baruch College, says this kind of irrationality “is most striking because it arises against a backdrop of our extreme competence. For every heuristics‐and‐biases study that shows that we, for instance, cannot update base rates correctly, one can find instances where people do update correctly.”

Tom Griffiths recently co‐wrote the book Algorithms to Live By with Brian Christian. The authors agree that humans need to give themselves more credit for how well they make decisions. They write: “Over the past decade or two, behavioural economics has told a very particular story about human beings: that we are irrational and error‐prone, owing in large part to the buggy, idiosyncratic hardware of the brain. This self‐deprecating story has become increasingly familiar, but certain questions remain vexing. Why are four‐year‐olds, for instance, still better than million‐dollar supercomputers at a host of cognitive tasks, including vision, language, and causal reasoning?”4

So while our well‐documented flaws may shed light on the limits of our capacity for probabilistic analysis, we should not write off the brain's statistical abilities just yet. Perhaps what our failings really reveal, Griffiths and Christian suggest, is that life is full of really hard problems, which our brains must try and solve in a state of uncertainty and constant change, with scant information and no time.

Bayesian models of cognition remain a hotly debated area. Critics complain of too much post hoc rationalisation, with researchers tweaking their models, priors, and assumptions to make almost any results fit a probabilistic interpretation. They warn of the Bayesian brain theory becoming a one‐size‐fits‐all explanation for human cognition.

In a review of past studies on Bayesian cognition, Gary Marcus and Ernest Davis of New York University say: “If the probabilistic approach is to make a lasting contribution to researchers’ understanding of the mind, beyond merely flagging the obvious facts that people are sensitive to probabilities and adjust their beliefs (sometimes) in light of evidence, its practitioners must face apparently conflicting data with considerably more rigour. They must also reach a consensus on how models will be chosen, and stick to that consensus consistently.”5

Mandelbaum also has doubts about how far the Bayesian brain theory can go when it comes to higher‐level cognition. For a theory that purports to deal with updating beliefs, it struggles to explain the vagaries of belief acquisition, for instance. “Good objections to our views leave us unmoved,” says Mandelbaum. “Excellent objections cause us to be even more convinced of our views. It is the rare egoless soul who finds arguments persuasive for domains they care deeply about. It is somewhat ironic that the area for which Bayesianism seems most well suited – belief updating – is the area where Bayesianism has the most problems.”

Stepping back from studies of the brain and looking at the bigger picture, some would take the fact that humanity is still here as evidence that something Bayesian is going on in our heads. After all, Bayesian decision‐making is essentially about combining existing knowledge and new evidence in an optimal fashion. Or, as the renowned French mathematician Pierre‐Simon Laplace put it, it is “common sense expressed in numbers”. Wagenmakers says: “It's difficult to believe that any organism isn't at least approximately Bayesian in its environment, given its limits. Even if people use a heuristic that's not Bayesian, you could argue it's Bayesian because of the utility argument: the effort you have to put in is easy. Many decisions need to be fast decisions, so you have to make shortcuts sometimes. The shortcut itself is sub‐optimal and non‐Bayesian – but the fact you're taking the shortcut every time means huge savings in effort and energy. This may boil down to a more successful organism that's more Bayesian than its competitor.”

On the other hand, having a Bayesian brain can still lead us into trouble. In a 2005 article in Significance, Robert Matthews explained how Bayesian inference is powerless in the face of strongly held irrational beliefs such as conspiracy theories and psychiatric delusions – because people's trust in new evidence is so low.6 In this way, the mechanism by which a Bayesian brain updates beliefs can become corrupted, leading to false beliefs becoming reinforced by fresh evidence, rather than corrected.

To be fair, not even the most fervent supporter of the Bayesian brain hypothesis would claim that the brain is a perfect Bayesian inference engine. But we could reasonably describe a brain as “Bayesian” if it were “approximately optimal given its fundamental limitations,” says Wagenmakers. “For example, people do not have infinite memory or infinite attentional capacity.”

Another crucial caveat, Wagenmakers says, is that the Bayesian aspects of our brains are rooted in the environment in which we operate. “It's approximately Bayesian given the hardware limitations and given that we operate in this particular world. In a different world we might fail completely. We would not be able to adjust.”

In their book, Christian and Griffiths argue that much of what the brain does is about making trade‐offs in difficult situations – rather than aiming for perfection, the challenge becomes “how to manage finite space, finite time, limited attention, unknown unknowns, incomplete information, and an unforeseeable future; how to do so with grace and confidence; and how to do so in a community with others who are all simultaneously trying to do the same.”4

Clearly the Bayesian brain debate is far from settled. Some things our brains do seem to be Bayesian, others are miles off the mark. And there is much more to learn about what is really happening behind the scenes.

In any case, if our Bayesian powers are limited by the size of our brains, the time available, and all the other constraints of the environment in which we live (which differ from person to person and from moment to moment), then judging whether what is going on is Bayesian quickly becomes impossible, says Wagenmakers. “At some point it becomes a philosophical discussion whether you would call that Bayesian or not.”

Even for those who are not convinced by the Bayesian brain hypothesis, the concept is proving a useful starting point for research. Mandelbaum says: “At this point, it's more an idea guiding a research programme than a specific hypothesis about how cognition works.”

Bayesian brain theories are used as part of rational analysis, which involves developing models of cognition based on a starting assumption of rationality, seeing whether they work, then reviewing them. Tom Griffiths says: “It turns out using this approach for making models of cognition works quite well. Even though there are ways that we deviate from Bayesian inference, the basic ideas behind it are right.”

So the idea of the Bayesian brain is a gift to researchers because it “gives us a tool for understanding what the brain should be doing,” says Griffiths. And for a lot of what the brain does, Bayesian inference remains “the best game in town in terms of a characterisation of what that ideal solution looks like”.

References
  1. Blakemore, S. J., Wolpert, D. and Frith, C. (2000) Why can't you tickle yourself? NeuroReport, 11(11), R11–R16.
  2. Téglás, E., Vul, E., Girotto, V., Gonzalez, M., Tenenbaum, J. B. and Bonatti, L. L. (2011) Pure reasoning in 12‐month‐old infants as probabilistic inference. Science, 332(6033), 1054–1059.
  3. Griffiths, T. L. and Tenenbaum, J. B. (2006) Optimal predictions in everyday cognition. Psychological Science, 17(9), 767–773.
  4. Christian, B. and Griffiths, T. (2016) Algorithms to Live By. London: William Collins.
  5. Marcus, G. F. and Davis, E. (2013) How robust are probabilistic models of higher‐level cognition? Psychological Science, 24(12), 2351–2360.
  6. Matthews, R. (2005) Why do people believe weird things? Significance, 2(4), 182–185.

How is it that a diagnostic test that claims to be 99% accurate can still give a wrong diagnosis 50% of the time? In testing for a rare condition, we scan 10 000 people. Only 1% (100 people) have the condition; 9900 do not. Of the 100 people who do have the disease, a 99% accurate test will detect 99 of the true cases, leaving one false negative. But a 99% accurate test will also produce false positives at the rate of 1%. So, of the 9900 people who do not have the condition, 1% (99 people) will be told erroneously that they do have it. The total number of positive tests is therefore 198, of which only half are genuine. Thus the probability that a positive test result from this “99% accurate” test is a true positive is only 50%.
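The same arithmetic follows directly from Bayes’ theorem. Here is a minimal sketch in Python using the box's illustrative numbers, with “99% accurate” read as 99% sensitivity and 99% specificity:

```python
# A sketch of the box's worked example via Bayes' theorem, assuming
# 1% prevalence and a test with 99% sensitivity and 99% specificity.
prevalence = 0.01
sensitivity = 0.99   # P(positive | condition)
specificity = 0.99   # P(negative | no condition)

# Total probability of a positive test: true positives + false positives.
p_positive = (sensitivity * prevalence
              + (1 - specificity) * (1 - prevalence))

# Bayes' theorem: P(condition | positive test).
ppv = sensitivity * prevalence / p_positive
print(f"P(condition | positive test) = {ppv:.2f}")  # 0.50
```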