Posted on July 21, 2016
In his bestselling book, Fooled by Randomness (which was named one of the smartest books of all time by Fortune magazine), Nassim Nicholas Taleb discusses a question that was posed to a group of medical doctors.
A test of a disease presents a rate of 5% false positives. The disease strikes 1/1000 of the population. People are tested at random, regardless of whether they are suspected of having the disease. A patient’s test is positive. What is the probability of the patient being stricken with the disease?
Most doctors answered 95%. Fewer than one in five professionals answered the question correctly.
To solve Taleb’s question, I will break the problem down into steps using a table. Tables like this help with a range of similar probability questions.
For questions like this, it helps to draw a 2×2 table of the kind used to appraise diagnostic tests.
Key points in the 2×2 table are:
- True disease is always on the top
- Test is on the left
- If comparing with a gold standard test, then gold standard is always on the top
WHAT DOES THE BOX LOOK LIKE FOR THE PROBLEM?
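We will now fill in the boxes for a population of 1000 people:
- Total number of people with the disease is 1 (the disease strikes 1/1000 of the population)
- The remaining 999 people do not have the disease
- Assuming the test misses no cases (no false negatives), True Positive (TP) = 1 and False Negative (FN) = 0
- False Positives (FP) = 0.05 × 999 = 49.95 ≈ 50 (5% false positives)
- True Negatives (TN) = 999 - FP = 949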
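The steps above can be sketched in a few lines of Python. This is only an illustration: the function name `two_by_two` and its parameters are hypothetical, and the default `sensitivity=1.0` encodes the assumption that the test produces no false negatives.

```python
def two_by_two(population, prevalence, false_positive_rate, sensitivity=1.0):
    """Return expected (TP, FN, FP, TN) counts for a screened population.

    sensitivity=1.0 is an assumption: the test is taken to miss no cases.
    """
    diseased = population * prevalence       # people who truly have the disease
    healthy = population - diseased          # people who truly do not
    tp = diseased * sensitivity              # diseased and test positive
    fn = diseased - tp                       # diseased but test negative
    fp = healthy * false_positive_rate       # healthy but test positive
    tn = healthy - fp                        # healthy and test negative
    return tp, fn, fp, tn


tp, fn, fp, tn = two_by_two(population=1000, prevalence=1 / 1000,
                            false_positive_rate=0.05)
print(f"TP={tp:.2f} FN={fn:.2f} FP={fp:.2f} TN={tn:.2f}")
```

Without rounding, the expected counts are 49.95 false positives and 949.05 true negatives, which correspond to the whole-person figures of 50 and 949 used in the table.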
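The box now looks like this:

| | Disease present | Disease absent | Total |
|---|---|---|---|
| Test positive | TP = 1 | FP = 50 | 51 |
| Test negative | FN = 0 | TN = 949 | 949 |
| Total | 1 | 999 | 1000 |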
DETERMINING THE PROBABILITY
To determine the probability that the patient has the disease, we need to calculate the positive predictive value (PPV), which is different from sensitivity or specificity.
PPV = TP / (TP + FP) = 1 / (1 + 50) = 1/51 ≈ 2%
Therefore, there is about a 2% probability that the patient has the disease.
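The same result can be reached directly from the rates, without picking a population size. The sketch below is an illustration, not a clinical tool: the function name `positive_predictive_value` is hypothetical, a sensitivity of 1.0 again reflects the assumption of no false negatives, and the specificity of 0.95 is one minus the 5% false positive rate.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive test (Bayes' theorem)."""
    true_pos = prevalence * sensitivity               # P(disease and positive)
    false_pos = (1 - prevalence) * (1 - specificity)  # P(healthy and positive)
    return true_pos / (true_pos + false_pos)


# Assumed inputs: prevalence 1/1000, no false negatives, 5% false positives.
ppv = positive_predictive_value(prevalence=0.001, sensitivity=1.0,
                                specificity=0.95)
print(f"PPV = {ppv:.4f}")  # PPV = 0.0196
```

The unrounded value is slightly below 1/51 because it uses 49.95 expected false positives rather than 50. Either way, the answer is about 2%.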
TALEB'S DESCRIPTION OF SOLUTION
I will simplify the answer using the frequency approach. Assume no false negatives. Consider that out of 1000 patients who are administered the test, one will be expected to be afflicted with the disease.
Out of a population of the remaining 999 healthy patients, the test will identify about 50 with the disease (it is 95% accurate).
The correct answer should be that the probability of being afflicted with disease for someone selected at random who presented a positive test is the following ratio: Number of afflicted persons / number of true positives and false positives. Here, 1 in 51 (2%).
Why is the probability so low, even though the test has only 5% false positives?
The answer lies in the low prevalence of the disease. When the outcome being tested for is rare, the positive predictive value of any instrument drops, because false positives among the many healthy people outnumber true positives among the few who are ill.
Think about applying this principle to suicide prediction, where the rate of suicide is even lower. I will highlight the challenges of suicide prediction in another post.
TAKE HOME MESSAGES
- Sensitivity and specificity are properties of the test and do not change with prevalence. For example, the test in Taleb’s question has a specificity (true negative rate) of 95%
- Positive and negative predictive values change with prevalence
- PPV and NPV are more important for the patient, as they give the probability of having or not having the disease given the test result
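To see how the predictive values move with prevalence while the test itself stays the same, here is a small sketch. The function name `predictive_values` is hypothetical, and the sensitivity of 0.95 is an assumed value chosen only so that the NPV also varies. Taleb's question does not state a sensitivity.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = prevalence * sensitivity               # diseased, test positive
    fn = prevalence * (1 - sensitivity)         # diseased, test negative
    fp = (1 - prevalence) * (1 - specificity)   # healthy, test positive
    tn = (1 - prevalence) * specificity         # healthy, test negative
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Same test (assumed sensitivity 0.95, specificity 0.95), different prevalences.
for prevalence in (0.001, 0.01, 0.1, 0.5):
    ppv, npv = predictive_values(prevalence, sensitivity=0.95, specificity=0.95)
    print(f"prevalence={prevalence:<6} PPV={ppv:.3f} NPV={npv:.3f}")
```

As prevalence rises, the PPV climbs and the NPV falls, even though sensitivity and specificity are held fixed throughout.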
REFERENCE
Taleb, N. (2005). Fooled by randomness: The hidden role of chance in life and in the markets (Vol. 1). Random House Incorporated.