Last weekend, The New York Times looked at the controversy around the recent changes to the mammogram guidelines from a mathematical perspective. Unlike the analysis based on Bayes’ Theorem from the Harvard Social Science Statistics blog (which caused some controversy itself: that post was deleted and later replaced after errors apparently crept into the calculations), the article argues from a simple scenario with made-up (but plausible) numbers:

Assume there is a screening test for a certain cancer that is 95 percent accurate; that is, if someone has the cancer, the test will be positive 95 percent of the time. Let’s also assume that if someone doesn’t have the cancer, the test will be positive just 1 percent of the time. Assume further that 0.5 percent — one out of 200 people — actually have this type of cancer. Now imagine that you’ve taken the test and that your doctor somberly intones that you’ve tested positive. Does this mean you’re likely to have the cancer? Surprisingly, the answer is no.

To see why, let’s suppose 100,000 screenings for this cancer are conducted. Of these, how many are positive? On average, 500 of these 100,000 people (0.5 percent of 100,000) will have cancer, and so, since 95 percent of these 500 people will test positive, we will have, on average, 475 positive tests (.95 x 500). Of the 99,500 people without cancer, 1 percent will test positive for a total of 995 false-positive tests (.01 x 99,500 = 995). Thus of the total of 1,470 positive tests (995 + 475 = 1,470), most of them (995) will be false positives, and so the probability of having this cancer given that you tested positive for it is only 475/1,470, or about 32 percent! This is to be contrasted with the probability that you will test positive given that you have the cancer, which by assumption is 95 percent.
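The arithmetic in the quoted scenario is easy to check with a few lines of code. The following is a minimal Python sketch, not something from the article: the function name `screening_outcomes` and its parameter names are made up for illustration, and the inputs are the article's hypothetical numbers (0.5 percent prevalence, 95 percent detection rate, 1 percent false-positive rate, 100,000 screenings).

```python
def screening_outcomes(prevalence, sensitivity, false_positive_rate,
                       screenings=100_000):
    """Return (true positives, false positives, P(cancer | positive test)).

    Counts are expected values, following the article's made-up scenario.
    """
    with_cancer = screenings * prevalence          # people who have the cancer
    without_cancer = screenings - with_cancer      # people who do not

    true_positives = with_cancer * sensitivity                # correctly flagged
    false_positives = without_cancer * false_positive_rate    # wrongly flagged

    total_positives = true_positives + false_positives
    return true_positives, false_positives, true_positives / total_positives


tp, fp, p_cancer_given_positive = screening_outcomes(
    prevalence=0.005, sensitivity=0.95, false_positive_rate=0.01
)
print(f"true positives:  {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"P(cancer | positive) = {p_cancer_given_positive:.1%}")
```

Changing `prevalence` shows how strongly the result depends on how rare the condition is: the rarer the cancer, the larger the share of positive tests that are false positives.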

It’s a nice example of how our intuition about probabilities can often be out of step with reality.

New York Times: Mammogram Math

To **leave a comment** for the author, please follow the link and comment on their blog: **Revolutions**.

**Tags:** statistics