Innumeracy, Statistics and R

March 1, 2016

(This article was first published on Mad (Data) Scientist, and kindly contributed to R-bloggers)

A couple of years ago, when an NPR journalist was interviewing me, the conversation turned to quantitative matters. The reporter said, only half jokingly, “We journalists are innumerate and proud.” 🙂

Sometimes it shows, badly. This morning a radio reporter stated, “Hillary Clinton beat Bernie Sanders among South Carolina African-Americans by an almost 9-to-1 ratio.” Actually, that vote was 86% to 14%, just above 6-to-1, not 9-to-1. A very troubling outcome for Bernie, to be sure, but an even more troubling error in quantitative reasoning by someone in the Fourth Estate who should know better.
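The arithmetic behind that correction is easy to check. Here is a minimal sketch in R, using only the 86% and 14% figures quoted above; the variable names are chosen purely for illustration.

```r
# Vote shares quoted above (percent of the vote within this group)
clinton <- 86
sanders <- 14

# Ratio of the two shares: Clinton votes per Sanders vote
ratio <- clinton / sanders
round(ratio, 2)    # about 6.14, i.e. just above 6-to-1

# For comparison, the vote split that a true 9-to-1 ratio would imply
nine_to_one_share <- 100 * 9 / (9 + 1)
nine_to_one_share  # 90, i.e. a 90% to 10% vote
```

A 9-to-1 ratio would have required a 90% to 10% split. The reporter's figure looks like the 86% share loosely rounded up and then read as a ratio, though that is only a guess at how the error arose.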

One of my favorite quotes is from Chen Lixin, an engineering professor at Northwestern Polytechnical University in Xi’an, who has warned that China produces students who can’t think independently or creatively, and have trouble solving practical problems. He wrote in 1999 that the Chinese education system “results in the phenomenon of high scores and low ability.” I must warn that we in the U.S. are moving in that same disastrous direction.

But I would warn even more urgently that the solution is NOT to make math education “more practical,” as proposed recently by noted political scientist Andrew Hacker. His solution is that, instead of requiring Algebra II of high school kids, we should allow them to substitute — you know what’s coming, don’t you? — statistics.  Ah, yes. Well, I disagree.

Hacker’s rationale is explained by the article I’ve linked to above:

Most CUNY students come from low-income families, and a 2009 faculty report found that 57 percent fail the system’s required algebra course. A subsequent study showed that when students were allowed to take a statistics class instead, only 44 percent failed.

Most statistics courses are taught, sad to say, in a formula-plugging manner. So, aside from the dubious, not to mention insulting, message that the above passage sends about students from the lower class, the proposal is just plain wrong in terms of the putative goal of achieving numeracy. In my teaching experience, having students take so-called “practical” courses will not avoid producing innumerate people who come up with things like the pathetic “9-to-1” statistic I referred to earlier in this article.

What does achieve the numeracy goal much better, in my opinion, is intensive hands-on experience, and current high-school statistics courses are NOT taught in that manner at all. They use handheld calculators, which are quite expensive — somehow that doesn’t seem to bother those who are otherwise concerned about students from financially strapped families — and which are pedagogical disasters.

My solution has been to use R as the computational vehicle in statistics courses, including at the high school level. Our real goal is to develop in kids an intuitive feel for numbers, how they work, what they are useful for and so on. Most current stat courses fail to do that, and as we know, actually dull the senses. We should have the students actively explore data sets, both with formal statistical analyses and with graphical description.
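A minimal sketch of what one such exploratory session might look like follows. The choice of the built-in faithful data set and the particular steps are illustrative assumptions, not taken from any actual course.

```r
# 'faithful' ships with base R: eruption durations and waiting times
# (both in minutes) for the Old Faithful geyser
data(faithful)

# Numerical summary: get a feel for the range and center of each variable
summary(faithful)

# Graphical description: the histogram shows two distinct clusters of
# waiting times, something a single mean would hide
hist(faithful$waiting, breaks = 20,
     main = "Waiting time between eruptions",
     xlab = "Minutes")

# Formal analysis: a 95% confidence interval for the mean waiting time
ci <- t.test(faithful$waiting)$conf.int
ci

# Relationship between the two variables, with a fitted regression line
plot(waiting ~ eruptions, data = faithful)
fit <- lm(waiting ~ eruptions, data = faithful)
abline(fit)
coef(fit)
```

The value of an exercise like this lies less in the commands than in the questions it prompts: why are there two clusters, and what does a confidence interval for the mean tell us when the data look like that?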

But in no way should it be “easy.” It should challenge the students, get them to think, in fact to THINK HARD. I strongly disagree with the notion that some kids are “incapable” of this, though of course it is easier to achieve with kids from stronger backgrounds.

I agree with the spouse of the author of the article, whose point is that Algebra II — and even more so, Geometry, if properly taught — develops analytical abilities in students. Isn’t that the whole point?

Finally, the formal aspects — the classical statistical inference procedures — DO matter. Data rummaging with R is great, but it should not replace formal concepts such as sampling, confidence intervals and so on. I was quite troubled by this statement by a professor who seems otherwise to be doing great things with R:

Creating a student who is capable of performing coherent statistical analysis in a single semester course is challenging. We [in the profession of teaching statistics] spend a fair amount of time discussing topics that may not be as useful as they once were (e.g., t-tests, inference for a single proportion, chi-squared tests) and not enough time building skills students are likely to use in their future research (e.g., a deeper understanding of regression, logistic regression, data visualization, and data wrangling skills).

It does NOT have to be either/or.

The innumeracy problem is quite pressing. We might even say we are in a crisis. But let’s take care to find solutions that really do solve the problem.






