
A modern introduction to probability and statistics [book review]

[This article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers.]

In the plane to Bengaluru, I read through the book A modern introduction to probability and statistics, by Graham Upton (whose Measuring Animal Abundance I reviewed for CHANCE a while ago), which is based on the earlier Understanding Statistics, written jointly with Ian Cook. (Not to be confused with A modern introduction to probability and statistics by Dekking et al.) The subtitle is understanding statistical principles in the computer age. Sorry, in the age of the computer. While the cover is most pleasant (and modern), as noticed by an AF flight attendant, the contents are very, very standard and could have been written decades ago, since the main concession to “the” computer age is the inclusion of a few R commands at the end of most chapters. There are even a few distribution tables here and there (in case “the” computer is not available). But there is no other connection with computational statistics or statistical computing.

The classicism of the contents and the intended audience mean there is little therein to object to or criticise. The mixture of elementary probability and basic statistics in a single textbook always feels awkward to me and I think I would have trouble teaching solely from this material. Setting aside the glaring typo on the variance of the sum of two correlated random variables on page 87, missing the factor 2 in front of the covariance while correct(ed) on p97 (and the inevitable “the the” typo, spotted once), my main criticisms are of the potential confusion between samples and populations in the early chapters, when some statistics are used as motivational examples, as for instance in a (hidden) Monte Carlo stabilisation to the limiting values (p57), way before the Law of Large Numbers is introduced; the variable mileage in mathematical rigour (while being uncertain that first year students can handle integrals and derivatives); the textbook examples; and the amount of the book spent on descriptive statistics and even more on the “classical” tests, with no critical perspective on using point nulls or p-values. The book concludes with a four page (benevolent) chapter on Bayesian statistics that is superfluous imho, or even counterproductive, since in my experience a rushed introduction to Bayesian principles almost always results in a rejection of said principles. Plus, the illustration with coin tossing is not particularly helpful, since Andrew maintains that one can load a die but cannot bias a coin. (A similar reservation applies to the half-page coverage (p289) of pseudo-random generation and Monte Carlo principles for computing p-values.)
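For the record, the identity that the p87 typo garbles by dropping the factor 2 reads

$$\operatorname{var}(X+Y)=\operatorname{var}(X)+\operatorname{var}(Y)+2\,\operatorname{cov}(X,Y).$$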

Minor (mostly idiosyncratic) remarks follow: the CLT appearing prior to the LLN; the n−1 in the sample standard deviation; little to no model criticism (not to be confused with goodness of fit); a missed opportunity when mentioning the varying probability of a day being a birthday (p31), in contrast with the BDA cover story, and another missed opportunity, around the LLN, to cite the 2024 Ig Nobel Prize for coin tossing; an unclear definition of random variables (p53) and a potentially confusing introduction of Poisson distributions through an informal reference to Poisson processes (with no reason why the years of accession of the kings of Wessex and England till Guillaume, who returns on p178 with the Domesday Book, in 1066 should follow such a process, as suggested in Figure 3.5); a surprising definition of the constant e as the special case of exp(x) when x=1, via its series expansion (p70); omitting proofs of the laws of sums of iid rvs by introducing moment generating functions rather late; another obscure reference to a 16th century German treatise on surveying as a precursor of the CLT (p131); a proof for the normalising constant of the Normal density that will most likely escape most first year students; an introduction of the t, F, and χ² distributions with no mention of their respective densities (pp141-147); never defining the joint Normal density; insisting on unbiasedness without noting that maximum likelihood estimators (with the strange motivation that maximum likelihood “makes the next sample of n observations most likely to resemble the data in the current sample”, p228) are almost always biased; and an abundance of footnotes that may prove of little interest to the youngest readers.
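As an aside, the kind of (hidden) Monte Carlo stabilisation the book displays (p57) takes but a few lines of R; a minimal sketch of my own (not the book's code), assuming a fair coin:

    # running proportion of heads in n fair coin tosses,
    # stabilising towards the limiting value 1/2 (LLN)
    set.seed(101)                         # for reproducibility
    n <- 1e4
    tosses <- rbinom(n, size = 1, prob = 0.5)
    plot(cumsum(tosses) / (1:n), type = "l", log = "x",
         xlab = "number of tosses", ylab = "proportion of heads")
    abline(h = 0.5, lty = 2)              # limiting value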

[Disclaimer about potential self-plagiarism as usual: this post or an edited version will eventually appear in my Books Review section in CHANCE.]

