principles of uncertainty

Posted on October 13, 2011 by xi'an in R bloggers

[This article was first published on Xi'an's Og » R, and kindly contributed to R-bloggers].

“Bayes Theorem is a simple consequence of the axioms of probability, and is therefore accepted by all as valid. However, some who challenge the use of personal probability reject certain applications of Bayes Theorem.” J. Kadane, p.44

Principles of uncertainty by Joseph (“Jay”) Kadane (Carnegie Mellon University, Pittsburgh) is a profound and mesmerising book on the foundations and principles of subjectivist or behaviouristic Bayesian analysis. Jay Kadane wrote Principles of uncertainty over a period of several years and, more or less in his own words, it represents the legacy he wants to leave for the future. The book starts with a large section on Jay’s definition of a probability model, with rigorous mathematical derivations all the way to Lebesgue measure (or more exactly the McShane-Stieltjes measure). This section contains many side derivations that pertain to mathematical analysis, in order to explain the subtleties of countably infinite and uncountable sets, and the distinction between finitely additive and countably additive (probability) measures. Unsurprisingly, the role of utility is emphasized in this book, which keeps stressing the personalistic entry to Bayesian statistics. Principles of uncertainty also contains a formal development on the validity of Markov chain Monte Carlo methods that is superb and missing from most comparable textbooks. Overall, the book is a pleasure to read. It is also highly recommended for teaching, as it can be used at many different levels.

“My desire to avoid the phrase “it can be shown that” has led me to display more of the mathematical underpinnings of the subject than necessary.” J. Kadane, p.xxv

Indeed, Principles of uncertainty is (almost) self-contained from a mathematical point of view. Probability is defined from a betting perspective (no stabilisation of frequencies à la von Mises!). Limits, series, uncountable sets, Riemann integrals (whose simultaneous use with and without integration domain confused me for a while), Stieltjes integrals, Fatou’s lemma, Lebesgue’s dominated convergence theorem, matrix algebra, Euler’s formula, the Borel-Kolmogorov paradox, Taylor expansions (I dislike the use of HOT for “higher order terms” in math formulas!), Laplace’s approximation, the Weierstrass approximation, all are covered in reasonable detail within the 500 pages of the book. (I am not sure I agree with the discussion about the uniform distribution on the integers, in Section 3.2!) All standard distributions are covered and justified (incl. the Wishart distribution). Paradoxes like Simpson’s, Monty Hall’s, the Gambler’s Ruin, Allais’, and the Prisoner’s dilemma are treated in specific sections. As written above, the treatment of the convergence of MCMC algorithms is quite nice and rather unique: the argument is based on a minorisation constraint (existence of a small set) and the use of the corresponding renewal process of Nummelin (1984), which, in my opinion, is a beautiful way of explaining most convergence properties of Markov chains. While the R code sprinkled throughout the book may appear superficial, I think it relates to the same goal of Jay Kadane to leave no step unjustified and hence to back graphs with the corresponding R code. The style is as personalistic as the message and very enjoyable, with little stories at the entry of some chapters to make a point. As I read the book within a few days surrounding my trip to Zürich, I cannot be certain I did not miss typos, but I saw very few. (A change of line within the first displayed formula of page 87 is rather surprising and, I think, unintentional. Some pages like p.215, p.230 or pp. 324-326 also end up with several highly isolated formulas because of long displayed equations on the following page. A Sigma instead of a \sum on p. 302. Huge spaces after “=” signs on p. 322 and pp. 363-364. Nothing LaTeX cannot fix.)
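To make the minorisation argument mentioned above concrete, here is a minimal sketch in Python (standard library only) for a finite-state chain. It is not taken from the book: the 3-state matrix `P` and the helper names `step`, `tv` and `worst_tv` are hypothetical, and the sketch assumes the simplest case where the whole state space is a small set, so that P(x, ·) ≥ ε ν(·) for every state x. Under that assumption the renewal (coupling) argument gives a total variation distance to stationarity of at most (1 − ε)^n after n steps, which the code checks numerically.

```python
# Hypothetical 3-state transition matrix (rows sum to 1); not from the book.
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
]
K = len(P)


def step(dist, P):
    """One step of the chain: row vector `dist` times transition matrix P."""
    return [sum(dist[i] * P[i][j] for i in range(K)) for j in range(K)]


def tv(p, q):
    """Total variation distance between two distributions on the same states."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Minorisation constant: P(x, j) >= min_i P(i, j) for every x, hence
# P(x, .) >= eps * nu(.), with eps the sum of the column minima.
eps = sum(min(row[j] for row in P) for j in range(K))

# Stationary distribution approximated by power iteration.
pi = [1.0] + [0.0] * (K - 1)
for _ in range(500):
    pi = step(pi, P)


def worst_tv(n_steps):
    """Largest TV distance to pi after n_steps, over all starting states."""
    worst = 0.0
    for x in range(K):
        dist = [1.0 if i == x else 0.0 for i in range(K)]
        for _ in range(n_steps):
            dist = step(dist, P)
        worst = max(worst, tv(dist, pi))
    return worst


for n in range(1, 9):
    # The observed distance should never exceed the geometric bound.
    print(n, worst_tv(n), (1 - eps) ** n)
```

With probability ε the chain can be seen as regenerating from ν whatever its current state, which is where the geometric rate comes from; on general state spaces the same idea applies to returns to a small set rather than to every step.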

“A hierarchical model divides the parameters into groups that permit the imposition of assumptions of conditional independence.” J. Kadane, p.337

The hierarchical chapter of Principles of uncertainty is also well done, with connections to the James-Stein phenomenon and an inclusion of the famous New Jersey turnpike lawsuit. In the model choice section (pp.343-344), Jay Kadane comes the closest to defining a Bayesian test, even though he does not call it that. He will only return to tests in the final chapter (see below). The MCMC chapter that comes right after, while being highly enjoyable on the theoretical side, is missing an illustration of MCMC implementation and convergence (or lack thereof).

“A claim of possession of the objective truth has been a familiar rhetorical move of elites, social, religious, scientific, or economic. Such a claim is useful to intimidate those who might doubt, challenge, or debate the “objective” conclusions reached. History is replete with the unfortunate consequences, nay disasters, that have ensued. To assert the possession of an objective method of analyzing data is to make a claim of extraordinary power in our society. Of course it is annoyingly arrogant, but, much worse, it has no basis in the theory it purports to implement.” J. Kadane, p.446

The above quote is the concluding sentence of the last-but-one chapter (the last one being very short)… It reflects the opinion of the author in such a belligerent way that I fear this chapter does not belong in a general-audience book: the “Exploration of Old Ideas” chapter in Principles of uncertainty is too antagonistic to be understandable by neophytes, sounding rather like a backyard fight between small gangs with esoteric names and goals… I of course object to the few lines dismissing “objective” Bayes and Jeffreys’ heritage as ungrounded, but also to the too hasty treatment of Fisher’s and Neyman-Pearson’s “flavors of testing”. (The unfamiliarity of Jay with frequentist testing shows in the choice of the bound 2.36 for the one-sided test at level 0.05 on p.439, since it should be either 1.68 at level 0.05 or 2.36 at level 0.01…) While being unhappy with this chapter, I obviously consider it a very minor scratch on an otherwise superb and uniquely coherent book. (Esp. when considering the very final sentence “It is precisely to explain the reasons why I find certain methodologies appropriate, and others less so, that I undertook to write this book” (p.448).) A must-read for sure!
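The mismatch between bound and level pointed out above can be checked with a few lines of Python. This is a minimal sketch that assumes a standard normal test statistic, which is an assumption made here and not something stated about p.439; the function name `one_sided_critical_value` is hypothetical. The bounds quoted in the review (1.68 and 2.36) are slightly larger than the normal quantiles, which would be compatible with a Student t reference distribution, but the ordering is the same: a bound near 2.3 goes with level 0.01, not 0.05.

```python
from statistics import NormalDist


def one_sided_critical_value(alpha):
    """Upper critical value of a one-sided test at level alpha,
    assuming a standard normal test statistic (an assumption of this sketch)."""
    return NormalDist().inv_cdf(1 - alpha)


z_05 = one_sided_critical_value(0.05)
z_01 = one_sided_critical_value(0.01)

print(round(z_05, 3), round(z_01, 3))  # prints 1.645 2.326
```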