A misleading title…

Posted on September 4, 2011 by xi'an in R bloggers

[This article was first published on Xi'an's Og » R, and kindly contributed to R-bloggers].


When I received this book, Handbook of fitting statistical distributions with R, by Z. Karian and E.J. Dudewicz, from/for the Short Book Reviews section of the International Statistical Review, I was obviously impressed by its size (around 1700 pages and 3 kilos…). From briefly glancing at the table of contents, and the list of standard distributions appearing as subsections of the first chapters, I thought that the authors were covering different estimation/fitting techniques for most of the standard distributions. After taking a closer look at the book, I think the cover is misleading in several aspects: this is not a handbook (a.k.a. a reference book), it does not cover standard statistical distributions, the R input is marginal, and the authors only wrote part of the book, since about half of the chapters are written by other authors…

“The system we develop in this book has its origins in the one-parameter lambda distribution proposed by John Tukey.” Z.A. Karian & E.J. Dudewicz, p.3, Handbook of fitting statistical distributions with R

So I am glad I left Handbook of fitting statistical distributions with R in my office rather than dragging it along across the Caribbean! First, the book indeed does not aim at fitting standard distributions but instead at promoting a class of quantile distributions, the generalised lambda distributions (GLDs), whose quantile function is a location-scale transform of

$Q(y|\lambda_3,\lambda_4)=F_X^{-1}(y)=y^{\lambda_3}-(1-y)^{\lambda_4}$

(under the constraint on the parameters that the above function of y is non-decreasing) and that the authors have been advocating for a long while. There is nothing wrong per se with those quantile distributions, but neither is there a particular reason to prefer them over the standard parametric distributions! Overall, I am quite wary of one-size-fits-all distributions, especially when they only depend on four parameters and mix finite with infinite support distributions. The lack of natural motivation for the above is enough to make fitting with those distributions not particularly compelling. Karian and Dudewicz spend an awful lot of space on numerical experiments backing their argument that the generalised lambda distributions approximate reasonably well (in the L₁ and L₂ norms, as it does not work for stricter norms) “all standard” distributions, but they do not explain why the substitution would be of such capital interest. Furthermore, the estimation of the parameters (i.e. the fitting in fitting statistical distributions) is not straightforward. While the book presents the density of the generalised lambda distributions as available in closed form (Theorem 1.2.2), namely (omitting the location-scale parameters),

$f(x|\lambda_3,\lambda_4)=\dfrac{1}{\lambda_3F_X(x|\lambda_3,\lambda_4)^{\lambda_3-1}+\lambda_4\{1-F_X(x|\lambda_3,\lambda_4)\}^{\lambda_4-1}},$

it fails to point out that the cdf

$F_X(x|\lambda_3,\lambda_4)=Q^{-1}(x|\lambda_3,\lambda_4)$

itself is not available in closed form. Therefore, neither likelihood estimation nor Bayesian inference seems easily implementable for those distributions. (Actually, a mention is made of maximum likelihood estimators for the first four empirical moments in the second chapter, but it is alas mistaken.) [Obviously, given that quantile distributions are easy to simulate, ABC would be a manageable tool for handling Bayesian inference on GLDs…] The book focuses instead on moment and percentile estimators as the central estimation tools, with no clear message on which one to prefer (see, e.g., Section 5.5).
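To make the point about the missing closed-form cdf concrete, here is a minimal Python sketch, not taken from the book or from any GLD package. It assumes the standardised form above (location and scale omitted) and positive λ₃ and λ₄, so that Q is increasing and the support is [-1, 1]. The function names and the bisection tolerance are illustrative choices. Simulation only requires Q, while the density requires a numerical inversion of Q at every point.

```python
import random


def gld_quantile(u, lam3, lam4):
    """Standardised GLD quantile function Q(u) = u**lam3 - (1 - u)**lam4."""
    return u ** lam3 - (1.0 - u) ** lam4


def gld_cdf(x, lam3, lam4, tol=1e-12):
    """Numerical inverse of Q by bisection (F = Q^{-1} has no closed form).

    Assumption: lam3 > 0 and lam4 > 0, so Q increases from -1 to 1 on [0, 1].
    """
    lo, hi = 0.0, 1.0
    if x <= gld_quantile(lo, lam3, lam4):
        return 0.0
    if x >= gld_quantile(hi, lam3, lam4):
        return 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gld_quantile(mid, lam3, lam4) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gld_density(x, lam3, lam4):
    """Density f(x) = 1 / Q'(F(x)); each evaluation needs one inversion of Q."""
    u = gld_cdf(x, lam3, lam4)
    if u <= 0.0 or u >= 1.0:
        return 0.0  # x lies outside the (bounded) support
    return 1.0 / (lam3 * u ** (lam3 - 1.0) + lam4 * (1.0 - u) ** (lam4 - 1.0))


def gld_sample(n, lam3, lam4, rng):
    """Inverse-transform sampling: push uniform draws through Q."""
    return [gld_quantile(rng.random(), lam3, lam4) for _ in range(n)]


rng = random.Random(2011)
draws = gld_sample(1000, 0.5, 0.5, rng)
print(min(draws), max(draws))        # all draws stay within [-1, 1]
print(gld_density(0.0, 0.5, 0.5))    # density at the centre of symmetry
```

With λ₃ = λ₄ = 1 the quantile function reduces to 2y − 1, i.e. the uniform distribution on [-1, 1], which gives a simple sanity check on the inversion.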

A chapter (by Su) covers the case of mixtures of GLDs, whose appeal is similarly lost on me. My major issue with using such distributions in a mixture setting is that some components may have a finite support, which makes the use of score equations awkward and of Kullback-Leibler divergences to normal mixtures fraught with danger (since those divergences may then be infinite). The estimation method switches to maximum likelihood estimation, as presumably the moment method gets too ungainly. However, I fail to see how maximum likelihood is implemented: I checked the original paper by Su (2007), documenting the related GLDEX R function, but the approach is very approximate in that the true percentiles are replaced with plug-in (and fixed, i.e. non-iterative) values (again omitting the location-scale parameters)

$\hat u_i=F(x_i|\hat\lambda_3,\hat\lambda_4)\qquad i=1,...,n$

in the likelihood function

$\prod_{i=1}^n \dfrac{1}{\lambda_3\hat u_i^{\lambda_3-1}+\lambda_4\{1-\hat u_i\}^{\lambda_4-1}}$
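The difference between this plug-in likelihood and the genuine one can be sketched in a few lines of Python. This only illustrates the two formulas as written above; it is not the GLDEX implementation, and the function names, the toy data, and the choice of (1, 1) as the preliminary fit are all assumptions made for the example. It also assumes positive λ₃ and λ₄ and omits location and scale. The plug-in version keeps the û_i fixed, while the exact version recomputes F(x_i | λ₃, λ₄) at every parameter value.

```python
import math


def gld_quantile(u, lam3, lam4):
    """Standardised GLD quantile function Q(u) = u**lam3 - (1 - u)**lam4."""
    return u ** lam3 - (1.0 - u) ** lam4


def gld_cdf(x, lam3, lam4, tol=1e-12):
    """Bisection inverse of Q, assuming lam3 > 0 and lam4 > 0."""
    lo, hi = 0.0, 1.0
    if x <= gld_quantile(lo, lam3, lam4):
        return 0.0
    if x >= gld_quantile(hi, lam3, lam4):
        return 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gld_quantile(mid, lam3, lam4) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def log_lik_given_u(lam3, lam4, u_values):
    """Sum of -log(lam3*u**(lam3-1) + lam4*(1-u)**(lam4-1)) over the u values."""
    total = 0.0
    for u in u_values:
        if u <= 0.0 or u >= 1.0:
            return -math.inf  # observation outside the support
        total -= math.log(lam3 * u ** (lam3 - 1.0) + lam4 * (1.0 - u) ** (lam4 - 1.0))
    return total


def plugin_log_lik(lam3, lam4, u_hat):
    """Plug-in version: the u_hat values are fixed once and never updated."""
    return log_lik_given_u(lam3, lam4, u_hat)


def exact_log_lik(lam3, lam4, data):
    """Exact version: u_i = F(x_i | lam3, lam4) is recomputed for each (lam3, lam4)."""
    return log_lik_given_u(lam3, lam4, [gld_cdf(x, lam3, lam4) for x in data])


data = [-0.5, 0.0, 0.5]                          # toy data, for illustration only
u_hat = [gld_cdf(x, 1.0, 1.0) for x in data]     # plug-in values from a preliminary fit (1, 1)
plugin = plugin_log_lik(0.5, 0.5, u_hat)
exact = exact_log_lik(0.5, 0.5, data)
print(plugin, exact)
```

The two functions agree at the preliminary fit by construction and drift apart as soon as (λ₃, λ₄) moves away from it, which is the sense in which the plug-in criterion is only an approximation of the likelihood.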

A further chapter is dedicated to the generalised beta distribution, which simply is a location-scale transform of the regular beta distribution (even though it is called the extended GLD for no discernible reason). Again, I have nothing for or against this family (except maybe that using a bounded support distribution to approximate infinite support distributions could induce potential drawbacks…). I simply cannot see the point in multiplying parametric families of distributions when there is no compelling reason to do so. (Which is also why, as an editor/associate editor/referee, I have always been ultra-conservative vis-à-vis papers introducing new families of distributions.)

The R side of the book (i.e. the R in fitting statistical distributions with R) is not particularly appealing either: in the first chapters, i.e. in the first hundred pages, the only reference to R is the name of the R functions found on the attached CD-ROM to fit GLDs by the method of moments or of percentiles… The first detailed code is found on pages 305-309, but it is unfortunately MATLAB code! (Same thing in several subsequent chapters.) Even though there is an R component to the book thanks to this CD-ROM, the authors could well be suspected of “surfing the R wave” of the Use R! and other “with R” collections. Indeed, my overall feeling is that they are mostly recycling their 2000 book Fitting statistical distributions into this R edition. (For instance, figures that are reproduced from the earlier book, incl. the cover, are not even produced with R. Most entries of the table of contents of Fitting statistical distributions are found in the table of contents of Handbook of fitting statistical distributions with R. The code was then written in Maple and some Maple code actually survives in the current version. Most of the novelty in this version is due to the inclusion of chapters written by additional authors.)

“It remains for a future research topic as to how to improve the generalized bootstrap to achieve a 95% confidence interval since 40% on average and 25%-55% still leaves room for improvement.” W. Cai & E.J. Dudewicz, p.852, Handbook of fitting statistical distributions with R

As in the 2000 edition, the “generalised bootstrap” method is presented as an improvement over the regular bootstrap, “fraught with danger of seriously inadequate results” (p.816), and as a means to provide confidence assessments. This method, attributed to the authors in 1991, is actually a parametric bootstrap used in the context of the GLDs, where samples are generated from the fitted distribution and estimates of the variability of estimators of interest are obtained by a sheer Monte Carlo evaluation! (A repeated criticism of the bootstrap is its “inability to draw samples outside the range of the original dataset” (e.g., p.852). It is somewhat ironic that the authors propose to use instead parameterised distributions whose support may be bounded.)
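For readers unfamiliar with the parametric bootstrap, the idea fits in a few lines of Python. This is a generic sketch under stated assumptions, not the authors' procedure: the fitted values of λ₃ and λ₄ are taken as given, the location-scale parameters are omitted, and the function name, the sample size, and the number of replicates are arbitrary choices for the example.

```python
import random
import statistics


def gld_quantile(u, lam3, lam4):
    """Standardised GLD quantile function Q(u) = u**lam3 - (1 - u)**lam4."""
    return u ** lam3 - (1.0 - u) ** lam4


def parametric_bootstrap(estimator, n, lam3, lam4, n_boot=500, seed=0):
    """Monte Carlo evaluation of the variability of `estimator`.

    Draws n_boot samples of size n from the fitted GLD (via its quantile
    function), applies the estimator to each, and returns the standard
    deviation of the replicates together with the replicates themselves.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        sample = [gld_quantile(rng.random(), lam3, lam4) for _ in range(n)]
        replicates.append(estimator(sample))
    return statistics.stdev(replicates), replicates


# Hypothetical fitted values (1, 1), i.e. the uniform distribution on [-1, 1]
se_mean, reps_mean = parametric_bootstrap(statistics.mean, n=50, lam3=1.0, lam4=1.0)
se_max, reps_max = parametric_bootstrap(max, n=50, lam3=1.0, lam4=1.0)
print(se_mean)          # Monte Carlo standard error of the sample mean
print(max(reps_max))    # simulated maxima never exceed the upper bound of the support
```

With positive λ₃ and λ₄ the simulated samples cannot leave [-1, 1] (before relocation and rescaling), so the fitted distribution imposes its own hard range on the bootstrap samples.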

Among the negative features of the book, I want to mention the price ($150!!!), the glaring [for statisticians!] absence of confidence statements about the (moment and percentile) estimations (not to be confused with goodness-of-fit), except in the much later chapter on generalised bootstrap, the fact that the book contains more than 250 pages of tables (yes, printed tables!), including a page with a few hundred random numbers generated from a given distribution, the fact that the additional authors who wrote the contributed chapters are not mentioned anywhere other than on the front page of those chapters (not even in the table of contents!), and [once more] the misleading use of the term handbook in the title, going by the way Wiktionary defines it

handbook (plural handbooks)

A topically organized book of reference on a certain field of knowledge, disregarding the size of it.

as it is not a “reference book”, nor a “topically organised book”: a newcomer opening Handbook of fitting statistical distributions with R cannot expect to find the section that would address her or his fitting problem, but has to read through the (first part of the) book in a linear way… So there is no redeeming angle there that could lead me to recommend Handbook of fitting statistical distributions with R as fitting any purpose. Save the trees!