the random variable that was always less than its mean…

May 29, 2016

(This article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers)

Although this is far from a paradox once one realises why the phenomenon occurs, it took me a few lines to understand why the empirical average of a log-normal sample is apparently a biased estimator of its mean, and why the biased plug-in estimator does not appear to present a bias. The picture below compares two estimators of the mean of a log-normal LN(0,σ²) distribution as σ² increases: blue stands for the empirical mean, while gold corresponds to the plug-in estimator exp(σ²/2) when σ² is estimated from the log-sample. (The sample is of size 10⁶.)
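The comparison can be re-run with a few lines of R. This is only a minimal sketch and not the code behind the picture: the grid of σ values, the seed, and the use of var() on the log-sample to estimate σ² are all assumptions made for illustration.

```r
# Sketch: two estimators of the LN(0, sigma^2) mean, exp(sigma^2 / 2).
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 1e6                         # sample size quoted in the post
sigmas <- seq(0.5, 5, by = 0.5)  # illustrative grid (assumption)

compare_estimators <- function(sigma, n) {
  logx <- rnorm(n, mean = 0, sd = sigma)  # log-sample, i.e. sigma * xi
  c(sigma     = sigma,
    truth     = exp(sigma^2 / 2),         # exact mean of LN(0, sigma^2)
    empirical = mean(exp(logx)),          # empirical mean of the sample
    plugin    = exp(var(logx) / 2))       # plug-in, sigma^2 estimated by var()
}

# One row per value of sigma, one column per quantity
res <- t(sapply(sigmas, compare_estimators, n = n))
print(signif(res, 4))
```

For small σ both columns should sit close to the truth; as σ grows, the plug-in column should keep tracking exp(σ²/2) while the empirical mean becomes erratic, since it is driven by a handful of extreme draws.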

The question came up on X validated and my first reaction was to doubt an implementation whose outcome was so counter-intuitive. But then I thought about the representation of a log-normal variate X as exp(σξ), where ξ is a standard Normal variate. When σ grows large enough, it is near impossible for σξ to be larger than σ². More precisely, with Φ denoting the standard Normal cdf,

P(X > E[X]) = P(σξ > σ²/2) = 1 − Φ(σ/2),

which can be arbitrarily small.
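To put numbers on this: X = exp(σξ) exceeds its mean exp(σ²/2) exactly when ξ > σ/2, an event of probability 1 − Φ(σ/2). The sketch below evaluates that tail probability and checks it by simulation; the σ values, the seed, and the helper name prob_above_mean are illustrative assumptions.

```r
# Sketch: probability that X = exp(sigma * xi) exceeds its mean exp(sigma^2 / 2).
prob_above_mean <- function(sigma) {
  pnorm(sigma / 2, lower.tail = FALSE)  # 1 - Phi(sigma / 2), computed stably
}

sigmas <- c(1, 2, 4, 8, 16)             # illustrative values (assumption)
exact <- prob_above_mean(sigmas)

# Monte Carlo check, comparing on the log scale to avoid large exponentials:
# exp(s * xi) > exp(s^2 / 2) is equivalent to s * xi > s^2 / 2.
set.seed(2)                             # arbitrary seed
xi <- rnorm(1e5)
simulated <- sapply(sigmas, function(s) mean(s * xi > s^2 / 2))

print(cbind(sigma = sigmas, exact = exact, simulated = simulated))
```

At σ = 16 the exact probability is already below 1e-15, so no realistic sample contains a single draw above the mean.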

