This post was inspired by some musings from John Bollinger: since data in the financial world isn't normally distributed, there might be a more robust computation to indicate skewness and kurtosis. For instance, one way to think about skewness is as the difference between the mean and the median. That is, if the mean is less than the median, the distribution is left-skewed, and vice versa.
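As a rough illustration of the mean-versus-median idea, the sketch below scales the difference by the standard deviation so that the sign indicates the direction of skew. The helper name `np_skew`, the seed, and the choice of exponential samples are assumptions made for this example, not part of Bollinger's comments. The rule is a heuristic and can fail for some distributions, so treat the sign as a hint rather than a guarantee.

```r
set.seed(42)  # arbitrary seed, an assumption made only for reproducibility

# Hypothetical helper: (mean - median) / sd, a simple robust-style skew indicator
np_skew <- function(x) (mean(x) - median(x)) / sd(x)

right_skewed <- rexp(10000)    # long right tail, so the mean sits above the median
left_skewed  <- -rexp(10000)   # mirror image, so the mean sits below the median

np_skew(right_skewed)  # positive for the right-skewed sample
np_skew(left_skewed)   # negative for the left-skewed sample
```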

This post attempts to extend that thinking to kurtosis. Just as skew can be thought of as a relationship between the mean and the median, kurtosis might be thought of as a relationship between two measures of spread: the standard deviation and the more robust interquartile range. To test this, I simulated 10,000 observations from a standard normal distribution and 10,000 observations from a standard double-exponential (Laplace) distribution.

Here’s the experiment I ran.

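set.seed(1234)
# 10,000 draws from a standard normal
norms <- rnorm(10000)
# 10,000 draws from a standard double-exponential: exponential magnitudes with random signs
dexps <- rexp(10000) * sign(rnorm(10000))
# Overlay the two estimated densities (normal in red)
plot(density(dexps))
lines(density(norms), col="red")
# Robust spread: interquartile range
(IQR(norms))
(IQR(dexps))
# Classical spread: standard deviation
(sd(norms))
(sd(dexps))
# Ratio of standard deviation to interquartile range
(sd(norms)/IQR(norms))
(sd(dexps)/IQR(dexps))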


And here’s the output:

> (IQR(norms))
[1] 1.330469
> (IQR(dexps))
[1] 1.35934
> (sd(norms))
[1] 0.9875294
> (sd(dexps))
[1] 1.393057
> (sd(norms)/IQR(norms))
[1] 0.7422415
> (sd(dexps)/IQR(dexps))
[1] 1.024804
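As a sanity check on the simulated ratios, the same quantities can be computed from the theoretical distributions. This is a small sketch, and the variable names are assumptions for illustration. It relies on the standard normal having a standard deviation of 1 and quartiles at `qnorm(0.25)` and `qnorm(0.75)`, and on the standard double-exponential having a standard deviation of `sqrt(2)` and quartiles at `-log(2)` and `log(2)`.

```r
# Standard normal: sd = 1, IQR = qnorm(0.75) - qnorm(0.25)
normal_ratio <- 1 / (qnorm(0.75) - qnorm(0.25))

# Standard double-exponential (scale 1): sd = sqrt(2), IQR = 2 * log(2)
dexp_ratio <- sqrt(2) / (2 * log(2))

round(c(normal = normal_ratio, dexp = dexp_ratio), 4)
```

The theoretical values come out at about 0.741 and 1.020, close to the simulated 0.742 and 1.025.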


That is, the double-exponential distribution, which has higher kurtosis than the standard normal, also has a higher ratio of standard deviation to interquartile range (about 1.02 versus 0.74). I'm not certain that this holds in all general cases, but it makes intuitive sense: with heavier tails, the same number of observations is more spread out, and extreme observations affect the standard deviation far more than they affect the interquartile range.
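One way to probe how far this generalizes is to repeat the experiment on a few more distributions with known kurtosis. The sketch below is only that: the helper name `sd_iqr`, the seed, and the choice of a uniform and a Student's t with 5 degrees of freedom are assumptions for this example, not part of the original experiment.

```r
set.seed(2024)  # arbitrary seed, an assumption made only for reproducibility
n <- 10000

# Hypothetical helper: ratio of standard deviation to interquartile range
sd_iqr <- function(x) sd(x) / IQR(x)

samples <- list(
  uniform = runif(n),                  # excess kurtosis -1.2 (lighter tails than normal)
  normal  = rnorm(n),                  # excess kurtosis 0
  dexp    = rexp(n) * sign(rnorm(n)),  # excess kurtosis 3
  t5      = rt(n, df = 5)              # excess kurtosis 6
)

ratios <- sapply(samples, sd_iqr)
round(ratios, 3)
```

The uniform should land below the normal and both heavy-tailed samples above it, which fits the intuition. The t with 5 degrees of freedom, however, has a theoretical ratio of roughly 0.89, below the double-exponential's 1.02 even though its kurtosis is higher. So the ratio seems to separate lighter from heavier tails relative to the normal, but it does not appear to rank distributions strictly by kurtosis.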