(This article was first published on **R – Statistical Odds & Ends**, and kindly contributed to R-bloggers)

This month’s issue of *Significance* magazine has a very nice summary article on the *sinh-arcsinh normal distribution*. (Unfortunately, the article seems to be behind a paywall.)

This distribution was first introduced by Chris Jones and Arthur Pewsey in 2009 as a generalization of the normal distribution. The normal distribution is symmetric, has light to moderate tails, and is defined by just two parameters ($\mu$ for location and $\sigma$ for scale); the sinh-arcsinh normal distribution adds two more parameters, which control asymmetry and tail weight.

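Given the 4 parameters $\mu$, $\sigma$, $\nu$ and $\tau$, the sinh-arcsinh normal distribution is defined as

$$X = \mu + \sigma \sinh\left[\frac{\sinh^{-1}(Z) + \nu}{\tau}\right],$$

where $Z \sim \mathcal{N}(0, 1)$, and $\sinh$ and $\sinh^{-1}$ are the hyperbolic sine function and its inverse.

- $\mu$ controls the location of the distribution (where it is “centered”),
- $\sigma$ controls the scale (the larger it is, the more spread out the distribution is),
- $\nu$ controls the asymmetry of the distribution (can be any real value; more positive means more right skew, more negative means more left skew), and
- $\tau$ controls tail weight (any positive real value; $\tau < 1$ means heavier tails than the normal distribution, $\tau > 1$ means lighter tails).

From the expression, we can also see that when $\nu = 0$ and $\tau = 1$, we get $X = \mu + \sigma Z$, so the distribution reduces to the normal distribution with mean $\mu$ and standard deviation $\sigma$.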
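The defining transformation is easy to try out directly. The following is a minimal base-R sketch, not code from the article: `rshash_sketch` is a hypothetical helper name, and it simply pushes standard normal draws through the expression above, assuming the parametrization described in this post. With `nu = 0` and `tau = 1` the sample mean and standard deviation should land close to `mu` and `sigma`.

```r
# Hypothetical helper (not part of any package): draw from the
# sinh-arcsinh normal distribution by transforming N(0, 1) draws.
rshash_sketch <- function(n, mu = 0, sigma = 1, nu = 0, tau = 1) {
  stopifnot(sigma > 0, tau > 0)
  z <- rnorm(n)
  mu + sigma * sinh((asinh(z) + nu) / tau)
}

set.seed(1)

# nu = 0, tau = 1: the transformation collapses to mu + sigma * z
x_normal <- rshash_sketch(1e5, mu = 2, sigma = 3)
c(mean = mean(x_normal), sd = sd(x_normal))

# nu = 1: right skew, so the mean sits above the median
x_skew <- rshash_sketch(1e5, nu = 1)
c(mean = mean(x_skew), median = median(x_skew))
```

Comparing the mean with the median is only a rough indicator of skew, but it is enough to see which direction a positive `nu` pushes the distribution.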

In R, the `gamlss.dist` package provides functions for working with this distribution. The package provides functions for 3 different parametrizations of this distribution; the parametrization above corresponds to the `SHASHo` set of functions. As is usually the case in R, `dSHASHo`, `pSHASHo`, `qSHASHo` and `rSHASHo` are for the density, distribution function, quantile function and random generation for the distribution.
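Because the distribution is a monotone transformation of a standard normal variable, its distribution function, quantile function and density have closed forms. The base-R sketch below writes them out by hand under the parametrization given above. The names `pshash_sketch`, `qshash_sketch` and `dshash_sketch` are hypothetical, not `gamlss.dist` functions, and they are not checked against the package here; they should agree with `pSHASHo`, `qSHASHo` and `dSHASHo` only if the parametrizations really do match.

```r
# Hypothetical hand-written counterparts of the d/p/q functions, derived
# from X = mu + sigma * sinh((asinh(Z) + nu) / tau), Z ~ N(0, 1).

# Distribution function: map q back to the standard normal scale
pshash_sketch <- function(q, mu = 0, sigma = 1, nu = 0, tau = 1) {
  pnorm(sinh(tau * asinh((q - mu) / sigma) - nu))
}

# Quantile function: push normal quantiles through the transformation
qshash_sketch <- function(p, mu = 0, sigma = 1, nu = 0, tau = 1) {
  mu + sigma * sinh((asinh(qnorm(p)) + nu) / tau)
}

# Density: dnorm(z) times dz/dx (change of variables)
dshash_sketch <- function(x, mu = 0, sigma = 1, nu = 0, tau = 1) {
  u <- (x - mu) / sigma
  s <- tau * asinh(u) - nu
  dnorm(sinh(s)) * cosh(s) * tau / (sigma * sqrt(1 + u^2))
}

# Round trip: quantiles mapped back through the distribution function
p <- c(0.1, 0.5, 0.9)
q <- qshash_sketch(p, mu = 1, sigma = 2, nu = 0.5, tau = 0.8)
pshash_sketch(q, mu = 1, sigma = 2, nu = 0.5, tau = 0.8)

# The density should integrate to (numerically) 1
integrate(dshash_sketch, -Inf, Inf, mu = 1, sigma = 2, nu = 0.5, tau = 0.8)
```

Writing the functions this way also makes the `d`/`p`/`q`/`r` naming convention concrete: each one answers a different question about the same underlying transformation.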

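First, we demonstrate the effect of skewness (i.e. varying $\nu$):

    library(gamlss.dist)
    library(dplyr)
    library(ggplot2)

    # Evaluate the density on a grid for each value of nu
    x <- seq(-6, 6, length.out = 301)
    nu_list <- -3:3
    df <- data.frame()
    for (nu in nu_list) {
      temp_df <- data.frame(x = x,
                            y = dSHASHo(x, mu = 0, sigma = 1, nu = nu, tau = 1))
      temp_df$nu <- nu
      df <- rbind(df, temp_df)
    }

As $\nu$ becomes more positive, the distribution becomes more right-skewed:

    # Densities for nu = 0, 1, 2, 3
    df %>%
      filter(nu >= 0) %>%
      ggplot(aes(x = x, y = y, col = factor(nu))) +
      geom_line() +
      theme_bw()

As $\nu$ becomes more negative, the distribution becomes more left-skewed:

    # Densities for nu = 0, -1, -2, -3
    df %>%
      filter(nu <= 0) %>%
      ggplot(aes(x = x, y = y, col = factor(nu))) +
      geom_line() +
      theme_bw()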
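The direction of the skew can also be checked numerically rather than by eye. The sketch below is an illustration only: it uses a hypothetical closed-form quantile helper, `qshash_sketch`, derived from the transformation under the parametrization described in this post, together with a simple quantile-based skewness measure built from the 10th, 50th and 90th percentiles. Neither helper comes from `gamlss.dist`.

```r
# Hypothetical helper: closed-form quantile function implied by
# X = mu + sigma * sinh((asinh(Z) + nu) / tau), Z ~ N(0, 1).
qshash_sketch <- function(p, mu = 0, sigma = 1, nu = 0, tau = 1) {
  mu + sigma * sinh((asinh(qnorm(p)) + nu) / tau)
}

# Quantile-based skewness: positive when the upper tail is longer
# than the lower tail, negative when it is shorter.
quantile_skew <- function(nu) {
  q <- qshash_sketch(c(0.1, 0.5, 0.9), nu = nu)
  ((q[3] - q[2]) - (q[2] - q[1])) / (q[3] - q[1])
}

nu_list <- -3:3
skew_tbl <- data.frame(nu = nu_list,
                       skew = sapply(nu_list, quantile_skew))
skew_tbl
```

The measure should be essentially zero at `nu = 0`, positive for positive `nu` and negative for negative `nu`, matching the plots.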

Next, we demonstrate the effect of varying $\tau$, which controls tail weight:

    # Evaluate the density on the same grid for each value of tau
    tau_list <- c(0.25, 0.75, 1, 1.5)
    df <- data.frame()
    for (tau in tau_list) {
      temp_df <- data.frame(x = x,
                            y = dSHASHo(x, mu = 0, sigma = 1, nu = 0, tau = tau))
      temp_df$tau <- tau
      df <- rbind(df, temp_df)
    }

    ggplot(data = df, aes(x = x, y = y, col = factor(tau))) +
      geom_line() +
      theme_bw()

By changing `nu = 0` to `nu = 1` in the code above, we see the effect of tail weight when there is skewness.
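To put rough numbers on tail weight, one can compute how much probability lies far from the center for each value of `tau`. The base-R sketch below does this with a hypothetical closed-form distribution function, `pshash_sketch`, derived from the transformation under the parametrization described in this post; it is not a `gamlss.dist` function. The threshold of 3 is an arbitrary choice, and since `tau` also changes the overall spread, this is a crude indicator rather than a formal tail-weight measure.

```r
# Hypothetical helper: closed-form distribution function implied by
# X = mu + sigma * sinh((asinh(Z) + nu) / tau), Z ~ N(0, 1).
pshash_sketch <- function(q, mu = 0, sigma = 1, nu = 0, tau = 1,
                          lower.tail = TRUE) {
  pnorm(sinh(tau * asinh((q - mu) / sigma) - nu), lower.tail = lower.tail)
}

# P(|X| > 3) for mu = 0, sigma = 1
tail_prob <- function(tau, nu = 0) {
  pshash_sketch(-3, nu = nu, tau = tau) +
    pshash_sketch(3, nu = nu, tau = tau, lower.tail = FALSE)
}

tau_list <- c(0.25, 0.75, 1, 1.5)
tail_tbl <- data.frame(tau = tau_list,
                       tail_nu0 = sapply(tau_list, tail_prob, nu = 0),
                       tail_nu1 = sapply(tau_list, tail_prob, nu = 1))
tail_tbl
```

In both columns the probability beyond 3 should shrink as `tau` grows, and the `tau = 1`, `nu = 0` entry should match the standard normal value `2 * pnorm(-3)`.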

(**Note:** For reasons unclear to me, the *Significance* article uses different symbols for the 4 parameters.)

The authors note that it is possible to perform maximum likelihood estimation with this distribution. It is an example of GAMLSS regression, which can be performed in R using the `gamlss` package.
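To make the maximum likelihood idea concrete without the `gamlss` machinery, here is a minimal base-R sketch that simulates data from the transformation and minimizes the negative log-likelihood with `optim`. This is a swapped-in, hand-rolled approach, not the GAMLSS fit the authors refer to. The function name `shash_nll`, the log-parametrization of `sigma` and `tau`, the simulated parameter values and the starting values are all assumptions made for illustration.

```r
# Negative log-likelihood under the parametrization
# X = mu + sigma * sinh((asinh(Z) + nu) / tau), Z ~ N(0, 1).
# theta = (mu, log(sigma), nu, log(tau)); logs keep sigma and tau positive.
shash_nll <- function(theta, y) {
  mu <- theta[1]; sigma <- exp(theta[2]); nu <- theta[3]; tau <- exp(theta[4])
  u <- (y - mu) / sigma
  s <- tau * asinh(u) - nu
  loglik <- log(tau) - log(sigma) + log(cosh(s)) - 0.5 * log1p(u^2) +
    dnorm(sinh(s), log = TRUE)
  val <- -sum(loglik)
  if (is.finite(val)) val else 1e10  # guard against overflow in sinh/cosh
}

# Simulate data with known (illustrative) parameter values
set.seed(123)
true_par <- c(mu = 1, sigma = 2, nu = 0.5, tau = 1.2)
z <- rnorm(2000)
y <- true_par[["mu"]] + true_par[["sigma"]] *
  sinh((asinh(z) + true_par[["nu"]]) / true_par[["tau"]])

# Start from a rough normal-like fit (nu = 0, tau = 1)
start <- c(median(y), log(mad(y)), 0, 0)
fit <- optim(start, shash_nll, y = y, method = "Nelder-Mead",
             control = list(maxit = 5000))

# Back-transform to the original parameter scale and compare
est <- c(mu = fit$par[1], sigma = exp(fit$par[2]),
         nu = fit$par[3], tau = exp(fit$par[4]))
rbind(true = true_par, estimate = est)
```

Optimizing over `log(sigma)` and `log(tau)` avoids having to impose positivity constraints. A derivative-free optimizer is used here for simplicity; for real work the `gamlss` package mentioned above is likely the better tool.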

References:

- Jones, C. and Pewsey, A. (2019). The sinh-arcsinh normal distribution. *Significance*.
- Jones, M. C. and Pewsey, A. (2009). Sinh-arcsinh distributions. *Biometrika*.
