There are some pieces of code that are so simple and obvious that they really ought to be included in base R somewhere.

Geometric mean and standard deviation – a staple for anyone who deals with lognormally distributed data.

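geomean <- function(x, na.rm = FALSE, trim = 0, ...)
{
  # Average on the natural-log scale, then back-transform.
  # ... goes to mean() only: forwarding it to log() would allow a
  # non-natural base, which exp() would not undo.
  exp(mean(log(x), na.rm = na.rm, trim = trim, ...))
}

geosd <- function(x, na.rm = FALSE)
{
  # sd() accepts only x and na.rm, so there is nothing extra to forward.
  exp(sd(log(x), na.rm = na.rm))
}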
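
As a quick sanity check of the log-scale definition, here is a minimal sketch that compares it against the direct "n-th root of the product" formula. The vector `x` is made up for illustration and is not from any real dataset.

```r
# Illustrative data only
x <- c(1, 10, 100)

# Geometric mean: average on the log scale, then back-transform
gm <- exp(mean(log(x)))

# The same quantity computed directly as the n-th root of the product
gm_direct <- prod(x)^(1 / length(x))

# Geometric standard deviation: a multiplicative spread factor
gsd <- exp(sd(log(x)))

all.equal(gm, gm_direct)  # TRUE
```

The log-scale form is usually the safer one for long vectors, since `prod(x)` can overflow or underflow where the sum of logs does not.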


A drop option for nlevels. Sure your factor has 99 levels, but how many of them actually crop up in your dataset?

nlevels <- function(x, drop = FALSE) base::nlevels(x[, drop = drop])
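
To see what the drop option changes, this small sketch uses only base R calls on a made-up factor `f` with five declared levels, of which two actually occur.

```r
# Illustrative factor: five declared levels, only two observed
f <- factor(c("a", "c", "a"), levels = c("a", "b", "c", "d", "e"))

n_all     <- base::nlevels(f)                  # all declared levels: 5
n_used    <- base::nlevels(f[, drop = TRUE])   # only levels that occur: 2
n_dropped <- base::nlevels(droplevels(f))      # same count via droplevels(): 2
```

Note that the `x[, drop = drop]` indexing relies on the factor subsetting method, so the wrapper above is intended for factors rather than for arbitrary objects with a levels attribute.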


A way of converting factors to numbers that is quicker than as.numeric(as.character(my_factor)) and easier to remember than the method suggested in the R FAQ.

factor2numeric <- function(f)
{
  if(!is.factor(f)) stop("the input must be a factor")
  # Convert each level once, then index by the integer codes
  as.numeric(levels(f))[as.integer(f)]
}
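
The reason a helper is worth having is that the obvious conversion silently returns the wrong thing. A minimal sketch on a made-up factor `f`:

```r
# Illustrative factor whose labels happen to be numbers
f <- factor(c("10", "2.5", "10", "300"))

codes   <- as.numeric(f)                         # internal level codes: 1 2 1 3
via_chr <- as.numeric(as.character(f))           # the values, converting every element
via_lev <- as.numeric(levels(f))[as.integer(f)]  # the values, converting each level once

identical(via_chr, via_lev)  # TRUE
```

The levels-based form does less work when a long factor has few distinct levels, because the character-to-number conversion runs once per level rather than once per element.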


A “not in” operator. Not many people know the precedence rules well enough to know that !x %in% y means !(x %in% y) rather than (!x) %in% y, but x %!in% y should be clear to all.

"%!in%" <- function(x, y) !(x %in% y)
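
A short sketch of the precedence point, with made-up vectors `x` and `y`. The operator definition is repeated so the snippet runs on its own.

```r
`%!in%` <- function(x, y) !(x %in% y)

# Illustrative values only
x <- c(0, 5)
y <- c(1, 2, 3)

implicit  <- !x %in% y     # parsed as !(x %in% y): TRUE TRUE
explicit  <- !(x %in% y)   # same thing, spelled out
negated_x <- (!x) %in% y   # negates x first (TRUE FALSE), then matches: TRUE FALSE
clear     <- x %!in% y     # TRUE TRUE, with no precedence to remember
```

The third line differs because `!x` turns 0 into TRUE, which is then coerced to 1 when matched against `y`.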