There are some pieces of code that are so simple and obvious that they really ought to be included in base R somewhere.

Geometric mean and standard deviation – a staple for anyone who deals with lognormally distributed data.

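geomean <- function(x, na.rm = FALSE, trim = 0, ...)
{
  # ... is forwarded to mean() only; log() must stay natural so that exp() inverts it
  exp(mean(log(x), na.rm = na.rm, trim = trim, ...))
}
geosd <- function(x, na.rm = FALSE, ...)
{
  # ... is accepted for symmetry with geomean() but not forwarded: sd() takes only x and na.rm
  exp(sd(log(x), na.rm = na.rm))
}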
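
As a quick sanity check, here is a minimal sketch of how the two functions behave on a small vector. The definitions are repeated in compact form (without the `...` argument) so the block runs on its own; the data are made up for illustration.

```r
# Compact copies of the two helpers, repeated so this block is self-contained.
geomean <- function(x, na.rm = FALSE, trim = 0) {
  exp(mean(log(x), na.rm = na.rm, trim = trim))
}
geosd <- function(x, na.rm = FALSE) {
  exp(sd(log(x), na.rm = na.rm))
}

# Illustrative data: the logs are evenly spaced, so both results are easy to check.
x <- c(1, 10, 100)

gm <- geomean(x)   # approximately 10
gs <- geosd(x)     # approximately 10 (a multiplicative spread factor)

# Missing values are dropped only when asked for.
gm_na <- geomean(c(x, NA), na.rm = TRUE)   # approximately 10

print(c(geomean = gm, geosd = gs, geomean_na = gm_na))
```

Note that the geometric standard deviation is a ratio rather than a difference: roughly speaking, "typical" values lie between `gm / gs` and `gm * gs`.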

A drop option for `nlevels`. Sure, your factor has 99 levels, but how many of them actually crop up in your dataset?

nlevels <- function(x, drop = FALSE) base::nlevels(x[, drop = drop])
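
A small sketch of the difference, using a made-up factor with one unused level. The wrapper is repeated so the block runs on its own; the comparison with `droplevels()` is an addition here, included only to show that base R can already reach the same answer in two steps.

```r
# Wrapper from the post, repeated so this block is self-contained.
# It masks base::nlevels in the current environment but delegates to it.
nlevels <- function(x, drop = FALSE) base::nlevels(x[, drop = drop])

# Illustrative factor: level "c" is declared but never used.
f <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))

n_declared <- nlevels(f)               # 3: all declared levels
n_used     <- nlevels(f, drop = TRUE)  # 2: only levels that occur in the data

# Two-step equivalent with base R only.
n_base <- base::nlevels(droplevels(f))  # 2

print(c(declared = n_declared, used = n_used, base = n_base))
```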

A way of converting factors to numbers that is quicker than `as.numeric(as.character(my_factor))` and easier to remember than the method suggested in the R FAQ.

factor2numeric <- function(f)
{
  if(!is.factor(f)) stop("the input must be a factor")
  as.numeric(levels(f))[as.integer(f)]
}
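
To see why a helper is worth having, here is a minimal sketch contrasting it with a bare `as.numeric()` call, which returns the internal integer codes rather than the values. The function is repeated so the block runs on its own, and the factor is made up for illustration.

```r
# Helper from the post, repeated so this block is self-contained.
factor2numeric <- function(f) {
  if (!is.factor(f)) stop("the input must be a factor")
  # Convert each distinct level once, then index by the integer codes.
  as.numeric(levels(f))[as.integer(f)]
}

# Illustrative factor whose labels are numbers stored as text.
f <- factor(c("10", "2.5", "10"))

codes  <- as.numeric(f)                 # 1 2 1: the internal codes, not the values
values <- factor2numeric(f)             # 10 2.5 10
slow   <- as.numeric(as.character(f))   # same values via the longer route

print(codes)
print(values)
```

The helper converts each level only once, which is where the speed advantage over `as.numeric(as.character(f))` comes from on long factors with few levels.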

A “not in” operator. Not many people know the precedence rules well enough to know that `!x %in% y` means `!(x %in% y)` rather than `(!x) %in% y`, but `x %!in% y` should be clear to all.

"%!in%" <- function(x, y) !(x %in% y)
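
The precedence point can be checked directly. The following is a small sketch with made-up vectors; the operator is repeated so the block runs on its own.

```r
# Operator from the post, repeated so this block is self-contained.
"%!in%" <- function(x, y) !(x %in% y)

# Illustrative data.
x <- c(1, 5, 9)
y <- 1:3

not_in    <- x %!in% y      # FALSE TRUE TRUE
unbracket <- !x %in% y      # parsed as !(x %in% y), so the same result
wrong     <- (!x) %in% y    # negates x first, then looks up FALSE in y

print(not_in)
print(unbracket)
print(wrong)
```

The third form gives a different answer because `!x` turns the numbers into logical values before the lookup happens.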

I’m sure there are loads more snippets like this that would be useful to have; please contribute your own in the comments.

To **leave a comment** for the author, please follow the link and comment on his blog: **4D Pie Charts » R**.