Unicode in R packages (not)

January 1, 2013

(This article was first published on Will Lowe » R, and kindly contributed to R-bloggers)

Perhaps you are trying to add your nice new object as data for an R package. But wait. It has [gasp] foreign letters in its dimnames, so 'R CMD check' will certainly complain.

What you need is something to turn R’s natural Unicode-processing goodness into a relic from the early days of computing, without inadvertently aliasing any words that differ only in their non-ASCII characters. Here’s a handy iconv-invoking function to do that:

returnTo1963 <- function(x) {
  iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")
}
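To see what the function does on your own machine, here is a minimal sketch. The `labels` vector is made up for illustration and written with `\u` escapes so the script itself stays ASCII. What a given letter turns into depends on the iconv implementation behind your R build, and a failed conversion comes back as NA, so the sketch only checks that whatever comes back really is ASCII.

```r
returnTo1963 <- function(x) {
  iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")
}

# Hypothetical labels; \u escapes give UTF-8 strings without
# putting non-ASCII characters in the source file
labels <- c("Z\u00fcrich", "Malm\u00f6", "Oslo")

ascii <- returnTo1963(labels)

# TRUE where the result is a non-NA string made only of code points below 128
is_ascii <- vapply(
  ascii,
  function(s) !is.na(s) && all(utf8ToInt(s) < 128L),
  logical(1),
  USE.NAMES = FALSE
)

print(ascii)
print(is_ascii)
```

Strings that were already plain ASCII, like "Oslo" here, should pass through unchanged.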

Transliteration is the key concept here. On a data.frame you can use it like this:

dimnames(df) <- lapply(dimnames(df), returnTo1963)
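If you would rather not trust the transliteration blindly, a cautious variant might look like the sketch below. The `scores` data frame and its names are invented for the example. Whether two names that differ only by an accent stay distinct depends on your platform's iconv, so this version computes the new names first and only assigns them when no dimension ends up with NA or duplicated entries.

```r
returnTo1963 <- function(x) {
  iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")
}

# Hypothetical data frame with non-ASCII row and column names
scores <- data.frame(a = 1:2, b = 3:4)
names(scores) <- c("caf\u00e9", "cafe")
rownames(scores) <- c("\u00c5sa", "Bob")

# Transliterate row names and column names without touching the data yet
new_names <- lapply(dimnames(scores), returnTo1963)

# One flag per dimension: TRUE if any name failed to convert (NA)
# or if two names now collide
unsafe <- vapply(
  new_names,
  function(n) anyNA(n) || anyDuplicated(n) > 0L,
  logical(1)
)

if (!any(unsafe)) {
  dimnames(scores) <- new_names
} else {
  message("Transliteration would produce NA or duplicated names; left unchanged.")
}
```

On some systems "café" and "cafe" will collide after conversion and on others they will not, which is exactly the case the check is there to catch.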