Unicode in R packages (not)

January 1, 2013

(This article was first published on Will Lowe » R, and kindly contributed to R-bloggers)

Perhaps you are trying to add your nice new object as data for an R package. But wait. It has [gasp] foreign (that is, non-ASCII) letters in its dimnames, so 'R CMD check' will certainly complain.

What you need is something to turn R’s natural Unicode-processing goodness into a relic from the early days of computing, without inadvertently aliasing words that differ only in their non-ASCII characters. Here’s a handy iconv-invoking function to do that:

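returnTo1963 <- function(x) {
    # Transliterate UTF-8 text to a close ASCII equivalent.
    # The exact output is platform-dependent; unconvertible input may come back as "?" or NA.
    iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")
}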
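
To see why transliterating matters more than simply deleting the offending characters, here is a minimal sketch comparing the two. The example words and the helper name dropNonASCII are made up for illustration, and the transliterated spelling depends on the platform's iconv implementation and locale, so no specific output is shown for it.

```r
# Hypothetical example words that differ only in one non-ASCII character
words <- c("f\u00fcr", "f\u00e4r")

# Illustrative alternative (not from the original post):
# drop every byte that cannot be represented in ASCII
dropNonASCII <- function(x) {
    iconv(x, from = "UTF-8", to = "ASCII", sub = "")
}

dropped  <- dropNonASCII(words)
translit <- iconv(words, from = "UTF-8", to = "ASCII//TRANSLIT")

# Dropping collapses both words to "fr", so they alias
anyDuplicated(dropped) > 0

# Transliteration usually keeps them apart, but that depends on the platform
anyDuplicated(translit) > 0
```

If the second check also reports a duplicate on your system, the local iconv is probably substituting a placeholder instead of transliterating.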

Transliteration is the key concept here. On a data.frame, you can use it like this:

dimnames(df) <- lapply(dimnames(df), returnTo1963)
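
As a rough sketch of that usage, the snippet below builds a small made-up data frame and checks the cleaned names before assigning them. The data, the names, and the extra guard against NA or duplicated results are assumptions added for illustration, not part of the original recipe.

```r
returnTo1963 <- function(x) {
    iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")
}

# Made-up data frame with non-ASCII row and column names
df <- data.frame(a = 1:2, b = 3:4)
dimnames(df) <- list(c("Z\u00fcrich", "K\u00f6ln"),
                     c("gr\u00f6\u00dfe", "l\u00e4nge"))

cleaned <- lapply(dimnames(df), returnTo1963)

# Only assign if every name converted (no NA) and none were aliased
ok <- all(vapply(cleaned,
                 function(n) !anyNA(n) && anyDuplicated(n) == 0,
                 logical(1)))
if (ok) dimnames(df) <- cleaned

# Prints any names that still contain non-ASCII characters
tools::showNonASCII(unlist(dimnames(df)))
```

The guard is there because row names must be unique, so an aliasing transliteration would otherwise make the assignment fail.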
