A guide to working with character data in R

June 22, 2018
By

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

HandlingstringscoverR is primarily a language for working with numbers, but we often need to work with text as well. Whether it’s formatting text for reports, or analyzing natural language data, R provides a number of facilities for working with character data. Handling Strings with R, a free (CC-BY-NC-SA) e-book by UC Berkeley’s Gaston Sanchez, provides an overview of the ways you can manipulate characters and strings with R. 

There are many useful sections in the book, but a few selections include:

Note that the book does not cover analysis of natural language data, for which you might want to check out the CRAN Task View on Natural Language Processing or the book Text Mining with R: A Tidy Approach. It’s also sadly silent on the topic of character encoding in R, a topic that often causes problems when dealing with text data, especially from international sources. Nonetheless, the book is a really useful overview of working with text in R, and has been updated extensively since it was last published in 2014. You can read Handling Strings with R at the link below.

Gaston Sanchez: Handling Strings with R

To leave a comment for the author, please follow the link and comment on their blog: Revolutions.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)