Split strings based on a character in the string

December 11, 2012
(This article was first published on Software for Exploratory Data Analysis and Statistical Modelling » R Environment, and kindly contributed to R-bloggers)

R has various facilities for string manipulation, including the strsplit function, which divides a string into substrings wherever it matches a given pattern (a regular expression by default).

A simple example is shown below:

> strsplit("<td class=\"objectName\"><a href=\"/path/test.html\" target=\"\" title=\"An Object\" class=\"myObject\">Stuff</a></td>", "<")
[[1]]
[1] ""
[2] "td class=\"objectName\">"
[3] "a href=\"/path/test.html\" target=\"\" title=\"An Object\" class=\"myObject\">Stuff"
[4] "/a>"
[5] "/td>"
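
One detail worth keeping in mind is that strsplit treats its split argument as a regular expression unless fixed = TRUE is supplied. The short sketch below uses a made-up file name (not from the original example) to show how this matters when the separator is a regex metacharacter such as a dot.

```r
# Hypothetical input string, chosen only for illustration
x <- "report.final.csv"

# As a regular expression, "." matches any character, so every
# character is treated as a separator and only empty strings remain
as_regex <- strsplit(x, ".")[[1]]

# fixed = TRUE treats "." as a literal dot
as_fixed <- strsplit(x, ".", fixed = TRUE)[[1]]

# Escaping the dot in the regular expression has the same effect
as_escaped <- strsplit(x, "\\.")[[1]]

print(as_regex)
print(as_fixed)
print(as_escaped)
```

In the HTML example the separator "<" has no special meaning in a regular expression, so it behaves the same with or without fixed = TRUE.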

This is a basic example; strsplit can be combined with other string-handling functions in many ways to process text streams.
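
As one possible illustration of such a combination, the sketch below reuses the HTML fragment from the example and pairs strsplit with nzchar, grep and sub to pull out the link target and the link text. The variable names (html, pieces, anchor, href, link_text) are illustrative only, and the regular expressions assume well-behaved input like the fragment shown; for real web pages a proper HTML parser is likely a safer choice.

```r
# The HTML fragment from the example above
html <- "<td class=\"objectName\"><a href=\"/path/test.html\" target=\"\" title=\"An Object\" class=\"myObject\">Stuff</a></td>"

# Split on "<" and drop the empty first element
pieces <- strsplit(html, "<", fixed = TRUE)[[1]]
pieces <- pieces[nzchar(pieces)]

# Keep only the piece that starts the anchor tag
anchor <- grep("^a ", pieces, value = TRUE)

# Extract the value of the href attribute
# (assumes the attribute is present and double-quoted)
href <- sub(".*href=\"([^\"]*)\".*", "\\1", anchor)

# The link text is whatever follows the ">" that closes the tag
link_text <- sub("^[^>]*>", "", anchor)

print(href)
print(link_text)
```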
