(This article was first published on **The Data Monkey**, and kindly contributed to R-bloggers)

A regular expression allows you to do a moderately fancy search (and replace if you want). So say you wanted to replace all the "Dennis"s in a variable with "Awesome"s, but only if they're at the end of the line. You could try:

-replace PBFnamevar = regexr(PBFnamevar,"Dennis$","Awesome")-
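To see what the `$` anchor does, here is a minimal sketch on a toy dataset. The variable name PBFnamevar comes from the command above; the three values are made up for illustration.

```stata
* Toy data (hypothetical values) to try the end-of-string match on
clear
input str20 PBFnamevar
"Dennis"
"Hi Dennis"
"Dennis the Menace"
end

* "Dennis$" only matches when "Dennis" is the last thing in the string
replace PBFnamevar = regexr(PBFnamevar, "Dennis$", "Awesome")
list
```

The first two observations change and "Dennis the Menace" is left alone, because "Dennis" is not at the end there. Note that regexr() only replaces the first match it finds in each string.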

You could also match any character, or just capitals, or just digits... there are lots of possibilities:

http://www.stata.com/support/faqs/data/regex.html
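A few of those possibilities, sketched on made-up data. The variable names (code, digit_masked, has_capital, first_masked) are hypothetical and only there for illustration.

```stata
* Toy data (hypothetical values)
clear
input str12 code
"ab12"
"Xy7"
"hello"
end

* [0-9] matches any single digit; swap the first digit found for "#"
generate digit_masked = regexr(code, "[0-9]", "#")

* [A-Z] matches any single capital letter; regexm() returns 1 on a match, 0 otherwise
generate has_capital = regexm(code, "[A-Z]")

* . matches any single character; ^ pins the match to the start of the string
generate first_masked = regexr(code, "^.", "_")

list
```

Strings with no match (like "hello" for the digit pattern) come back from regexr() unchanged.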

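You can also use it for locals. Note that regexr() needs all three arguments (the string, the pattern, and the replacement); this one strips "age" and leaves "cat":

-local strata = regexr("agecat","age","")-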

Or -if- commands:

if regexm("`strata'","age") {

}
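Here is a filled-in sketch of that kind of -if- block. It assumes the local holds the name "agecat"; the display messages and the local suffix are hypothetical, just to show the shape.

```stata
* Assumption for this sketch: the local holds a variable name
local strata "agecat"

* regexm() returns 1 if the pattern matches; ^age means "starts with age"
if regexm("`strata'", "^age") {
    display "`strata' looks like an age variable"
}
else {
    display "`strata' does not start with age"
}

* Parentheses mark a subexpression; regexs(1) returns what it matched
* in the most recent successful regexm() call
if regexm("`strata'", "^age(.*)$") {
    local suffix = regexs(1)
    display "suffix after age: `suffix'"
}
```

With "agecat" in the local, both conditions are true and the last line displays the part after "age".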

On a related note (although not actually regular expressions), say that you've got a string variable that consists of a bunch of what should be separate variables, only lumped all into one, separated by a semicolon (e.g. a row might look like "1;15.2;89;hi;21"). Try -split-:

-split textvar, gen(newtextvars) parse(";")-
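A quick sketch of what -split- produces, using the example row above plus one made-up row. The follow-up -destring- step is my addition, since the pieces come out as strings.

```stata
* Toy data: first row is the example from the text, second is made up
clear
input str20 textvar
"1;15.2;89;hi;21"
"2;7.5;64;bye;30"
end

* One new string variable per piece: newtextvars1 ... newtextvars5
split textvar, generate(newtextvars) parse(";")

* The pieces are strings; convert the ones that hold only numbers
destring newtextvars1 newtextvars2 newtextvars3 newtextvars5, replace

list
```

newtextvars4 stays a string because it holds text ("hi", "bye").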

I should note that Stata's regular expressions are wimpy compared to what other languages support. R supports Perl-style regular expressions (pass perl = TRUE to functions like grepl() and gsub()), which can do so many things it's scary.
