# STATA: Regular expressions

[This article was first published on **The Data Monkey**, and kindly contributed to R-bloggers].

A regular expression allows you to do a moderately fancy search (and replace if you want). So say you wanted to replace all the “Dennis”s in a variable with “Awesome”s, but only if they’re at the end of the line. You could try:

-replace PBFnamevar = regexr(PBFnamevar,"Dennis$","Awesome")-
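To see the end-of-line anchor in action, here is a minimal sketch on a made-up three-row dataset. The rows are invented for illustration; only the variable name PBFnamevar comes from the command above.

```stata
* Toy data (hypothetical values)
clear
input str30 PBFnamevar
"Dennis"
"Dennis Smith"
"Hello Dennis"
end

* "$" anchors the match to the end of the string, so the
* "Dennis" in "Dennis Smith" is not at the end and is left alone
replace PBFnamevar = regexr(PBFnamevar, "Dennis$", "Awesome")
list
```

Keep in mind that regexr() replaces only the first match it finds. With the `$` anchor there can be only one match anyway, so that doesn't matter here.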

You could also replace any character, or just capitals, or just digits…there are lots of possibilities:

http://www.stata.com/support/faqs/data/regex.html
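A few of those possibilities, sketched on made-up data. The variable name item and its values are assumptions for illustration only.

```stata
* Toy data (hypothetical values)
clear
input str20 item
"abc123"
"Room 7B"
end

* [0-9]+ matches a run of one or more digits; replace it with nothing
gen no_digits = regexr(item, "[0-9]+", "")

* [A-Z] matches a single capital letter; regexm() returns 1 on a match, 0 otherwise
gen has_capital = regexm(item, "[A-Z]")

* "." matches any single character; "^" anchors it to the start of the string
gen first_x = regexr(item, "^.", "X")

list
```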

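You can also use it for locals. Note that regexr() always takes three arguments (the string, the pattern, and the replacement); here the replacement is empty, so the local ends up holding "cat":

-local strata = regexr("agecat","age","")-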

Or -if- commands:

if regexm("`strata'","age") {

}
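Putting locals and -if- together, a small sketch might look like this. The local's contents and the display messages are made up for illustration, and the lines should be run together (for example from a do-file) so the local stays defined.

```stata
* Assumed starting value for the local
local strata "agecat"

* regexm() returns 1 when the pattern is found, so the block runs
if regexm("`strata'", "age") {
    display "`strata' looks like an age variable"
}

* regexr() on the contents of a local: strip "age", keep the rest
local suffix = regexr("`strata'", "age", "")
display "`suffix'"
```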

On a related note (although not actually regular expressions), say that you’ve got a string variable that consists of a bunch of what should be separate variables, only lumped all into one, separated by a semicolon (e.g. a row might look like “1;15.2;89;hi;21”). Try -split-:

-split textvar, gen(newtextvars) parse(";")-
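Here is a rough sketch of what that does, using the example row above plus one invented row. The follow-up -destring- step is my own addition, assuming you want the numeric pieces stored as numbers.

```stata
* Toy data: first row is the example from the text, second is made up
clear
input str20 textvar
"1;15.2;89;hi;21"
"2;3.5;7;bye;30"
end

* One new string variable per semicolon-separated piece:
* newtextvars1, newtextvars2, ..., newtextvars5
split textvar, gen(newtextvars) parse(";")

* The pieces come out as strings; convert the numeric ones
destring newtextvars1 newtextvars2 newtextvars3 newtextvars5, replace
list
```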

I should note that Stata’s regular expressions are wimpy compared to what other languages support. R supports Perl-style regular expressions (via `perl = TRUE` in functions like grep() and gsub()), which can do so many things it’s scary.
