Making regex examples work for you!

August 30, 2013

(This article was first published on Econometrics by Simulation, and kindly contributed to R-bloggers)

One of the most frequently used tools for string matching is the regular expression (regex), and R implements regex.  However, users are often frustrated that examples taken verbatim from sources such as Stack Overflow do not seem to work.  In my own experience, the largest issue is really about which characters need to be escaped in R.

For example:

Listing all files whose names match a simple pattern.

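Looking at the pattern /^.*icon.*\.png$/i from an online example, I was able to get ^.*icon.*.png$ to work in R, though I lost the case insensitivity.  The leading ^ anchors the match at the start of the file name; it has nothing to do with subdirectories, which list.files skips by default because recursive = FALSE.  Note also that the unescaped . before png matches any single character, not just a period.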

So, the following code will return a list of file names from the folder Clipart which match the pattern [anything]icon[anything].png:

list.files("C:/Clipart/", pattern="^.*icon.*.png$")
[1] "manicon.png"     "handicon.png"     "bookicon.png"
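
The case insensitivity of the original /i flag can be recovered through the ignore.case argument of list.files.  The sketch below is only an illustration: it builds a scratch folder under tempdir() with made-up file names (not the Clipart folder above) so that it can be run anywhere.

```r
# Create a scratch directory holding a few hypothetical files.
demo_dir <- file.path(tempdir(), "clipart_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- c("manicon.png", "HandIcon.PNG", "bookicon.jpg", "notes.txt")
invisible(file.create(file.path(demo_dir, demo_files)))

# ignore.case = TRUE plays the role of the trailing /i flag.
# "\\." is the regex \. and matches a literal period.
matches <- list.files(demo_dir, pattern = "icon.*\\.png$", ignore.case = TRUE)
print(matches)

# Remove the scratch directory again.
unlink(demo_dir, recursive = TRUE)
```

With these names, only manicon.png and HandIcon.PNG should be returned; bookicon.jpg fails on the extension and notes.txt does not contain "icon".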

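Looking at the original entry we can see what was causing us problems.  The enclosing slashes and the trailing i are delimiter and flag syntax from other languages, not part of the pattern, and the single backslash in \. is not a valid escape inside an R string.  The ^ does not need to be escaped in R; a literal dot, however, has to be written as "\\." in R code.
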
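
The escaping rule can be checked with grepl, without touching the file system.  This is a minimal sketch with made-up file names: in R source the backslash itself must be doubled, so the regex \. is typed as "\\.", while a single backslash ("\.") is rejected by the R parser as an unrecognized escape.

```r
# Hypothetical file names, chosen only to illustrate the difference.
files <- c("manicon.png", "iconXpng", "icon.png.bak")

# Unescaped dot: "." matches any single character, so "iconXpng" also matches.
loose <- grepl("icon.*.png$", files)

# Escaped dot: "\\." in R source is the regex \. and matches a literal period.
strict <- grepl("icon.*\\.png$", files)

print(data.frame(files, loose, strict))
```

Only the strict pattern rejects iconXpng, which is usually what is wanted when matching a file extension.
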
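Before looking at another example, let's modify the previous command slightly to show how we can make it match differently.  Adding a * after the n means "zero or more n", so the pattern now matches any name containing "ico".
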
list.files("C:/Clipart/", pattern="^.*icon*.*.png$")
[1] "manicon.png"     "handicon.png"     "bookicon.png"    "iconnew.png"

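
To see exactly what the extra * changes, the two patterns can be compared on a vector of names with grepl.  The names below are hypothetical and were picked only to expose the difference between the patterns.

```r
# Hypothetical file names for comparing the two patterns.
files <- c("manicon.png", "iconnew.png", "favico.png", "photo.png")

p1 <- "^.*icon.*.png$"    # requires the full string "icon"
p2 <- "^.*icon*.*.png$"   # "n*" is zero or more "n", so "ico" is enough

# Names matched by the second pattern but not by the first.
only_p2 <- files[grepl(p2, files) & !grepl(p1, files)]
print(only_p2)
```

With these names, favico.png is the only one picked up by the second pattern alone, because it contains "ico" but not "icon".
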
There are a lot of resources available for regex, since it is really its own text-matching language supported by many different programming languages.  A good introductory guide can be found:
