Quick Hit: which() and match() are not the same

[This article was first published on Gage Theory » R, and kindly contributed to R-bloggers.]

What’s the difference between using which() and match() in R? For me – about 10 hours!

Today I was doing some string matching in R. In my experience, doing any sort of regex or string manipulation in R is a mistake. I’ve never run benchmarks, but it always seems slower than Perl or Python.

That said, when I’m working in R I’m loath to switch to another language unless I have no choice. While trying to find a match for a string in a character vector, I thought I had run into one of those situations. My functions had an estimated run time of 10 hours. Too slow! What was slowing it down?

I tend to default to which() for matching in R because it returns every match. In this particular scenario, though, I only needed the first match in the sequence, and the match() function was perfectly fine.
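Here is a minimal sketch of the difference. The vector `ids` and the value `target` are made up for illustration; they are not from my actual code.

```r
# Hypothetical character vector with a repeated value
ids <- c("alpha", "beta", "gamma", "beta", "delta")
target <- "beta"

# which() applied to a logical comparison returns every matching position
all_hits <- which(ids == target)

# match() returns only the position of the first match (NA if there is none)
first_hit <- match(target, ids)

print(all_hits)   # [1] 2 4
print(first_hit)  # [1] 2
```

If only the first position matters, which(ids == target)[1] and match(target, ids) give the same answer here.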

How much of a difference did it make?

  • which() – about 30 seconds per record
  • match() – about 0.01 seconds per record

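The speedup went well beyond what I would expect from simply limiting the search to the first match. A call like which(x == value) has to compare every element and build a full logical vector before it can report any positions, while match() does its lookup internally without that intermediate vector. For my workload, at least, match() was clearly the much faster tool.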
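To check this on your own machine, a rough timing sketch along these lines should do. Everything in it is an assumption for illustration: the data is synthetic, the names `haystack` and `needles` are hypothetical, and the timings will not reproduce the numbers above, which came from my own data and functions.

```r
# Synthetic data: an assumption for illustration, not the original data set
set.seed(1)
n <- 1e5
haystack <- sprintf("id_%06d", sample.int(n))  # unique strings in random order
needles <- sample(haystack, 100)               # values to look up

# Look up each needle one at a time, as a per-record loop would
time_which <- system.time(
  pos_which <- vapply(needles, function(s) which(haystack == s)[1], integer(1))
)
time_match <- system.time(
  pos_match <- vapply(needles, function(s) match(s, haystack), integer(1))
)

# match() is vectorised over its first argument, so the loop can be dropped
pos_vec <- match(needles, haystack)

print(time_which["elapsed"])
print(time_match["elapsed"])
```

All three approaches return the same positions. The vectorised call at the end is usually the simplest option when every record needs a lookup against the same vector.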

Moral of the story? Note to self: do not use which() when match() will do.
