3 (actually 4) neat R functions

[This article was first published on Maëlle's R blog on Maëlle Salmon's personal website, and kindly contributed to R-bloggers.]

Time for me to throw away my sticky note after sharing what I wrote on it!

grep(...) not which(grepl(...))

Recently I caught myself using which(grepl(...)),

animals <- c("cat", "bird", "dog", "fish")
which(grepl("i", animals))
#> [1] 2 4

when the simpler alternative is

animals <- c("cat", "bird", "dog", "fish")
grep("i", animals)
#> [1] 2 4

And should I need the values instead of the indices, I know I shouldn’t write

animals <- c("cat", "bird", "dog", "fish")
animals[grepl("i", animals)]
#> [1] "bird" "fish"

but

animals <- c("cat", "bird", "dog", "fish")
grep("i", animals, value = TRUE)
#> [1] "bird" "fish"
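As a quick sanity check, here is a minimal sketch comparing both spellings with identical(). The extra NA element is my own addition for illustration, to show that the two forms also agree when a value is missing.

```r
# Same toy vector as above, plus a missing value (added for illustration)
animals <- c("cat", "bird", "dog", "fish", NA)

# Indices: which(grepl()) and grep() return the same integer vector
identical(which(grepl("i", animals)), grep("i", animals))
#> [1] TRUE

# Values: subsetting with grepl() and grep(value = TRUE) agree too
identical(animals[grepl("i", animals)], grep("i", animals, value = TRUE))
#> [1] TRUE
```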

How can I remember to use grep()? Re-reading my own code, or having it reviewed, probably helps, but why not automate this? When I shared my note to self on Mastodon, Hugo Gruson explained that a linter detecting which(grepl( is among those planned for addition to lintr from Google's linting suite. This is excellent news!

strrep() and other defence tools against poor usages of paste()

Yihui Xie wrote a blog post inspired by my own series, in which one of the three functions he presented was on my sticky note! I’ll still present it: strrep().

strrep() means “string repeat”. Instead of writing

paste(rep("bla", 3), collapse = "")
#> [1] "blablabla"

you can, and should, write

strrep("bla", 3)
#> [1] "blablabla"
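strrep() is also vectorised over both of its arguments, which the paste(rep()) idiom does not give you directly. A small sketch, with inputs made up for illustration:

```r
# One string, several repetition counts
strrep("bla", 1:3)
#> [1] "bla"       "blabla"    "blablabla"

# Several strings, one repetition count
strrep(c("a", "b"), 2)
#> [1] "aa" "bb"
```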

I discovered this function because Hugo Gruson's mention of lintr inspired me to skim through the lintr reference, where I saw “Raise lints for several common poor usages of paste()”. That linter would also tell you when you use paste(, sep = "") instead of paste0().
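For the paste0() case, the two spellings below produce the same result. The file names are made up for illustration.

```r
# paste() with an empty separator...
paste("file", 1:2, ".csv", sep = "")
#> [1] "file1.csv" "file2.csv"

# ...is what paste0() does by default
paste0("file", 1:2, ".csv")
#> [1] "file1.csv" "file2.csv"
```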

startsWith() and endsWith()

I learned about startsWith() and endsWith() by reading the lintr reference, but I also got notified about them when running lintr on a package I was working on. Have you ever tried running all linters on your code? It is a fun experience. Anyhow, one linter is “Require usage of startsWith() and endsWith() over grepl()/substr() versions”, with an interesting Details section on missing values.

Instead of writing

animals <- c("cat", "cow", "dog", "fish")
grepl("^c", animals)
#> [1]  TRUE  TRUE FALSE FALSE

I should write

animals <- c("cat", "cow", "dog", "fish")
startsWith(animals, "c")
#> [1]  TRUE  TRUE FALSE FALSE

A nice side-effect of the switch, beyond good practice for its own sake and more readability, is that the argument order is more logical in startsWith(): the strings to check come first, then the prefix.
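One more thing to keep in mind when switching: startsWith() compares literal strings rather than regular expressions, so special characters need no escaping. A minimal sketch, with a vector made up for illustration:

```r
x <- c(".hidden", "visible")

# In a regular expression, an unescaped "." matches any character
grepl("^.", x)
#> [1] TRUE TRUE

# The dot has to be escaped to be matched literally
grepl("^\\.", x)
#> [1]  TRUE FALSE

# startsWith() treats the prefix as plain text
startsWith(x, ".")
#> [1]  TRUE FALSE
```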

Similarly, instead of writing

animals <- c("cat", "cow", "dog", "fish")
grepl("t$", animals)
#> [1]  TRUE FALSE FALSE FALSE

I should write

animals <- c("cat", "cow", "dog", "fish")
endsWith(animals, "t")
#> [1]  TRUE FALSE FALSE FALSE
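About the missing values mentioned earlier: the two approaches are not strictly equivalent when the input contains NA. A small sketch of the difference, where the NA element is my own addition to the example vector:

```r
animals <- c("cat", NA, "dog")

# grepl() reports FALSE for a missing value
grepl("^c", animals)
#> [1]  TRUE FALSE FALSE

# startsWith() propagates the missing value
startsWith(animals, "c")
#> [1]  TRUE    NA FALSE
```

Whether FALSE or NA is the better answer depends on what the code does next, for instance inside an if() condition, so it is worth checking before replacing one with the other.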

Conclusion

In this post I presented grep() to be used in lieu of which(grepl()), strrep() (string repetition) to be used in lieu of paste(rep(), collapse = ""), and startsWith() and endsWith() to be used in lieu of some regular expressions with ^ and $ respectively.
