The %notin% operator

July 8, 2018

(This article was first published on woodpeckR, and kindly contributed to R-bloggers)

Problem

I keep forgetting how to select all elements of an object except a few, by name. I get the ! operator confused with the - operator and I find both of them less than intuitive to use. How can I negate the %in% operator?

Context

I have a data frame called electrofishing that contains observations from a fish sampling survey. One column, stratum, gives the aquatic habitat type of the sampling site. I’d like to exclude observations sampled in the “Tailwater Zone” or “Impounded-Offshore” areas.

My instinct would be to do this:

electrofishing <- electrofishing[electrofishing$stratum !%in% c("Tailwater Zone", "Impounded-Offshore"),]

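But that doesn’t work. You can’t negate the %in% operator directly; !%in% is a syntax error. Instead, you have to negate the entire %in% expression, as in !(x %in% y), which returns the opposite of the original logical vector. (The parentheses are optional, since %in% binds more tightly than !, but they make the intent clearer.)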
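
For reference, here is a minimal sketch of the negation that does work, run on a made-up stand-in for the electrofishing data. The toy data frame, the fish_count column, and the "Backwater" and "Main Channel Border" values are assumptions for illustration, not the real survey data.

```r
# Toy stand-in for the real data; only the two excluded strata come from the post.
toy <- data.frame(
  fish_count = c(12, 7, 3, 9),
  stratum = c("Tailwater Zone", "Backwater",
              "Impounded-Offshore", "Main Channel Border"),
  stringsAsFactors = FALSE
)
excluded <- c("Tailwater Zone", "Impounded-Offshore")

# Negate the whole %in% expression, not the operator itself.
keep <- !(toy$stratum %in% excluded)

# %in% binds more tightly than !, so this parses the same way.
keep_no_parens <- !toy$stratum %in% excluded

# Keep only the rows whose stratum is not in the excluded set.
toy_kept <- toy[keep, ]
```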

I’m not saying this doesn’t make sense, but I can never remember it. My English-speaking brain would much rather say “rows whose stratum is not included in c("Tailwater Zone", "Impounded-Offshore")” than “not rows whose stratum is included in c("Tailwater Zone", "Impounded-Offshore")”.

Solution

Luckily, it’s pretty easy to negate %in% and create a %notin% operator. I credit this answer to user catastrophic-failure on the Stack Overflow question linked under Resources.

`%notin%` <- Negate(`%in%`)

I didn’t even know that the Negate function was a thing. The more you know.
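
Negate() takes a function that returns logical values and hands back a new function that flips them. A rough sketch of what that means here, next to a hand-written equivalent (the name %notin_manual% and the sample vectors are made up for illustration):

```r
# The operator from the post: a negated copy of %in%.
`%notin%` <- Negate(`%in%`)

# Hand-written equivalent, shown only for comparison.
`%notin_manual%` <- function(x, table) !(x %in% table)

strata <- c("Tailwater Zone", "Backwater", "Impounded-Offshore")
excluded <- c("Tailwater Zone", "Impounded-Offshore")

# Both give one logical value per element of strata.
result <- strata %notin% excluded
result_manual <- strata %notin_manual% excluded
```

Any function assigned to a name of the form %name% can be used as an infix operator, which is why the backticks are needed in the assignment.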

Outcome

I know there are lots of ways to negate selections in R. dplyr’s select() lets you drop columns with -c(), and filter() takes any logical condition for rows. Or I could just learn to throw a ! in front of my %in% expressions. But %notin% seems a little more intuitive.

Now it’s straightforward to select these rows from my data frame.

electrofishing <- electrofishing[electrofishing$stratum %notin% c("Tailwater Zone", "Impounded-Offshore"),]
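
One detail that may be worth checking on real data is missing values. %in% returns FALSE for an NA that is not in the lookup vector, so %notin% returns TRUE and rows with a missing stratum are kept. A small sketch with made-up data (the toy data frame, the site column, and the "Backwater" value are assumptions):

```r
`%notin%` <- Negate(`%in%`)

# Toy data with one missing stratum.
toy <- data.frame(
  site = 1:4,
  stratum = c("Tailwater Zone", "Backwater", NA, "Impounded-Offshore"),
  stringsAsFactors = FALSE
)

toy_kept <- toy[toy$stratum %notin% c("Tailwater Zone", "Impounded-Offshore"), ]

# The "Backwater" row and the NA row remain.
# Add & !is.na(toy$stratum) to the condition to drop the NA row as well.
```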

Resources

https://stackoverflow.com/questions/38351820/negation-of-in-in-r

This one does a good job of explaining why !%in% doesn’t work:
http://r.789695.n4.nabble.com/in-operator-NOT-IN-td3506655.html
