The %notin% operator
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Problem
I keep forgetting how to select all elements of an object except a few, by name. I get the ! operator confused with the - operator and I find both of them less than intuitive to use. How can I negate the %in% operator?
Context
I have a data frame called electrofishing that contains observations from a fish sampling survey. One column, stratum, gives the aquatic habitat type of the sampling site. I’d like to exclude observations sampled in the “Tailwater Zone” or “Impounded-Offshore” areas.
My instinct would be to do this:
electrofishing <- electrofishing[electrofishing$stratum !%in% c("Tailwater Zone", "Impounded-Offshore"),]
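But that doesn’t work: !%in% is not an operator, so you can’t negate %in% directly. Instead, you put the ! in front of the whole %in% expression, as in !(x %in% y), which flips every element of the logical vector that %in% returns. The parentheses are optional, because %in% binds more tightly than !, but they make the intent clearer.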
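Here is a minimal sketch of negating the whole expression. The vector below is a made-up stand-in for electrofishing$stratum; "Main Channel Border" and "Side Channel" are hypothetical values used only for illustration.

```r
# Hypothetical stand-in for electrofishing$stratum (not real survey data)
stratum <- c("Tailwater Zone", "Main Channel Border",
             "Impounded-Offshore", "Side Channel")
excluded <- c("Tailwater Zone", "Impounded-Offshore")

# Negate the entire %in% expression, with explicit parentheses
keep <- !(stratum %in% excluded)

# Same thing without parentheses; R evaluates %in% before !
keep_no_parens <- !stratum %in% excluded

# Elements that are NOT in the excluded set
stratum[keep]
```

Both forms give the same logical vector here, so the choice between them is mostly about readability.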
I’m not saying this doesn’t make sense, but I can never remember it. My English-speaking brain would much rather say “rows whose stratum is not included in c("Tailwater Zone", "Impounded-Offshore")” than “not rows whose stratum is included in c("Tailwater Zone", "Impounded-Offshore")”.
Solution
Luckily, it’s pretty easy to negate %in% and create a %notin% operator. I credit this answer to user catastrophic-failure on this Stack Overflow question.
`%notin%` <- Negate(`%in%`)
I didn’t even know that the Negate function was a thing. The more you know.
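To see what Negate is doing, here is a small sketch on a made-up character vector. Negate takes a function and returns a new function that calls the original and applies ! to the result. The hand-written %notin2% operator is a hypothetical name I'm using only for comparison.

```r
# Build the operator with Negate(), as above
`%notin%` <- Negate(`%in%`)

# A hand-written equivalent (hypothetical name, for comparison only)
`%notin2%` <- function(x, table) !(x %in% table)

x <- c("a", "b", "c", "d")

res_negate <- x %notin% c("b", "d")
res_manual <- x %notin2% c("b", "d")
res_bang   <- !(x %in% c("b", "d"))

# All three approaches produce the same logical vector
identical(res_negate, res_manual) && identical(res_negate, res_bang)
```

Any function name wrapped in percent signs can be used as an infix operator in R, which is why the backtick-quoted assignment works.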
Outcome
I know there are lots of ways to negate selections in R. dplyr’s select() lets you drop columns with -c(), and filter() keeps only the rows that match a logical condition. Or I could just learn to throw a ! in front of my %in% statements. But %notin% seems a little more intuitive.
Now it’s straightforward to select these rows from my data frame.
electrofishing <- electrofishing[electrofishing$stratum %notin% c("Tailwater Zone", "Impounded-Offshore"),]
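Here is a self-contained sketch of the same filter on a toy data frame. The electrofishing_toy name, the catch column, and every value in it are invented for illustration; only the two excluded stratum names come from my real data. I included one missing stratum to show a detail worth knowing: %in% returns FALSE for an NA that isn't in the table, so %notin% returns TRUE and the row is kept.

```r
`%notin%` <- Negate(`%in%`)

# Toy data (hypothetical values, not the real survey)
electrofishing_toy <- data.frame(
  stratum = c("Tailwater Zone", "Main Channel Border", NA,
              "Impounded-Offshore", "Side Channel"),
  catch = c(12, 7, 3, 5, 9),
  stringsAsFactors = FALSE
)

# Keep rows whose stratum is not one of the excluded habitat types
kept <- electrofishing_toy[
  electrofishing_toy$stratum %notin% c("Tailwater Zone", "Impounded-Offshore"),
]

# Rows with a missing stratum survive the filter
kept$stratum
nrow(kept)
```

If rows with a missing stratum should be dropped too, one option would be to add a condition such as !is.na(electrofishing_toy$stratum) to the row selection.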
Resources
https://stackoverflow.com/questions/38351820/negation-of-in-in-r
This one does a good job of explaining why !%in% doesn’t work:
http://r.789695.n4.nabble.com/in-operator-NOT-IN-td3506655.html
