I keep forgetting how to select all elements of an object except a few, by name. I get the ! operator confused with the - operator, and I find both of them less than intuitive to use. How can I negate the %in% operator?
I have a data frame called electrofishing that contains observations from a fish sampling survey. One column, stratum, gives the aquatic habitat type of the sampling site. I’d like to exclude observations sampled in the “Tailwater Zone” or “Impounded-Offshore” areas.
My instinct would be to do this:
electrofishing <- electrofishing[electrofishing$stratum !%in% c("Tailwater Zone", "Impounded-Offshore"),]
But that doesn’t work. !%in% is not an operator, so you can’t negate %in% by gluing a ! onto it. Instead, you negate the entire %in% expression, which returns the opposite of the original logical vector. Wrapping the expression in parentheses makes the grouping explicit, although R evaluates %in% before ! either way.
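As a minimal sketch of that negation, here is the same filter run on a small made-up data frame standing in for the real survey data. The site column and the "Main Channel" stratum are assumptions for illustration only.

```r
# Toy stand-in for the survey data; all values are invented for illustration.
electrofishing <- data.frame(
  site    = 1:5,
  stratum = c("Tailwater Zone", "Main Channel", "Impounded-Offshore",
              "Main Channel", "Tailwater Zone")
)

excluded <- c("Tailwater Zone", "Impounded-Offshore")

# Negate the whole %in% expression; the parentheses make the grouping explicit.
kept <- electrofishing[!(electrofishing$stratum %in% excluded), ]
```

Only the two "Main Channel" rows should remain in kept.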
I’m not saying this doesn’t make sense, but I can never remember it. My English-speaking brain would much rather say “rows whose stratum is not included in c(“Tailwater Zone”, “Impounded-Offshore”)” than “not rows whose stratum is included in c(“Tailwater Zone”, “Impounded-Offshore”)”.
Luckily, it’s pretty easy to negate %in% and create a %notin% operator. I credit this answer to user catastrophic-failure on this Stack Overflow question.
`%notin%` <- Negate(`%in%`)
I didn’t even know that the Negate function was a thing. The more you know.
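For what it’s worth, Negate() takes a function that returns logical values and hands back a new function that returns the logical opposite. The sketch below compares it with a rough hand-written equivalent. The name %not_in% is made up here purely for comparison.

```r
`%notin%` <- Negate(`%in%`)

# Roughly what Negate() builds here, written out by hand (hypothetical name).
`%not_in%` <- function(x, table) !(x %in% table)

x <- c("a", "b", "c")

from_negate <- x %notin% c("b")    # TRUE FALSE TRUE
by_hand     <- x %not_in% c("b")   # TRUE FALSE TRUE
```

Both versions give the same logical vector, so which one to use is mostly a matter of taste.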
I know there are lots of ways to negate selections in R. dplyr has filter() functions that are easier to use with -c(). Or I could just learn to throw a ! in front of my %in% statements. But %notin% seems a little more intuitive.
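Since ! and - are easy to mix up, here is a small base R sketch of the difference, using an invented vector of stratum names. In short, ! flips a logical vector, while - drops elements by integer position.

```r
# Invented example values, not the real survey data.
strata   <- c("Tailwater Zone", "Main Channel", "Impounded-Offshore", "Main Channel")
excluded <- c("Tailwater Zone", "Impounded-Offshore")

# `!` flips a logical vector, so TRUE ends up marking the elements to keep.
keep_logical <- strata[!(strata %in% excluded)]

# `-` drops elements by integer position, so it needs indices rather than names.
keep_by_position <- strata[-c(1, 3)]
```

Both approaches leave the two "Main Channel" entries, but only the first one works without knowing the positions in advance.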
Now it’s straightforward to select these rows from my data frame.
electrofishing <- electrofishing[electrofishing$stratum %notin% c("Tailwater Zone", "Impounded-Offshore"),]
This one does a good job of explaining why !%in% doesn’t work: