One scenario where R can trip up a programmer is when using the == operator or its relatives. The help page notes that “NA values are regarded as non-comparable”, which introduces some potentially unexpected behavior.

As a toy example, look at what happens when subsetting on a column that includes NA values.
df <- data.frame(a=11:15,b=c(3,NA,4,4,NA))
df
df[df$b==4,]
df[df$b<=4,]
In each case, the result includes a row of NA values for every NA in the b column. This can be surprising, and it is easy to miss when the subset is wrapped inside an aggregation such as nrow or sum. A safer way to do the equality subset is with the %in% operator, which never returns NA. Like so:
df[df$b %in% 4,]
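
To see where the extra rows come from, it can help to look at the logical index itself. The sketch below reuses the toy data frame from above; the name idx is only illustrative. Since %in% only tests equality, the last two lines show two base R options that could stand in for the <= case, which() and an explicit is.na() guard. These are suggestions, not the only way to handle it.

```r
df <- data.frame(a = 11:15, b = c(3, NA, 4, 4, NA))

# Comparing against NA yields NA, so the index is not purely TRUE/FALSE.
idx <- df$b == 4
idx

# Each NA in a logical row index produces an all-NA row in the result.
nrow(df[idx, ])            # 4: two real matches plus two all-NA rows

# %in% never returns NA, so only the real matches survive.
nrow(df[df$b %in% 4, ])    # 2

# For <=, which() keeps only the positions that are TRUE (NA is dropped).
df[which(df$b <= 4), ]

# An explicit NA guard selects the same rows.
df[!is.na(df$b) & df$b <= 4, ]
```

Both of the last two calls return the rows where a is 11, 13 and 14, with no NA rows mixed in.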