In David Smith’s latest blog post (which, in a sense, is a continued response to the latest public attack on R), there was a comment by Barry that caught my eye. Barry wrote:

Even I get caught out on R quirks after 20 years of using it. **Compare letters[c(12,NA)] and letters[c(NA,NA)]** for the most recent thing that made me bang my head against the wall.

So I did, and here’s the output:

> letters[c(12,NA)]
[1] "l" NA
> letters[c(NA,NA)]
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>

Interesting, isn’t it?

I had no clue why this happened, but luckily for us Barry gave a follow-up reply with an explanation. Here is what he wrote:

My example with ‘letters’ comes from a collision of three features:

- recycling of short subscripts
- silent coercion of types (boolean NA to numeric NA)
- and the existence of five different NA values that all print the same.

[…] to really understand that letters[c(1,NA)] is different from letters[c(NA,NA)] you have to see that:

- in the first case, the NA is coerced to a numeric NA because it’s in a vector with a numeric ‘1’.
- in the first case, you are selecting elements by supplying a vector of indexes
- in the second case, your NAs are boolean (logical) NA values
- hence your subscript is a logical vector
- logical vectors are recycled
- now your subscript is a vector of TRUE/FALSE values (which are all NA) of the same length as ‘letters’.
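
Each step of Barry's explanation can be checked at the console. The following is a minimal sketch using only base R functions (`typeof()`, `rep()`, `identical()`, `sapply()`); the variable names are my own and are there purely for illustration.

```r
# The two subscripts look alike but have different types.
idx_numeric <- c(12, NA)   # NA is coerced to a numeric (double) NA
idx_logical <- c(NA, NA)   # a bare NA is logical

typeof(idx_numeric)        # "double"
typeof(idx_logical)        # "logical"

# A logical subscript of length 2 is recycled to the length of `letters`,
# so it behaves like a vector of 26 logical NAs.
res_recycled <- letters[idx_logical]
res_explicit <- letters[rep(NA, length(letters))]
identical(res_recycled, res_explicit)   # TRUE

# An explicitly typed integer NA gives positional indexing instead,
# so only two elements come back.
res_positional <- letters[c(NA_integer_, NA_integer_)]
length(res_positional)     # 2

# The five NA values that all print as NA, and their types.
na_values <- list(NA, NA_integer_, NA_real_, NA_character_, NA_complex_)
na_types <- sapply(na_values, typeof)
na_types   # "logical" "integer" "double" "character" "complex"
```

So if you really mean "two missing positions", writing `NA_integer_` instead of `NA` is one way to get a result of length 2.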

To make sure I understood Barry correctly, I tried the following code:

> letters[c(T,NA)]
[1] "a" NA "c" NA "e" NA "g" NA "i" NA "k" NA "m" NA "o" NA "q" NA "s" NA "u" NA "w" NA "y" NA
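
To see the recycling itself, the short subscript can be expanded by hand. This is a small sketch, assuming that `rep()` with `length.out` mirrors what R does internally with a short logical subscript; `idx` and `res` are names I made up for the example.

```r
# Expand c(TRUE, NA) to the full length of `letters` by hand.
idx <- rep(c(TRUE, NA), length.out = length(letters))
length(idx)                            # 26

# Indexing with the expanded vector matches the short, recycled form.
res <- letters[idx]
identical(res, letters[c(TRUE, NA)])   # TRUE

# TRUE keeps the element; NA produces an NA in that position.
res[1:4]                               # "a" NA "c" NA
sum(is.na(res))                        # 13
```

Note that I spelled out `TRUE` here: `T` is an ordinary variable that can be reassigned, whereas `TRUE` is a reserved word.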

Barry gave this example to illustrate how R violates the Zen idea that “Simple is better than complex”, since (so he claims) subscript recycling ends up shooting you in the foot.


**Tags:** code, coercion, indexes, NA, R, recycling, statistics, Zen