Why does R drop attributes when subsetting?

[This article was first published on R on Jorge Cimentada, and kindly contributed to R-bloggers.]

I lost about an hour yesterday because R did something completely unpredictable (for my taste): it dropped an attribute without a warning.

df <- data.frame(x = rep(c(1, 2), 20))

attr(df$x, "label") <- "This is clearly a label"

df$x
##  [1] 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1
## [36] 2 1 2 1 2
## attr(,"label")
## [1] "This is clearly a label"

The label is clearly there. To my surprise, if I subset this data frame, R drops the attribute.

new_df <- df[df$x == 2, , drop = FALSE]

new_df$x
##  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

It doesn’t matter whether I use bracket subsetting ([) or subset().

new_df <- subset(df, x == 2)

new_df$x
##  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
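
The behaviour is not specific to data frames. As far as I can tell, row subsetting a data frame applies [ to each column, and ?Extract documents that subsetting a plain vector (except by an empty index) drops all attributes other than names, dim and dimnames. Here is a minimal sketch of that rule, using a made-up named vector v that is not part of the example above:

```r
# A plain numeric vector with names and a custom "label" attribute
# (v is a hypothetical example, not taken from the data frame above)
v <- c(a = 1, b = 2, c = 1, d = 2)
attr(v, "label") <- "This is clearly a label"

# Subsetting with a non-empty index keeps the names but drops "label"
sub_v <- v[v == 2]
names(attributes(sub_v))  # only "names" is left
attr(sub_v, "label")      # NULL

# Subsetting with an empty index keeps every attribute
attr(v[], "label")
```

So the label disappears at the level of the column vector, before the data frame is even put back together.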

That’s not good: R drops attributes silently. For my specific purpose I ended up using dplyr::filter, which preserves the attribute.

library(dplyr)

df %>%
  filter(x == 2) %>%
  pull(x)
##  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## attr(,"label")
## [1] "This is clearly a label"
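
If adding dplyr is not an option, one possible base R workaround is to subset first and then copy back whatever attributes each column lost. The helper below, subset_keep_attrs, is a hypothetical name of my own and not a function from base R or any package. It is only a sketch: it assumes the lost attributes do not depend on the number of rows (a label is fine, a per-element attribute would not be).

```r
# Hypothetical helper (not part of base R or dplyr): subset rows, then
# restore any column attribute that the subset dropped.
subset_keep_attrs <- function(data, rows) {
  out <- data[rows, , drop = FALSE]
  for (col in names(out)) {
    old <- attributes(data[[col]])
    kept <- names(attributes(out[[col]]))
    # Leave alone the attributes that `[` manages itself
    lost <- setdiff(names(old), c(kept, "names", "dim", "dimnames"))
    for (a in lost) {
      attr(out[[col]], a) <- old[[a]]
    }
  }
  out
}

# Same data as in the post
df <- data.frame(x = rep(c(1, 2), 20))
attr(df$x, "label") <- "This is clearly a label"

new_df <- subset_keep_attrs(df, df$x == 2)
attr(new_df$x, "label")
```

Attributes that already survive subsetting, such as the levels and class of a factor, are left untouched because they never end up in lost.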
