Why does R drop attributes when subsetting?
[This article was first published on R on Jorge Cimentada, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I lost about an hour yesterday because R did something I found completely unexpected: it dropped an attribute without a warning.
df <- data.frame(x = rep(c(1, 2), 20))
attr(df$x, "label") <- "This is clearly a label"
df$x
##  [1] 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1
## [36] 2 1 2 1 2
## attr(,"label")
## [1] "This is clearly a label"
The label is clearly there. To my surprise, if I subset this data frame, R drops the attribute.
new_df <- df[df$x == 2, , drop = FALSE] new_df$x ## [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
It doesn’t matter whether I use bracket subsetting ([) or subset().
new_df <- subset(df, x == 2) new_df$x ## [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
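The same thing happens on a plain vector, so this is not specific to data frames. As far as I can tell from ?Extract, subsetting an atomic vector with [ keeps only names, dim and dimnames, and drops every other attribute. A minimal sketch (the vector and its names are made up for illustration):

```r
# A bare numeric vector with both names and a custom "label" attribute
x <- c(a = 1, b = 2, c = 1, d = 2)
attr(x, "label") <- "This is clearly a label"

# Subset with a logical index, as in the data frame examples
sub <- x[x == 2]

names(sub)          # names survive the subset
attr(sub, "label")  # NULL: the custom attribute is gone
```

Since subsetting the rows of a data frame subsets each column in turn, the label on df$x is lost in the same way.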
That’s not good: R drops attributes silently. For my specific purpose I ended up using dplyr::filter, which preserves attributes.
library(dplyr)
# The pipe already supplies df as the first argument to filter()
df %>% filter(x == 2) %>% pull(x)
##  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
## attr(,"label")
## [1] "This is clearly a label"
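If adding dplyr is not an option, one possible base R workaround is to subset first and then copy the attribute back onto each column. This is only a sketch: subset_keep_attr and its arguments are hypothetical names of my own, not functions from base R or dplyr, and it handles a single named attribute.

```r
# Hypothetical helper: subset rows, then restore one named attribute
# on every column from the original data frame.
subset_keep_attr <- function(data, rows, attr_name = "label") {
  out <- data[rows, , drop = FALSE]
  for (col in names(out)) {
    # Assigning NULL is harmless for columns that never had the attribute
    attr(out[[col]], attr_name) <- attr(data[[col]], attr_name, exact = TRUE)
  }
  out
}

df <- data.frame(x = rep(c(1, 2), 20))
attr(df$x, "label") <- "This is clearly a label"

new_df <- subset_keep_attr(df, df$x == 2)
attr(new_df$x, "label")
```

Copying attributes back blindly is only safe for ones that do not depend on the length or order of the vector, which is presumably part of why base R drops them in the first place.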
To leave a comment for the author, please follow the link and comment on their blog: R on Jorge Cimentada.