Row names in data frames: beware of 1:nrow

[This article was first published on The stupidest thing... » R, and kindly contributed to R-bloggers.]

I spent some time puzzling over row names in data frames in R this morning. It seems that if you set the row names of a data frame x to 1:nrow(x), R acts as if you had never assigned row names, and they may be renumbered when you call rbind.

Here’s an illustration:

> x <- data.frame(id=1:3)
> y <- data.frame(id=4:6)
> rownames(x) <- 1:3
> rownames(y) <- LETTERS[4:6]
> rbind(x,y)
  id
1  1
2  2
3  3
D  4
E  5
F  6
> rbind(y,x)
  id
D  4
E  5
F  6
4  1
5  2
6  3


As you can see, if you give x the row names 1:3, they are treated as generic row numbers and can change after rbind if those rows end up in different positions. This doesn’t happen if x and y are matrices.
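For comparison, here is a minimal sketch of the matrix case, using the same values as above stored in one-column matrices. The object name yx is just for illustration.

```r
# Same values as in the data frame example, stored as one-column matrices.
x <- matrix(1:3, ncol = 1, dimnames = list(NULL, "id"))
y <- matrix(4:6, ncol = 1, dimnames = list(NULL, "id"))
rownames(x) <- 1:3           # stored as the strings "1", "2", "3"
rownames(y) <- LETTERS[4:6]

# With matrices, the row names travel with their rows.
yx <- rbind(y, x)
rownames(yx)
# "D" "E" "F" "1" "2" "3"
```

Matrix row names are always stored as character strings, so there is no notion of a generic row number to be reassigned.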

I often use row names as identifiers, so it seems I must be cautious to use something other than row numbers.
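One way to stay safe, sketched below, is to make the identifiers non-numeric before assigning them as row names. The "row" prefix is an arbitrary choice of mine; any label that cannot be mistaken for a row number should do.

```r
x <- data.frame(id = 1:3)
y <- data.frame(id = 4:6)

# Assumption: the "row" prefix is arbitrary; it only has to make the labels non-numeric.
rownames(x) <- paste0("row", 1:3)
rownames(y) <- LETTERS[4:6]

# The labels now survive rbind regardless of the order of the arguments.
yx <- rbind(y, x)
rownames(yx)
# "D" "E" "F" "row1" "row2" "row3"

# Rows can still be looked up by their identifier afterwards.
yx["row2", "id"]
# 2
```

Another option is to keep the identifier in an ordinary column and not rely on row names at all.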

