renaming data frame columns in lists

[This article was first published on metvurst, and kindly contributed to R-bloggers.]

OK, so the scenario is as follows:

  • We have a list of 2 elements, each of which is itself a list of 2 elements (each of them a data frame).
  • None of the list entries carry names, and the data frame columns only have generic, uninformative names.
  • We only want to set the column names of the data frames that are buried 2 levels down in the main list.

First we create some mock data that resembles the scenario (mimicking temperature and relative humidity observations during January and February 2010):

## create 2 mock months (day offsets 1..n from the origin, so each series starts on the 2nd)
date_jan <- as.Date(seq(1, 31, 1), origin = "2010-01-01")
date_feb <- as.Date(seq(1, 28, 1), origin = "2010-02-01")

## create mock observations for the months
Ta_200_jan <- rnorm(31, 10, 3)
Ta_200_feb <- rnorm(28, 11, 3)
rH_200_jan <- rnorm(31, 75, 10)
rH_200_feb <- rnorm(28, 70, 10)


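## combine dates and observations into data frames with generic column names
df1 <- data.frame(V1 = date_jan, V2 = Ta_200_jan)
df2 <- data.frame(V1 = date_jan, V2 = rH_200_jan)
df3 <- data.frame(V1 = date_feb, V2 = Ta_200_feb)
df4 <- data.frame(V1 = date_feb, V2 = rH_200_feb)

## nest the data frames: one inner list per month
lst <- list(list(df1, df2), list(df3, df4))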

So now we have a list of two elements, each of which is itself a list of 2 data frames.
None of the list elements are named, and the columns of the data frames only carry the generic names V1 and V2, which is not very informative.

This is what the list structure looks like:

str(lst)

## List of 2
##  $ :List of 2
##   ..$ :'data.frame': 31 obs. of  2 variables:
##   .. ..$ V1: Date[1:31], format: "2010-01-02" ...
##   .. ..$ V2: num [1:31] 9.95 15.49 9.45 12.16 8.84 ...
##   ..$ :'data.frame': 31 obs. of  2 variables:
##   .. ..$ V1: Date[1:31], format: "2010-01-02" ...
##   .. ..$ V2: num [1:31] 70.4 87.6 69.6 80.2 59 ...
##  $ :List of 2
##   ..$ :'data.frame': 28 obs. of  2 variables:
##   .. ..$ V1: Date[1:28], format: "2010-02-02" ...
##   .. ..$ V2: num [1:28] 11.95 8.42 13.06 9.55 10.76 ...
##   ..$ :'data.frame': 28 obs. of  2 variables:
##   .. ..$ V1: Date[1:28], format: "2010-02-02" ...
##   .. ..$ V2: num [1:28] 78.7 63.9 62.6 67.5 73.5 ...
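A side note on the mock dates: the output above shows each series starting on the 2nd of the month. If the dates should instead cover exactly the calendar month, one possible way (a minimal sketch, not part of the original workflow) is to call seq() directly on a Date. The names jan_days and feb_days are made up for this illustration.

```r
## hypothetical names for this illustration: jan_days, feb_days
## seq() on a Date object steps by calendar day from the given start
jan_days <- seq(as.Date("2010-01-01"), by = "day", length.out = 31)
feb_days <- seq(as.Date("2010-02-01"), by = "day", length.out = 28)

## first and last day of each series
range(jan_days)
range(feb_days)
```

Swapping these in for date_jan and date_feb would change the dates shown in the str() output, which is why the mock data above is left as it is.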

Now we define the names to set:

name.x <- c("Date")
name.y <- c("Ta_200", "rH_200")

And finally, we use two nested lapply() calls to set the column names of the data frames within the list of lists.
The crux is to define a data frame (y) in the inner lapply() call, which is subsequently returned (and as lapply() always returns a list, we again get a list of lists).

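lst <- lapply(seq_along(lst), function(i) {
    ## i indexes the months, j the parameters within a month
    lapply(seq_along(name.y), function(j) {
        y <- data.frame(lst[[i]][[j]])
        ## first column is the date, second column the j-th parameter
        names(y) <- c(name.x, name.y[j])
        return(y)
    })
})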

And this is what we end up with:

str(lst)

## List of 2
##  $ :List of 2
##   ..$ :'data.frame': 31 obs. of  2 variables:
##   .. ..$ Date  : Date[1:31], format: "2010-01-02" ...
##   .. ..$ Ta_200: num [1:31] 9.95 15.49 9.45 12.16 8.84 ...
##   ..$ :'data.frame': 31 obs. of  2 variables:
##   .. ..$ Date  : Date[1:31], format: "2010-01-02" ...
##   .. ..$ rH_200: num [1:31] 70.4 87.6 69.6 80.2 59 ...
##  $ :List of 2
##   ..$ :'data.frame': 28 obs. of  2 variables:
##   .. ..$ Date  : Date[1:28], format: "2010-02-02" ...
##   .. ..$ Ta_200: num [1:28] 11.95 8.42 13.06 9.55 10.76 ...
##   ..$ :'data.frame': 28 obs. of  2 variables:
##   .. ..$ Date  : Date[1:28], format: "2010-02-02" ...
##   .. ..$ rH_200: num [1:28] 78.7 63.9 62.6 67.5 73.5 ...

Problem solved!

We now have a list of lists in which the columns of every data frame are labelled correctly with the date and the observed parameter!
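For comparison, the same renaming can probably be written without index variables. The following is only a sketch of an alternative, not the approach used above: the outer lapply() walks over the months, and Map() pairs each data frame with its parameter name. The objects mock and renamed are made up for this illustration and use a tiny random data set of the same shape as lst.

```r
## small stand-in for the unnamed list of lists (random mock values)
mock <- list(
  list(data.frame(V1 = as.Date("2010-01-02") + 0:2, V2 = rnorm(3, 10, 3)),
       data.frame(V1 = as.Date("2010-01-02") + 0:2, V2 = rnorm(3, 75, 10))),
  list(data.frame(V1 = as.Date("2010-02-02") + 0:2, V2 = rnorm(3, 11, 3)),
       data.frame(V1 = as.Date("2010-02-02") + 0:2, V2 = rnorm(3, 70, 10)))
)

name.x <- "Date"
name.y <- c("Ta_200", "rH_200")

## outer lapply() iterates over the months; Map() hands each data frame
## together with its matching parameter name to setNames()
renamed <- lapply(mock, function(month) {
  Map(function(df, nm) setNames(df, c(name.x, nm)), month, name.y)
})

str(renamed)
```

Because the data frames are passed in directly, this variant does not need to refer back to the list by position, at the price of being a little less explicit about which element is being renamed.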

PS: If you wanted to name the first-level entries of the list according to the month of observation, this would do the job:

names(lst) <- c("January", "February")

str(lst)

## List of 2
##  $ January :List of 2
##   ..$ :'data.frame': 31 obs. of  2 variables:
##   .. ..$ Date  : Date[1:31], format: "2010-01-02" ...
##   .. ..$ Ta_200: num [1:31] 9.95 15.49 9.45 12.16 8.84 ...
##   ..$ :'data.frame': 31 obs. of  2 variables:
##   .. ..$ Date  : Date[1:31], format: "2010-01-02" ...
##   .. ..$ rH_200: num [1:31] 70.4 87.6 69.6 80.2 59 ...
##  $ February:List of 2
##   ..$ :'data.frame': 28 obs. of  2 variables:
##   .. ..$ Date  : Date[1:28], format: "2010-02-02" ...
##   .. ..$ Ta_200: num [1:28] 11.95 8.42 13.06 9.55 10.76 ...
##   ..$ :'data.frame': 28 obs. of  2 variables:
##   .. ..$ Date  : Date[1:28], format: "2010-02-02" ...
##   .. ..$ rH_200: num [1:28] 78.7 63.9 62.6 67.5 73.5 ...

I leave it up to your imagination how to set the names of the second level list entries…
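In case imagination needs a nudge, here is one possible way to do it, sketched on a tiny stand-in list. The object toy and the choice of entry names (reusing the parameter names) are assumptions made for this illustration only.

```r
## toy stand-in for the month-named list (one random observation each)
toy <- list(
  January  = list(data.frame(Date = as.Date("2010-01-02"), Ta_200 = rnorm(1, 10, 3)),
                  data.frame(Date = as.Date("2010-01-02"), rH_200 = rnorm(1, 75, 10))),
  February = list(data.frame(Date = as.Date("2010-02-02"), Ta_200 = rnorm(1, 11, 3)),
                  data.frame(Date = as.Date("2010-02-02"), rH_200 = rnorm(1, 70, 10)))
)

## name the entries of every inner list; lapply() keeps the outer names
toy <- lapply(toy, function(month) setNames(month, c("Ta_200", "rH_200")))

names(toy$January)
toy$February$rH_200
```

With both levels named, a single data frame can be addressed by month and parameter instead of by position.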

sessionInfo()

## R version 2.15.2 (2012-10-26)
## Platform: x86_64-pc-linux-gnu (64-bit)
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=C                 LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] knitr_1.1
## 
## loaded via a namespace (and not attached):
## [1] digest_0.6.3   evaluate_0.4.3 formatR_0.7    stringr_0.6.2 
## [5] tools_2.15.2
