renaming data frame columns in lists

July 24, 2012
(This article was first published on metvurst, and kindly contributed to R-bloggers)

Renaming the columns of data frames which are stored in lists of lists

OK, so the scenario is as follows:

  • we have a list of 2 elements, each of which is itself a list of 2 data frames
  • none of the list entries carry names, and the data frame columns only have the generic names V1 and V2
  • we only want to set the column names of the data frames that are buried 2 levels down the main list

First we create some mock data that resembles the scenario (mimicking temperature and relative humidity observations during January and February 2010):

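## create 2 mock months
## note: offsets 1..n from the origin start each series on the 2nd of the month
## (see the str() output below); offsets 0..(n - 1) would start on the 1st
date_jan <- as.Date(seq(1, 31, 1), origin = "2010-01-01")
date_feb <- as.Date(seq(1, 28, 1), origin = "2010-02-01")

## create mock observations for the months (random draws, so values differ between runs)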
Ta_200_jan <- rnorm(31, 10, 3)
Ta_200_feb <- rnorm(28, 11, 3)
rH_200_jan <- rnorm(31, 75, 10)
rH_200_feb <- rnorm(28, 70, 10)


df1 <- data.frame(V1 = date_jan, V2 = Ta_200_jan)
df2 <- data.frame(V1 = date_jan, V2 = rH_200_jan)
df3 <- data.frame(V1 = date_feb, V2 = Ta_200_feb)
df4 <- data.frame(V1 = date_feb, V2 = rH_200_feb)

lst <- list(list(df1, df2), list(df3, df4))
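
As a side note on the mock dates, here is a minimal sketch of an alternative way to build the two months, using seq() directly on Date objects so that each series starts on the 1st. The names date_jan_alt, date_feb_alt and date_jan_offset are made up for this illustration and are not used anywhere else in the post.

```r
## Sketch: build each month with seq() on Date objects (hypothetical names)
date_jan_alt <- seq(as.Date("2010-01-01"), by = "day", length.out = 31)
date_feb_alt <- seq(as.Date("2010-02-01"), by = "day", length.out = 28)

range(date_jan_alt)
range(date_feb_alt)

## For comparison: the offset-based construction used for the mock data,
## which runs from the 2nd of January to the 1st of February
date_jan_offset <- as.Date(seq(1, 31, 1), origin = "2010-01-01")
range(date_jan_offset)
```

For the renaming exercise itself the exact dates do not matter, so the rest of the post keeps the original mock data.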

So now we have a list of 2 elements, each of which is again a list of 2 data frames.
None of the list elements are named, and the columns of the data frames only carry the names V1 and V2, which is not very informative.
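
If you want to convince yourself of this, a quick check along the following lines should do. This is only a sketch on a small stand-in list (toy, df_a and df_b are hypothetical names with made-up values), but the same calls can be pointed at lst.

```r
## Small stand-in for the nested list (hypothetical toy data)
df_a <- data.frame(V1 = as.Date("2010-01-01") + 0:2, V2 = c(10, 11, 12))
df_b <- data.frame(V1 = as.Date("2010-01-01") + 0:2, V2 = c(75, 70, 80))
toy <- list(list(df_a, df_b), list(df_a, df_b))

is.null(names(toy))        # are the first-level entries unnamed?
is.null(names(toy[[1]]))   # are the second-level entries unnamed?
names(toy[[1]][[1]])       # column names of a data frame 2 levels down
```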

This is what the list structure looks like:

str(lst)
## List of 2
## $ :List of 2
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ V1: Date[1:31], format: "2010-01-02" ...
## .. ..$ V2: num [1:31] 8.79 4.55 8.11 16.58 9.71 ...
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ V1: Date[1:31], format: "2010-01-02" ...
## .. ..$ V2: num [1:31] 81.1 67.4 75.8 64.7 73.9 ...
## $ :List of 2
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ V1: Date[1:28], format: "2010-02-02" ...
## .. ..$ V2: num [1:28] 16.5 15.47 13.38 18.46 8.96 ...
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ V1: Date[1:28], format: "2010-02-02" ...
## .. ..$ V2: num [1:28] 67.3 71.8 87.3 75.3 80.7 ...

Now we define the names to set:

name.x <- c("Date")
name.y <- c("Ta_200", "rH_200")

And finally, we use two nested lapply() calls to set the column names of the data frames within the list of lists.
The crux is to create a data frame (y) in the inner call, rename its columns and return it. As lapply() always returns a list, we again end up with a list of lists.

lst <- lapply(seq_along(lst), function(i) {
  ## inner call: one data frame per parameter name in name.y
  lapply(seq_along(name.y), function(j) {
    y <- data.frame(lst[[i]][[j]])
    names(y) <- c(name.x, name.y[j])
    return(y)
  })
})
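
The same idea can also be written without index variables. The following is a sketch of one possible variant, not the approach used in this post: it walks over the inner lists directly and uses base R's Map() to pair each data frame with its parameter name. The objects toy and renamed are hypothetical stand-ins with made-up values.

```r
## Toy nested list with generic column names (hypothetical stand-in data)
toy <- list(
  list(data.frame(V1 = as.Date("2010-01-01") + 0:2, V2 = c(10, 11, 12)),
       data.frame(V1 = as.Date("2010-01-01") + 0:2, V2 = c(75, 70, 80))),
  list(data.frame(V1 = as.Date("2010-02-01") + 0:1, V2 = c(11, 13)),
       data.frame(V1 = as.Date("2010-02-01") + 0:1, V2 = c(70, 65)))
)

name.x <- "Date"
name.y <- c("Ta_200", "rH_200")

## Outer lapply() walks over the months; Map() pairs each data frame
## of a month with the matching entry of name.y
renamed <- lapply(toy, function(inner) {
  Map(function(df, nm) {
    names(df) <- c(name.x, nm)
    df
  }, inner, name.y)
})

str(renamed)
```

Like the nested lapply() version, this assumes that every inner list holds the data frames in the same order as name.y.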

And this is what we end up with:

str(lst)
## List of 2
## $ :List of 2
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ Date : Date[1:31], format: "2010-01-02" ...
## .. ..$ Ta_200: num [1:31] 8.79 4.55 8.11 16.58 9.71 ...
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ Date : Date[1:31], format: "2010-01-02" ...
## .. ..$ rH_200: num [1:31] 81.1 67.4 75.8 64.7 73.9 ...
## $ :List of 2
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ Date : Date[1:28], format: "2010-02-02" ...
## .. ..$ Ta_200: num [1:28] 16.5 15.47 13.38 18.46 8.96 ...
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ Date : Date[1:28], format: "2010-02-02" ...
## .. ..$ rH_200: num [1:28] 67.3 71.8 87.3 75.3 80.7 ...

Problem solved!

We now have a list of lists in which every data frame has correctly labelled columns for the date and the observed parameter!

PS: if you want to name the first-level entries of the list according to the month of observation, this will do the job:

names(lst) <- c("January", "February")

str(lst)
## List of 2
## $ January :List of 2
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ Date : Date[1:31], format: "2010-01-02" ...
## .. ..$ Ta_200: num [1:31] 8.79 4.55 8.11 16.58 9.71 ...
## ..$ :'data.frame': 31 obs. of 2 variables:
## .. ..$ Date : Date[1:31], format: "2010-01-02" ...
## .. ..$ rH_200: num [1:31] 81.1 67.4 75.8 64.7 73.9 ...
## $ February:List of 2
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ Date : Date[1:28], format: "2010-02-02" ...
## .. ..$ Ta_200: num [1:28] 16.5 15.47 13.38 18.46 8.96 ...
## ..$ :'data.frame': 28 obs. of 2 variables:
## .. ..$ Date : Date[1:28], format: "2010-02-02" ...
## .. ..$ rH_200: num [1:28] 67.3 71.8 87.3 75.3 80.7 ...

I leave it up to your imagination how to set the names of the second level list entries…
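
In case your imagination needs a nudge, here is one possible way, as a sketch only: it assumes that the parameter names in name.y are also what you want as second-level names, and it uses a small hypothetical list (toy) with made-up values in place of lst.

```r
## Toy list with named first-level entries and renamed columns (hypothetical data)
toy <- list(
  January  = list(data.frame(Date = as.Date("2010-01-01"), Ta_200 = 10),
                  data.frame(Date = as.Date("2010-01-01"), rH_200 = 75)),
  February = list(data.frame(Date = as.Date("2010-02-01"), Ta_200 = 11),
                  data.frame(Date = as.Date("2010-02-01"), rH_200 = 70))
)
name.y <- c("Ta_200", "rH_200")

## Name the entries of every inner list; lapply() keeps the month names
toy <- lapply(toy, function(inner) {
  names(inner) <- name.y
  inner
})

names(toy$January)
toy$February$rH_200
```

With both levels named, a single data frame can be pulled out by name instead of by position.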

sessionInfo()
## R version 2.15.1 (2012-06-22)
## Platform: x86_64-pc-mingw32/x64 (64-bit)
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats grDevices utils datasets grid graphics methods
## [8] base
##
## other attached packages:
## [1] knitr_0.6.3 raster_1.9-92 sp_0.9-99
## [4] reshape_0.8.4 plyr_1.7.1 latticeExtra_0.6-19
## [7] lattice_0.20-6 RColorBrewer_1.0-5
##
## loaded via a namespace (and not attached):
## [1] digest_0.5.2 evaluate_0.4.2 formatR_0.5 parser_0.0-16
## [5] Rcpp_0.9.13 stringr_0.6 tools_2.15.1
