Converting cross-sectional data with dates to weekly averages in R

May 30, 2012
(This article was first published on NERD PROJECT » R project posts, and kindly contributed to R-bloggers)

I was recently confronted with a problem where I had to compare two very different data sets. One data set consisted of dated cross-sectional observations collected over the course of three months, and the other consisted of weekly averages for those same three months. After a bit of research, I discovered that there is a very simple way to convert the data in R.

First we'll create some sample data with randomly generated dates within our time frame:

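first <- as.Date("2012/01/25", "%Y/%m/%d")  # start date
last <- as.Date("2012/05/11", "%Y/%m/%d")   # end date
dt <- last - first                          # length of the time frame in days
nSamples <- 1000
set.seed(1)                                 # make the random draws reproducible
# pick random days between the start and end dates
date <- as.Date(round(first + runif(nSamples) * as.numeric(dt)))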

Then we combine the dates with randomly generated values:

value <- sample(1:10, size = nSamples, replace = TRUE)
data <- data.frame(value, date)

Now that we have our observations, we can move on to finding the weekly averages. However, our weekly average data starts with the week ending 1/30/2012, which is a Monday, so we have to assign that date to every day in that week using the lubridate package:

library("lubridate")
data$week <- floor_date(data$date, "week") + 8

The “+8” is needed because floor_date goes back to the previous Sunday, and we need the Monday of the following week.
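To check which weekday the offset lands on, the same arithmetic can be reproduced in base R. This is only a minimal sketch, assuming floor_date uses its default of weeks starting on Sunday; the names prev_sunday and week_label are made up for illustration and are not lubridate functions.

```r
# Base R sketch of floor_date(date, "week") + 8, assuming weeks start on Sunday.
# prev_sunday and week_label are illustrative names, not lubridate functions.
d <- as.Date(c("2012-01-25", "2012-01-28", "2012-01-29", "2012-01-30"))

# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday,
# so subtracting it steps back to the most recent Sunday
prev_sunday <- d - as.POSIXlt(d)$wday
week_label <- prev_sunday + 8

# Compare each date with the label it receives and the label's weekday number
data.frame(date = d, week = week_label,
           label_wday = as.POSIXlt(week_label)$wday)
```

With this offset, observations that fall on a Sunday or Monday receive the label of the following week, so it is worth confirming that this matches how the other data set defines its weeks.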

Now we can use the ddply function from the plyr package to find the average for every week:

library("plyr")
x<-ddply(data, .(week), function(z) mean(z$value))

The ddply function finds the averages of all values within each particular week in the data.
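For readers who would rather not load plyr, the same grouping can probably be done with aggregate from base R. The sketch below uses a small made-up data frame called toy, not the data generated above, so that the result is easy to check by hand.

```r
# Toy data (made up for illustration): three observations in one week,
# two in the next
toy <- data.frame(
  value = c(2, 4, 6, 10, 20),
  week  = as.Date(c("2012-01-30", "2012-01-30", "2012-01-30",
                    "2012-02-06", "2012-02-06"))
)

# Group by week and average value, mirroring what the ddply call does
weekly <- aggregate(value ~ week, data = toy, FUN = mean)
weekly
```

One difference from the ddply call is that aggregate keeps the original column names, so no renaming step is needed afterwards.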

The hard work is now done, but we still need to rename the columns:

colnames(x) <- c("week", "value")

You now have the weekly averages to compare to the other dataset.
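As a rough sketch of the comparison step, the two weekly tables could be joined on their shared week column with merge. Both data frames below contain made-up numbers, and other_weekly with its other_value column is a hypothetical stand-in for the second data set, whose actual layout is not shown here.

```r
# x stands in for the weekly averages computed above (toy numbers)
x <- data.frame(week  = as.Date(c("2012-01-30", "2012-02-06")),
                value = c(5.2, 5.9))

# other_weekly is a hypothetical second data set keyed by week-ending date
other_weekly <- data.frame(week        = as.Date(c("2012-01-30", "2012-02-06")),
                           other_value = c(4.8, 6.1))

# Join on the shared week column and compute the difference per week
both <- merge(x, other_weekly, by = "week")
both$diff <- both$value - both$other_value
both
```

Weeks that appear in only one of the two tables are dropped by default; passing all = TRUE to merge keeps them with missing values instead.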

