I plot the daily frequency of Wikipedia searches for “Behavioral Economics” and “Beer”: who knew the correlation would be 0.7!
Data on any Wikipedia searches (back to 2007) are available at http://glimmer.rstudio.com/pssguy/wikiSearchRates/. The website lets you download hits per day as a CSV, which is what I've done here.
# Behavioral Economics and Beer:
# Author: Mark T Patterson
# Date: March 18, 2013
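# Clear workspace:
rm(list = ls())
# Load packages: lubridate provides mdy(), ggplot2 provides the plotting functions
library(lubridate)
library(ggplot2)
## Find out what's changed in ggplot2 with
## news(Version == "0.9.1", package = "ggplot2")
curr.wd = getwd()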
ts = read.csv("BehavEconBeer.csv", header = TRUE)
# cleaning the dataset: str(ts)
ts$date = as.character(ts$date)
ts$date = mdy(ts$date)
## Using date format %m/%d/%Y.
ts = ts[, -1]
Note: the mdy function comes from the lubridate package, which handles date/time data cleanly. I've dropped the first column of the data, which only holds row names inherited from Excel.
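As a minimal sketch of these two cleaning steps, here they are on a made-up data frame. The column name `X` and all values are illustrative assumptions, not the contents of the real file. To keep the sketch free of package dependencies, base R's as.Date with the %m/%d/%Y format stands in for mdy:

```r
# Toy stand-in for the downloaded CSV; every value here is made up.
raw <- data.frame(
  X     = 1:4,  # leftover row-name column (hypothetical name)
  name  = c("Beer", "Beer", "Behavioral Economics", "Behavioral Economics"),
  date  = c("3/17/2013", "3/18/2013", "3/17/2013", "3/18/2013"),
  count = c(100, 120, 10, 14),
  stringsAsFactors = FALSE
)

# Parse month/day/year strings into Date objects (the role mdy() plays above)
raw$date <- as.Date(raw$date, format = "%m/%d/%Y")

# Drop the first column by position, as in ts[, -1]
clean <- raw[, -1]

str(clean)
```

Dropping a column by position is fragile if the file layout changes; selecting the wanted columns by name would be a more defensive choice.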
# One line per search term, colored by term
p = ggplot(ts, aes(x = date, y = count)) + geom_line(aes(color = factor(name)),
    size = 2)
p  # print the plot object to display it
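The script above does not show how the 0.7 correlation was computed. One possible way, assuming the data are in long format with the columns name, date and count, is to align the two series by date and call cor. The data below are randomly generated placeholders, not the real Wikipedia counts, so the resulting number means nothing in itself:

```r
# Synthetic long-format data with the same columns as `ts` (name, date, count).
# The counts are random draws, NOT the real Wikipedia data.
set.seed(1)
dates <- seq(as.Date("2013-01-01"), by = "day", length.out = 28)
toy <- data.frame(
  name  = rep(c("Beer", "Behavioral Economics"), each = length(dates)),
  date  = rep(dates, times = 2),
  count = c(rpois(28, lambda = 1000), rpois(28, lambda = 50))
)

# One data frame per search term, then align the two series by date
beer  <- toy[toy$name == "Beer", c("date", "count")]
behav <- toy[toy$name == "Behavioral Economics", c("date", "count")]
wide  <- merge(beer, behav, by = "date", suffixes = c(".beer", ".behav"))

# Pearson correlation between the two daily series
r <- cor(wide$count.beer, wide$count.behav)
r
```

Merging on date, rather than relying on row order, guards against days that are missing from one of the two series.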
It turns out the pattern we observe isn't at all unique – many variables follow (predictable) patterns of variation through the week. This doesn't necessarily mean, though, that the correlation between beer and behavioral economics is entirely spurious!
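One rough way to probe how much of a correlation comes from a shared weekly cycle is to subtract each weekday's mean from both series and correlate what is left. The sketch below does this on synthetic series that, by construction, share nothing except an assumed weekday effect. It illustrates the technique only and says nothing about the real beer and behavioral economics data:

```r
# Synthetic example: two made-up series that share only a weekly cycle.
set.seed(42)
dates  <- seq(as.Date("2013-01-07"), by = "day", length.out = 70)  # 10 full weeks
dow    <- as.POSIXlt(dates)$wday                     # 0 = Sunday ... 6 = Saturday
weekly <- c(-20, 10, 15, 15, 10, 0, -30)[dow + 1]    # assumed weekday effect

a <- 100 + weekly + rnorm(70, sd = 5)
b <- 50 + 0.5 * weekly + rnorm(70, sd = 3)

# Correlation of the raw series, which includes the shared weekly cycle
raw.cor <- cor(a, b)

# Subtract each weekday's mean, then correlate the remainders
a.resid <- a - ave(a, dow)
b.resid <- b - ave(b, dow)
adj.cor <- cor(a.resid, b.resid)

c(raw = raw.cor, adjusted = adj.cor)
```

If the adjusted correlation for the real data stayed well above zero, that would suggest the two search terms move together beyond the ordinary weekly rhythm.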