Calender Heatmap with Google Analytics Data

March 15, 2013
By

(This article was first published on Tatvic Blog » R, and kindly contributed to R-bloggers)

As data analytics consulting firm, we think we are fortunate that we keep finding problems to find. Recently my team mate found a glaring problem of not having any connector for R with Google. With the inspiration from Michael, Ajay O, it soon become a worth problem to solve.

With RGoogleAnalytics package now, we have solved the problem of data extraction into R from Google Analytics a new breed of ideas started emerging primarily around visualization. I have been playing with GGplot2 has been great package to convert data into visualization. Thanks Dr. Hadley Wickham. Once you have Following this blogpost, you are with the code there in position to have data required to get these calendar heat map done. Take up below given code and paste into R console and play around to see if you find it easy working thru. If you have trouble, feel free to reach out to us.

Here is the code for extracting the Google analytics data using R-google analytics package. Before running the following code, download RGoogleAnalytics package and install it.

#Load RGoogleAnalytics library
library("RGoogleAnalytics")
 
# Create query builder object
query <- QueryBuilder()
 
# Authorize your account and paste the accesstoken
access_token <- query$authorize()
 
# Create a new Google Analytics API object
ga <- RGoogleAnalytics()
ga.profiles <- ga$GetProfileData(access_token)
 
# List the GA profiles
ga.profiles  # select index corresponds to your profile and set it to query string
 
# For example if index is 7 of your GA profile then set ga.profiles$id[7] in
# query$Init() method given below
# Build the query string
query$Init(start.date = "2010-01-01", # Set start date
           end.date = "2012-12-31", # Set end date
           dimensions = "ga:date",
           metrics = "ga:visits,ga:transactions",
           max.results = 10000,
           table.id = paste("ga:",ga.profiles$id[6],sep="",collapse=","),
           access_token=access_token)
 
# Make a request to get the data from the API
ga.data <- ga$GetReportData(query)  # data will be stored in this data frame
 
# Set date in format YYYY-MM-DD (to use into heatmap calender)
ga.data$date <- as.Date(as.character(ga.data$date),format="%Y%m%d")

For this example of Calender heatmap, I am using data of an e-commerce store with having data for more than 2 years in business. I will be plotting visits as well as transactions on calendar so that I’d get perspective on how they interact viz-a-viz timeline.

Here is the code for plotting the heat map after you get data and have it store in 'data'. This frame is used to reference the source of data for the visualization below.

# Recommended R version - 2.15.1 or higher
# install required  library by using the command install.packages(“libraryname”)
# For example install.packages(“ggplot2”)
# Required library
library(“quantmod”)
library(“ggplot2”)
library(“reshape2”)
library(“plyr”)
library(“scales”)
 
# Set extracted data  to this data frame
data <-  ga.data
 
# Run commands listed below
data$year <- as.numeric(as.POSIXlt(data$date)$year+1900)
data$month <- as.numeric(as.POSIXlt(data$date)$mon+1)
data$monthf <- factor(data$month,levels=as.character(1:12),
                      labels=c("Jan","Feb","Mar","Apr","May","Jun",
                               "Jul","Aug","Sep","Oct","Nov","Dec"),
                      ordered=TRUE)
data$weekday <- as.POSIXlt(data$date)$wday
data$weekdayf <- factor(data$weekday,levels=rev(0:6),
                        labels=rev(c("Mon","Tue","Wed","Thu","Fri","Sat","Sun")),
                        ordered=TRUE)
data$yearmonth <- as.yearmon(data$date)
data$yearmonthf <- factor(data$yearmonth)
data$week <- as.numeric(format(as.Date(data$date),"%W"))
data <- ddply(data,.(yearmonthf),transform,monthweek=1+week-min(week))
 
# Plot for visits
P_visits <- ggplot(data, aes(monthweek, weekdayf, fill = visits)) +
  geom_tile(colour = "white") +
  facet_grid(year~monthf) +
  scale_fill_gradient(high="#D61818",low="#B5E384") +
  labs(title = "Time-Series Calendar Heatmap") +
  xlab("Week of Month") +
  ylab("")
 
# View plot
P_visits
 
#Plot for transactions
P_transactions <- ggplot(data, aes(monthweek, weekdayf, fill = transactions)) +
  geom_tile(colour = "white") +
  facet_grid(year~monthf) +
  scale_fill_gradient(formatter = "comma",high="#D61818",low="#B5E384") +
  labs(title = "Time-Series Calendar Heatmap") +
  xlab("Week of Month") +
  ylab("")
 
# View plot
P_transactions

Once you run the code, you will be in position to get output like below:

Now that we have a calendar heat map for visits, let me pull it off for transaction. In the above code for Google Analytics data extraction you have use transaction as well as visits as metrics. Since the data is already available in the ‘data’. we are ready by changing in code of visualization to choose the heat map for transaction now.

Its quite interesting now that you can make super nice inferences like I did below:

  • Tuesdays have high visits days but wed has been the day when most transactions occurs
  • Visits increases towards the end of year (shopping season) and then slows down towards year start
Ravi Pathak

Ravi Pathak

Ravi Pathak is a Co-Founder of Tatvic and expert at Managing different web analytics tool. Ravi's actively works on conversion optimization projects to improve conversion rate and test newer hypothesis with e-commerce companies. He regularly tweets via @tatvic or @ravipathak and is easy to have conversation with. Google Plus Profile:

Website - Twitter - Facebook - More Posts

To leave a comment for the author, please follow the link and comment on his blog: Tatvic Blog » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.