Reading data from google sheets into R

November 22, 2016
By

(This article was first published on R – Opiniomics, and kindly contributed to R-bloggers)

Reading data from google sheets into R is something you imagine should be really simple, but often is anything but. However, package googlesheets goes a long way to solving this problem.

Let’s crack on with an example.

First, install the software:

install.packages("googlesheets")

We then need an example sheet to work with, and I’m going to use one from Britain Elects:

So go here and add this to your own collection (yes you’ll need a google account!) using the little “Add to my drive” icon to the right of the title, just next to the star.

Right, let’s play around a bit!

# load package
library(googlesheets)

# which google sheets do you have access to?
# may ask you to authenticate in a browser!
gs_ls()

# get the Britain Elects google sheet
be <- gs_title("Britain Elects / Public Opinion")

This should show you all of the sheets you have access to, and also select the one we want.

Now we can see which worksheets exist within the sheet:

# list worksheets
gs_ws_ls(be)

We can "download" one of the sheets using gs_read()

# get Westminster voting
west <- gs_read(ss=be, ws = "Westminster voting intentions", skip=1)

# convert to data.frame
wdf <- as.data.frame(west)

And hey presto, we have the data. Simples!

Now let's plot it:

# reverse so that data are forward-sorted by time
wdf <- wdf[2769:1,]

# treat all dates as a single point on x-axis
dates <- 1:2769

# smooth parameter
fs <- 0.04

# plot conservative
plot(lowess(dates, wdf$Con, f=fs), type="l", col="blue", lwd=5, ylim=c(0,65), xaxt="n", xlab="", ylab="%", main="Polls: Westminster voting intention")

# add labels
axis(side=1, at=dates[seq(1, 2769, by=40)], labels=paste(wdf$"Fieldwork end date", wdf$Y)[seq(1, 2769, by=40)], las=2, cex.axis=0.8)

# plot labour and libdem
lines(lowess(dates, wdf$Lab, f=fs), col="red", lwd=5)
lines(lowess(dates, wdf$LDem, f=fs), col="orange", lwd=5)

# add UKIP, and we treat absent values as -50 for plotting purposes
ukip <- wdf$UKIP
ukip[is.na(ukip)] <- -50
lines(lowess(dates, ukip, f=fs), col="purple", lwd=5)

# add legend
legend(1, 65, legend=c("Con","Lab","LibDem","UKIP"), fill=c("blue","red","orange","purple"))


westminster

To leave a comment for the author, please follow the link and comment on their blog: R – Opiniomics.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)