Introducing GTrendsR

June 12, 2013

(This article was first published on Just Another R Blog, and kindly contributed to R-bloggers)

<strong>Just another R blog has been added to <a href="http://www.r-bloggers.com/" ref="nofollow" target="_blank">r-bloggers</a>!</strong>

In a paper soon to be published in Conservation Biology and entitled Googling trends in conservation biology, we developed a package named GTrendsR that provides an R interface for retrieving and displaying the information returned online by Google Trends. Although the package is still under development, I'm pleased to present an early overview of GTrendsR. You can download the PDF of the appendix of our paper, which presents a detailed tutorial on how to use the package.

This is the first package I have developed in R, and I have probably made some mistakes, so feel free to contact me with suggestions.

To download the article published in Conservation Biology

To download the tutorial file

To download GTrendsR

Before we publish the package, I would like to benefit from your kind help, dear R users. GTrendsR is based on the RCurl library. At this point everything works perfectly (i.e. the data obtained from R are identical to the data obtained directly from the website). However, after 5-10 queries I get a “quota excess limit” message. If I log in manually on the Google Trends website, it still works (i.e. no quota problems). That leads me to think it must be something related to the way I connect to Google from R. More specifically, I suspect it has to do with how I configure the connection with curlSetOpt, in particular the cookie options. I know it might not be obvious, but if someone has an idea :)
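One way to probe the cookie hypothesis would be to look at which cookies a curl handle actually holds before and after the quota message appears. The sketch below is only a diagnostic aid, not part of GTrendsR: parseCookieList is a hypothetical helper, and the sample line is made up. It assumes the cookies arrive as tab-separated lines in the Netscape cookie-file layout (domain, subdomain flag, path, secure flag, expiry, name, value), which is the layout libcurl uses for its cookie list.

```r
# Hypothetical helper (not part of GTrendsR or RCurl): turn cookie lines in
# Netscape cookie-file layout into a small data frame for inspection.
parseCookieList = function(lines) {
  if (length(lines) == 0) {
    # No cookies at all: return an empty table with the same columns
    return(data.frame(domain = character(0), name = character(0),
                      value = character(0), stringsAsFactors = FALSE))
  }
  fields = strsplit(lines, "\t", fixed = TRUE)
  pick = function(i) vapply(fields, function(f) f[i], character(1))
  data.frame(
    # libcurl marks HttpOnly cookies with a "#HttpOnly_" prefix on the domain
    domain = sub("^#HttpOnly_", "", pick(1)),
    name   = pick(6),
    value  = pick(7),
    stringsAsFactors = FALSE
  )
}

# Made-up example line, only to show the expected layout
example = ".google.com\tTRUE\t/\tTRUE\t0\tSID\tmade-up-value"
cookies = parseCookieList(example)
print(cookies)
```

With a live handle ch, the input could come from getCurlInfo(ch)$cookielist, assuming the installed RCurl exposes libcurl's cookie list under that name. Comparing the table right after login with the table at the moment the quota message appears would show whether the session cookies are still being kept and sent.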

Here is the code used to connect to Google Trends.

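gConnect = function(usr, psw){

  # RCurl manages the HTTP session; stringr extracts the login token
  require(RCurl)
  require(stringr)

  loginURL = "https://accounts.google.com/accounts/ServiceLogin"
  authenticateURL = "https://accounts.google.com/accounts/ServiceLoginAuth"

  # A single handle is reused for every request so that cookies are shared
  ch = getCurlHandle()

  # Note: ssl.verifypeer = FALSE switches off certificate verification
  curlSetOpt(curl = ch,
             ssl.verifypeer = FALSE,
             useragent = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13",
             timeout = 60,
             followlocation = TRUE,
             cookiejar = "./cookies",
             cookiefile = "./cookies")

  ## Google Account login
  loginPage = getURL(loginURL, curl = ch)

  # Extract the hidden GALX token from the login form.
  # regex(ignore_case = TRUE) replaces ignore.case(), which is deprecated in current stringr.
  galx.pattern = regex('name="GALX"\\s*value="([^"]+)"', ignore_case = TRUE)

  galx.match = str_extract(string = loginPage, pattern = galx.pattern)

  galx = str_replace(string = galx.match, pattern = galx.pattern, replacement = "\\1")

  # Submit the credentials together with the token; the handle keeps the session cookies
  authenticatePage = postForm(authenticateURL, .params = list(Email = usr, Passwd = psw, GALX = galx), curl = ch, .opts = list(verbose = FALSE))

  return(ch)
}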
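The GALX extraction step in gConnect can be tried in isolation, without contacting Google. The sketch below is a minimal illustration, assuming a made-up fragment of a login form (the real markup is not shown in this post). extractGalx is a hypothetical helper name, and str_match is used here as a shorter alternative to the str_extract / str_replace pair, not as the package's actual code.

```r
library(stringr)

# Hypothetical helper: return the GALX token found in a login page,
# or stop with an error when the token is absent.
extractGalx = function(html) {
  pattern = regex('name="GALX"\\s*value="([^"]+)"', ignore_case = TRUE)
  # str_match() returns a matrix: column 1 is the full match,
  # column 2 is the first capture group (the token itself)
  token = str_match(html, pattern)[, 2]
  if (is.na(token)) {
    stop("GALX token not found in the login page")
  }
  token
}

# Made-up login form fragment, for illustration only
loginPage = '<input type="hidden" name="GALX" value="abc123XYZ">'
galx = extractGalx(loginPage)
print(galx)
```

Stopping when the token is missing seems preferable to posting an NA value in the login form, since a failed login could otherwise go unnoticed until later queries start to misbehave.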

Just another R blog has been added to r-bloggers!

In closing, I am very pleased to announce that this blog has been added to r-bloggers. For those who are new to this blog, I invite you to take a look at this page.

