Introducing GTrendsR

[This article was first published on Just Another R Blog, and kindly contributed to R-bloggers].

<strong>Just Another R Blog has been added to <a href="https://www.r-bloggers.com/" rel="nofollow" target="_blank">r-bloggers</a>!</strong>

In a paper soon to be published in Conservation Biology, entitled Googling trends in conservation biology, we developed a package named GTrendsR that provides an R interface for retrieving and displaying the information returned online by Google Trends. Although the package is still under development, I'm pleased to present an early overview of GTrendsR. You can download the PDF of the appendix of our paper, which presents a detailed tutorial on how to use the package.

This is the first package I have developed in R, and I have probably made some mistakes, so feel free to contact me with suggestions.

To download the article published in Conservation Biology

To download the tutorial file

To download GTrendsR

Before we publish the package, I would like to benefit from your kind help, dear R users. GTrendsR is based on the RCurl library. At this point everything works as expected (i.e. the data obtained from R is identical to the data obtained directly from the website). However, after 5-10 queries I get a “quota excess limit” message. If I log in manually on the Google Trends website, it still works (i.e. no quota problems). That leads me to think it must be related to the way I connect to Google from R. More specifically, I suspect it has something to do with how I configure the connection with curlSetOpt, in particular the cookie options. I know it might not be obvious, but if someone has an idea :)

Here is the code used to connect to Google Trends.

gConnect = function(usr, psw){

  # Load the packages used for the HTTP requests and the string matching
  library(RCurl)
  library(stringr)

  loginURL = "https://accounts.google.com/accounts/ServiceLogin"
  authenticateURL = "https://accounts.google.com/accounts/ServiceLoginAuth"

  # A single curl handle is reused so that cookies persist across requests
  ch = getCurlHandle()

  curlSetOpt(curl = ch,
             ssl.verifypeer = FALSE,
             useragent = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13",
             timeout = 60,
             followlocation = TRUE,
             cookiejar = "./cookies",
             cookiefile = "./cookies")

  ## Google Account login
  loginPage = getURL(loginURL, curl = ch)

  # Extract the hidden GALX token from the login form (case-insensitive match);
  # column 2 of the str_match result holds the first capture group
  galx.pattern = regex('name="GALX"\\s*value="([^"]+)"', ignore_case = TRUE)
  galx = str_match(string = loginPage, pattern = galx.pattern)[, 2]

  # Submit the credentials together with the GALX token
  authenticatePage = postForm(authenticateURL, .params = list(Email = usr, Passwd = psw, GALX = galx), curl = ch, .opts = list(verbose = FALSE))

  # Return the authenticated handle for use in subsequent queries
  return(ch)
}
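
The token-extraction step in gConnect can be tried offline, without a Google account. The sketch below runs the same GALX pattern against a made-up HTML fragment; samplePage and its token value are hypothetical and only stand in for the real login page, and the use of stringr's regex() with ignore_case = TRUE is my assumption about how to express the case-insensitive match with a current version of stringr.

```r
library(stringr)

# Hypothetical stand-in for the login page; the real page is much larger
samplePage = '<form><input type="hidden" name="GALX" value="abc123XYZ"></form>'

# Same pattern string as in gConnect, matched case-insensitively
galx.pattern = regex('name="GALX"\\s*value="([^"]+)"', ignore_case = TRUE)

# str_match returns a character matrix: column 1 is the full match,
# column 2 is the first capture group (the token value)
galx = str_match(samplePage, galx.pattern)[, 2]

print(galx)
```

If the pattern does not match, str_match returns NA in every column, so checking is.na(galx) before calling postForm is one way to notice that the login page layout has changed.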

Just Another R Blog has been added to r-bloggers!

In closing, I am very pleased to announce that this blog has been added to r-bloggers. For those who are new to this blog, I invite you to take a look at this page.
