Google Insights and RCurl

December 20, 2010

(This article was first published on Dan Knoepfle's Blog, and kindly contributed to R-bloggers)

Google Insights is nifty. If you’re logged in to your Google account, you can download the results of a search as a CSV file. This is straightforward in a browser; retrieving the results of queries from R, however, is more complicated.

The following code retrieves the results of a Google Insights search for “Sarah Palin” as a data.frame. It uses the RCurl package to do all of the hard work.

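## RCurl handles the HTTP requests; stringr provides str_extract() and str_replace()
library(RCurl)
library(stringr)

## Google account credentials (replace with your own)
username <- "[email protected]"
password <- "password_here"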

loginURL <- ""
authenticateURL <- ""


ch <- getCurlHandle()

curlSetOpt(curl = ch,
            ssl.verifypeer = FALSE,
            useragent = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv: Gecko/20101203 Firefox/3.6.13",
            timeout = 60,
            followlocation = TRUE,
            cookiejar = "./cookies",
            cookiefile = "./cookies")

## do Google Account login
loginPage <- getURL(loginURL, curl = ch)

## the login form carries a hidden GALX token that must be posted back
galx.match <- str_extract(string = loginPage,
                          pattern = 'name="GALX"\\s*value="([^"]+)"')
galx <- str_replace(string = galx.match,
                    pattern = 'name="GALX"\\s*value="([^"]+)"',
                    replacement = "\\1")

authenticatePage <- postForm(authenticateURL, .params = list(Email = username, Passwd = password, GALX = galx), curl = ch)

## get Google Insights results CSV
insightsURL <- ""
resultsText <- getForm(insightsURL, .params = list(q = "Sarah Palin", cmpt = "q", content = 1, export = 1), curl = ch)

if(isTRUE(unname(attr(resultsText, "Content-Type")[1] == "text/csv"))) {
  ## got CSV file

  ## create temporary connection from results
  tt <- textConnection(resultsText)

  resultsCSV <- read.csv(tt, header = FALSE)

  ## close connection
  close(tt)
} else {
  ## something went wrong

  ## probably need to log in again?
}

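The GALX step in the script can be tried without logging in to anything. The following is a minimal sketch, assuming the token sits in a hidden input field of the form the script's pattern expects. The function name extract_hidden_value and the sample HTML are made up for illustration, and it uses base R's regexec() and regmatches() rather than stringr so that the capture group comes back in one step.

```r
## Offline sketch: pull a hidden-field value out of an HTML snippet with base R.
## The HTML below is an invented sample, not a real Google login page.
extract_hidden_value <- function(html, field) {
  ## same pattern shape as in the script, with one capture group for the value
  pattern <- sprintf('name="%s"\\s*value="([^"]+)"', field)
  match <- regmatches(html, regexec(pattern, html))[[1]]
  ## match[1] is the full match, match[2] is the captured value
  if (length(match) < 2) NA_character_ else match[2]
}

sample_html <- '<input type="hidden" name="GALX" value="abc123XYZ">'
galx <- extract_hidden_value(sample_html, "GALX")
print(galx)
```

Returning NA when the field is missing makes it easy to stop before posting the login form with an empty token.
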

I don’t have much else to say about this, but I hope that it will be helpful to someone.
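
For anyone who wants to exercise the CSV-parsing branch without a live session, here is a small offline sketch. It assumes the response is a character string with a "Content-Type" attribute whose first element is the MIME type, which is the shape the check in the script relies on. The helper name parse_if_csv and the sample rows are invented for illustration. It uses read.csv(text = ...) instead of an explicit textConnection(), so there is no connection to close.

```r
## Simulated response: a string carrying a "Content-Type" attribute.
## The CSV rows are invented sample data, not real Google Insights output.
fake_response <- "Week,sarah palin\n2010-12-05,42\n2010-12-12,37\n"
attr(fake_response, "Content-Type") <- c("text/csv", charset = "utf-8")

parse_if_csv <- function(response) {
  content_type <- attr(response, "Content-Type")
  is_csv <- !is.null(content_type) &&
    identical(unname(content_type[1]), "text/csv")
  ## anything other than CSV (e.g. a login page) yields NULL
  if (!is_csv) return(NULL)
  ## read.csv(text = ...) parses the string directly
  read.csv(text = response, header = FALSE, stringsAsFactors = FALSE)
}

results <- parse_if_csv(fake_response)
print(results)
```

A NULL result is the cue to log in again, matching the else branch of the script.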

You can change the query to incorporate geographic or other restrictions by adding the parameters that appear in the URL when you refine your search in the Google Insights web interface. For instance, restricting a basic search for “QUERY” to the state of New York adds the parameter “geo=US-NY” to the URL. To incorporate this into the script, change

resultsText <- getForm(insightsURL, .params = list(q = "Sarah Palin", cmpt = "q", content = 1, export = 1), curl = ch)

to have the additional parameter in the .params list:

resultsText <- getForm(insightsURL, .params = list(q = "Sarah Palin", cmpt = "q", geo = "US-NY", content = 1, export = 1), curl = ch)
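
If you expect to try several restrictions, it may be easier to keep a base parameter list and add to it. This is only a sketch: the helper name with_restrictions is hypothetical, and it relies on modifyList() from the utils package, which appends new elements at the end of the list. I am assuming the order of query parameters does not matter to the server.

```r
## Base parameters, copied from the script
base_params <- list(q = "Sarah Palin", cmpt = "q", content = 1, export = 1)

## Hypothetical helper: add or override parameters without retyping the list
with_restrictions <- function(params, ...) {
  modifyList(params, list(...))
}

ny_params <- with_restrictions(base_params, geo = "US-NY")
str(ny_params)

## In the script this would then be passed along as, for example:
## resultsText <- getForm(insightsURL, .params = ny_params, curl = ch)
```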

[Updated 2012-04-24]
