Monthly Archives: November 2011

Web Scraping Google Scholar: Part 2 (Complete Success)

November 8, 2011

This is a follow-up to a post I uploaded earlier today about web scraping data off Google Scholar. In that post I was frustrated because I’m not smart enough to use xpathSApply to get the kind of results I wanted. However, fast-forward to the evening: whilst having dinner with a friend, as a passing remark,

Read more »

What the frack? Does hydraulic fracturing lead to increased earthquakes?

November 8, 2011

Earthquakes are normal occurrences along the boundaries of major plates, such as along the San Andreas fault system of California, and are less common within plate interiors. Try telling that, however, to the citizens of Oklahoma who...

Read more »

Three free books on R for Statistics

November 8, 2011
Avril Coghlan, a lecturer at University College Cork in Ireland, has written and made available for free three books ideal for students or practitioners new to R who want to use it for multivariate analysis, time series analysis or biomedical statistics. Each book begins with practical advice for installing and using R in general, before diving into its specialized...

Read more »

Error Handling in Lyx & Sweave: using Quantmod (and R, of course)

November 8, 2011
I do reports for clients with LyX and Sweave. It took me an extremely long time to get them working, but now that they’re working I can do more in an hour and thus charge more per hour. (Which is, like, the point.) If you’re not familiar, here’s ...

Read more »

Error Handling in Lyx & Sweave: using Quantmod (and R, of course)

November 8, 2011
I do reports for clients with LyX and Sweave. It took me an extremely long time to get them working, but now that they’re working I can do more in an hour and thus charge more per hour. If you’re not familiar, here’s a rundown: LaTeX is the stand...

Read more »

Using Text Mining to Find Out What @RDataMining Tweets are About

November 8, 2011

This post shows an example of text mining Twitter data with the R packages twitteR, tm and wordcloud. Package twitteR provides access to Twitter data, tm provides functions for text mining, and wordcloud visualizes the result with a word cloud.

Read more »

readGrads – An R package to read and manipulate grads data

November 8, 2011
I created an R package to read grads data. As far as I know, there is no other dedicated package for reading grads data. The package is still quite new; any remarks on the documentation or code are more than welcome.

Read more »

Setting up AWS Cluster to use snow in R

November 8, 2011

I wanted to set up an AWS cluster to take a shot at a Kaggle contest, the DunnHumby Challenge (http://www.kaggle.com/c/dunnhumbychallenge). For this, I found StarCluster to be of great help. It allows you to set up AWS nodes in a few lines of code and does much more (choosing AMIs and cluster configurations)

Read more »

Web Scraping Google Scholar (Partial Success)

November 8, 2011
I wanted to scrape the information returned by a Google Scholar web search into an R data frame as a quick XPath exercise. The following will successfully extract the ‘title’, ‘url’, ‘publication’ and ‘description’. If any of these fields are not available, as in the case of a citation, the corresponding cell in the data

Read more »

Bridge and Torch problem in R

November 8, 2011

A couple of months ago I came across the bridge and torch problem at a careers fair in Oxford. A young tech company called QuBit used it as a brain-teaser challenge for would-be software engineers to solve before submitting …

Read more »