Webscraping using readLines and RCurl
There is a massive amount of data available on the web. Some of it is in the form of precompiled, downloadable datasets which are easy to access. But the majority of online data exists as web content such as blogs, news stories and cooking recipes. With precompiled files, accessing the data is fairly straightforward; just download the file, unzip if necessary, and import into R. For “wild” data however, getting the data into an analyzeable format is more difficult. Accessing online data of this sort is sometimes reffered to as “webscraping”.
To leave a comment
for the author, please follow the link and comment on his blog: Programming R
offers daily e-mail updates
news and tutorials
on topics such as: visualization (ggplot2
), programming (RStudio
, Web Scraping
) statistics (regression
, time series
) and more...
If you got this far, why not subscribe for updates
from the site? Choose your flavor: e-mail
, or facebook