Scraping tables from HTML web pages with CloudStat

January 12, 2012

(This article was first published on CloudStat, and kindly contributed to R-bloggers)

You need data from the internet, but you don't want to type it in by hand. If you know the web URL, you can extract (scrape) the data directly.

Thanks go to the XML package for R, which provides the very handy readHTMLTable() function.
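
As a minimal sketch of what readHTMLTable() does, the snippet below parses a small HTML fragment held in a string rather than a live URL. The fragment, its column names, and its values are invented purely for illustration; they are not the airline or chess data.

```r
library(XML)

# Hypothetical HTML fragment standing in for a real web page
html_text <- "<html><body>
<table>
  <tr><th>Name</th><th>Score</th></tr>
  <tr><td>Alpha</td><td>10</td></tr>
  <tr><td>Beta</td><td>20</td></tr>
</table>
</body></html>"

# Parse the string as HTML (asText = TRUE: the argument is content, not a file name)
doc <- htmlParse(html_text, asText = TRUE)

# Extract the first table as a data frame, using its first row as column names
demo.table <- readHTMLTable(doc, header = TRUE, which = 1, stringsAsFactors = FALSE)
print(demo.table)
```

Passing a URL string instead of a parsed document, as in the examples that follow, works the same way from the caller's point of view.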

As a case study, I want to scrape two data sets:

  1. US Airline Customer Score.
  2. World Top Chess Players (Men).

A. Scraping the US Airline Customer Score table


library(XML)
airline = ''  # set this to the URL of the page that holds the table
airline.table = readHTMLTable(airline, header=TRUE, which=1, stringsAsFactors=FALSE)


B. Scraping the World Top Chess Players (Men) table


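chess = ''  # set this to the URL of the page that holds the table
# which=5 selects the fifth table on the page
chess.table = readHTMLTable(chess, header=TRUE, which=5, stringsAsFactors=FALSE)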
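
The which argument picks one table by its position on the page (1 for the airline page, 5 for the chess page). One possible way to find that position, sketched here on a made-up two-table fragment, is to count the rows in every table and read the largest one. The fragment and the "largest table" rule are assumptions for illustration; on a real page you may need to inspect the tables by eye.

```r
library(XML)

# Hypothetical page: a small navigation table followed by the data table
html_text <- "<html><body>
<table>
  <tr><td>Home</td></tr>
  <tr><td>About</td></tr>
</table>
<table>
  <tr><th>Rank</th><th>Player</th></tr>
  <tr><td>1</td><td>A</td></tr>
  <tr><td>2</td><td>B</td></tr>
</table>
</body></html>"

doc <- htmlParse(html_text, asText = TRUE)

# Collect every <table> node in document order
tables <- getNodeSet(doc, "//table")

# Count the <tr> rows inside each table
n.rows <- sapply(tables, function(tbl) length(getNodeSet(tbl, ".//tr")))

# Assume the table with the most rows is the one we want
idx <- which.max(n.rows)

# Read only that table
player.table <- readHTMLTable(doc, header = TRUE, which = idx, stringsAsFactors = FALSE)
print(player.table)
```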


Done. You have now successfully scraped data from web pages with CloudStat.

You can get the full version of this case study (code and results) at Scraping table from html web.

Then you can analyze the data as usual. Great! No more retyping data. Enjoy!
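
One thing to watch before analyzing: with stringsAsFactors=FALSE and no colClasses, readHTMLTable() returns every column as character, so numeric columns usually need converting first. Here is a small sketch using a hand-built data frame shaped like scraped output; the column names and values are hypothetical.

```r
# Hypothetical data frame mimicking scraped output: all columns are character
scraped <- data.frame(
  Airline = c("A", "B", "C"),
  Score   = c("80", "70", "60"),
  stringsAsFactors = FALSE
)

# Convert the score column from character to numeric
scraped$Score <- as.numeric(scraped$Score)

# Now the usual numeric summaries work
avg.score <- mean(scraped$Score)
print(avg.score)
print(summary(scraped$Score))
```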
