I’d be more than happy with the unlinked data web

April 14, 2010
(This article was first published on What You're Doing Is Rather Desperate » R, and kindly contributed to R-bloggers)

Visit the URL used in the code below and you’ll find a perfectly formatted CSV file containing information about recent earthquakes. A nice feature of R is the ability to slurp such a URL straight into a data frame:

# read the CSV straight from the URL into a data frame
quakes <- read.csv("http://neic.usgs.gov/neis/gis/qed.asc", header = TRUE)
colnames(quakes)
# [1] "Date"      "TimeUTC"   "Latitude"  "Longitude" "Magnitude" "Depth"
# number of recent quakes
nrow(quakes)
# [1] 3135
# biggest recent quake
subset(quakes, Magnitude == max(Magnitude, na.rm = TRUE))
#            Date    TimeUTC Latitude Longitude Magnitude Depth
# 2060 2010/02/27 06:34:14.0  -35.993   -72.828       8.8    35
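
Remote files can move or disappear, so here is a minimal sketch of the same idea that also runs offline. `read_quakes` is a hypothetical helper (not part of the original post) that wraps `read.csv` and returns `NULL` instead of stopping when the source cannot be read. The first sample row is the quake shown above; the other two rows are invented placeholders that only illustrate the column layout.

```r
# Hypothetical helper: pass arguments through to read.csv, but return NULL
# (with a message) instead of stopping if the source cannot be read.
read_quakes <- function(...) {
  tryCatch(
    read.csv(..., header = TRUE, stringsAsFactors = FALSE),
    error = function(e) {
      message("Could not read data: ", conditionMessage(e))
      NULL
    }
  )
}

# Offline stand-in for the remote file. Row 1 is taken from the output above;
# rows 2 and 3 are made-up placeholders (row 3 has a missing magnitude).
sample_csv <- "Date,TimeUTC,Latitude,Longitude,Magnitude,Depth
2010/02/27,06:34:14.0,-35.993,-72.828,8.8,35
2010/01/01,00:00:00.0,10.000,20.000,5.1,10
2010/01/02,00:00:00.0,-10.000,-20.000,,15"

quakes <- read_quakes(text = sample_csv)

# Parse the date column so rows can be filtered or sorted by date
quakes$Date <- as.Date(quakes$Date, format = "%Y/%m/%d")

# Biggest quake in the sample; na.rm = TRUE skips the missing magnitude
biggest <- subset(quakes, Magnitude == max(Magnitude, na.rm = TRUE))
```

To try it against a live source, pass a URL or file path as the first argument instead of `text =`, for example `read_quakes("http://neic.usgs.gov/neis/gis/qed.asc")`, and check for `NULL` before using the result.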

I hear a lot about the “web of data” and the “linked data web” but honestly, I’ll be happy the day people start posting data as delimited, plain text instead of HTML and PDF files.


