How to access 100M time series in R in under 60 seconds

August 25, 2011

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

DataMarket, a portal that provides access to more than 14,000 data sets from various public and private sector organizations, has more than 100 million time series available for download and analysis. (Check out this presentation for more info about DataMarket.) And now, with the new package rdatamarket, it's trivially easy to import those time series into R for charting, analysis, or any other use. Here's what you need to do:

  1. Register an account on DataMarket.com (it's free)
  2. Install the rdatamarket package in R with install.packages("rdatamarket")
  3. Browse DataMarket.com for a time series of interest (I found this series on unemployment)
  4. Copy the URL of the page you're on (the short URL works too; I used "http://data.is/qb61uf")
  5. Use the dmseries function with the URL to extract the time series as a zoo object

Here's an example:

> library(rdatamarket)
> dminfo("http://data.is/qb61uf")
Title: "Persons Unemployed 15 weeks or longer, as a percent of the civilian labor force"
Provider: "Federal Reserve Bank of St. Louis" (citing "U.S. Department of Labor: Bureau of Labor Statistics")
Dimensions:
> unemp <- dmseries("http://data.is/qb61uf")
> plot(unemp)
> str(unemp)
‘zoo’ series from Jan 1948 to Jul 2011
  Data: num [1:763, 1] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:763] "1" "2" "3" "4" ...
  ..$ : chr "Persons.Unemployed.15.weeks.or.longer..as.a.percent.of.the.civilian.labor.force"
  Index: Class 'yearmon'  num [1:763] 1948 1948 1948 1948 1948 ...


US Unemployment

With this package, you can go from finding interesting data on DataMarket to working with it in R in less than a minute. With such a wealth of data so easily brought within reach of R, this will be a fantastic tool for data scientists and data journalists.
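
Once a series is in R, it behaves like any other zoo object. The following is a minimal sketch of a few common follow-up steps, assuming only that the zoo package is installed. It does not contact DataMarket: it builds a synthetic stand-in series called demo, shaped like the str() output above (one column, yearmon index). The values and the column name synthetic.series are made up for illustration and are not real unemployment figures.

```r
library(zoo)

# Synthetic stand-in for the object returned by dmseries():
# 36 monthly observations, Jan 2000 to Dec 2002. NOT real data.
idx  <- as.yearmon(2000 + (0:35) / 12)
vals <- matrix(seq(1, by = 0.1, length.out = 36), ncol = 1,
               dimnames = list(NULL, "synthetic.series"))
demo <- zoo(vals, order.by = idx)

# Keep only the observations that fall in calendar year 2001
recent <- window(demo,
                 start = as.yearmon("2001-01"),
                 end   = as.yearmon("2001-12"))

# Collapse the monthly series to annual means, grouping by calendar year
annual <- aggregate(demo,
                    by  = function(i) floor(as.numeric(i)),
                    FUN = mean)

# Flatten to a plain data frame for use with non-zoo tools
demo_df <- data.frame(month = index(demo),
                      value = as.numeric(coredata(demo)))

print(annual)
head(demo_df)
```

With a real download, the same calls should apply by replacing demo with the object returned by dmseries(), for example unemp from the session above.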

CRAN: rdatamarket package
