How to access 100M time series in R in under 60 seconds

[This article was first published on Revolutions, and kindly contributed to R-bloggers].

DataMarket, a portal that provides access to more than 14,000 data sets from various public and private sector organizations, has more than 100 million time series available for download and analysis. (Check out this presentation for more info about DataMarket.) And now, with the new rdatamarket package, it's trivially easy to import those time series into R for charting, analysis, or anything else. Here's what you need to do:

  1. Register an account on DataMarket.com (it's free)
  2. Install the rdatamarket package in R with install.packages("rdatamarket")
  3. Browse DataMarket.com for a time series of interest (I found this series on unemployment)
  4. Copy the URL of the page you're on (the short URL works too; I used "http://data.is/qb61uf")
  5. Use the dmseries function with the URL to extract the time series as a zoo object

Here's an example:

> library(rdatamarket)
> dminfo("http://data.is/qb61uf")
Title: "Persons Unemployed 15 weeks or longer, as a percent of the civilian labor force"
Provider: "Federal Reserve Bank of St. Louis" (citing "U.S. Department of Labor: Bureau of Labor Statistics")
Dimensions:
> unemp <- dmseries("http://data.is/qb61uf")
> plot(unemp)
> str(unemp)
‘zoo’ series from Jan 1948 to Jul 2011
  Data: num [1:763, 1] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:763] "1" "2" "3" "4" ...
  ..$ : chr "Persons.Unemployed.15.weeks.or.longer..as.a.percent.of.the.civilian.labor.force"
  Index: Class 'yearmon'  num [1:763] 1948 1948 1948 1948 1948 ...

[Figure: US Unemployment]
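
Because dmseries returns an ordinary zoo object, the usual zoo tools apply to it directly. The sketch below is only an illustration: it builds a synthetic stand-in (unemp_demo, a made-up name with random values) with the same shape as the series above, 763 monthly observations indexed by yearmon from Jan 1948, so that it runs without a DataMarket connection. With the real data you would use unemp in place of unemp_demo.

```r
library(zoo)

# Synthetic stand-in for the object returned by dmseries(): 763 monthly
# values indexed by yearmon, Jan 1948 to Jul 2011. The values are random
# and are NOT the real unemployment figures.
idx <- as.yearmon(1948 + (0:762) / 12)
set.seed(1)
unemp_demo <- zoo(round(runif(763, min = 0.5, max = 4), 1), order.by = idx)

# Keep only the observations from January 2000 onwards
recent <- window(unemp_demo, start = as.yearmon("2000-01"))

# Trailing 12-month moving average (each value averages the current
# month and the 11 months before it)
smooth <- rollmean(unemp_demo, k = 12, align = "right")

# Annual averages: group the monthly index by calendar year
annual <- aggregate(unemp_demo,
                    by = function(i) as.integer(format(i, "%Y")),
                    FUN = mean)

# Overlay the smoothed series on the raw one
plot(unemp_demo, col = "grey", xlab = "", ylab = "Percent")
lines(smooth, col = "red")
```

window, rollmean and aggregate all return zoo objects, so the results can be plotted or merged with other series in the same way as the original.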

With this package, you can go from finding interesting data on DataMarket to working with it in R in less than a minute. With such a wealth of data so easily brought into R, this should be a fantastic tool for data scientists and data journalists.
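
If you plan to pull series from a script rather than the console, it may help to wrap the call so that a missing package or a failed request does not stop the whole run. The following is a minimal sketch under that assumption; fetch_series is a hypothetical helper name, not part of rdatamarket, and the only package function it relies on is dmseries as used above.

```r
# Hypothetical helper (not part of rdatamarket): fetch a series by URL,
# returning NULL instead of stopping if the package is not installed or
# the request fails for any reason.
fetch_series <- function(url) {
  if (!requireNamespace("rdatamarket", quietly = TRUE)) {
    message("rdatamarket is not installed; skipping ", url)
    return(NULL)
  }
  tryCatch(
    rdatamarket::dmseries(url),
    error = function(e) {
      message("Could not fetch ", url, ": ", conditionMessage(e))
      NULL
    }
  )
}

unemp <- fetch_series("http://data.is/qb61uf")

# Only plot when a series actually came back
if (!is.null(unemp)) plot(unemp)
```

Returning NULL keeps the calling code simple: a loop over several URLs can skip the ones that fail and carry on with the rest.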

CRAN: rdatamarket package
