Towards the R package sheldus, Part 1: Natural Disaster Losses in the US in 2012

[This article was first published on Nine Lives, and kindly contributed to R-bloggers.]

The SHELDUS database, short for Spatial Hazard Events and Losses Database for the United States (http://webra.cas.sc.edu/hvri/products/sheldus.aspx), from the University of South Carolina, is a database of human and property losses from natural disasters in the United States. It provides county-level information on property losses, crop losses, injuries and fatalities from 18 different types of natural hazards (hurricanes, droughts, floods, etc.) from about 1960 to the present.

Pros
  • Data is free
  • Inflation-adjusted losses are also available
Cons
  • Downloading the data (approximately 200 MB as of Nov 2013) through the GUI is tedious. There does not seem to be an easy way to download the entire data set at once.
  • Currently only one reference year can be chosen for inflation adjustment. Anyone who wants multiple reference years, or wants to update their data next year, has to download the entire data set again through the clunky GUI!
  • Sharing the data, or an analysis of the entire data set, is not easy because of its size and layout.
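The single-reference-year limitation is easy to work around once nominal (unadjusted) losses and a price index are at hand. What follows is only a minimal sketch under stated assumptions: the column names, the loss values and the index values are all made up for illustration, and `adjust_losses` is a hypothetical helper, not a function from the sheldus code.

```r
# Toy loss records; the column names and values are illustrative only,
# not taken from SHELDUS.
losses <- data.frame(
  year    = c(1990, 2000, 2012),
  nominal = c(100, 100, 100)   # losses in current-year dollars
)

# Placeholder price index keyed by year. Replace with a real index
# series (for example a CPI table); these numbers are made up.
price_index <- c("1990" = 50, "2000" = 80, "2012" = 100)

# Hypothetical helper: rescale nominal losses to the dollars of one
# reference year using the ratio of index values.
adjust_losses <- function(nominal, year, ref_year, index) {
  yr  <- as.character(year)
  ref <- as.character(ref_year)
  stopifnot(all(yr %in% names(index)), ref %in% names(index))
  nominal * index[[ref]] / unname(index[yr])
}

# Adjust to several reference years without downloading anything again.
ref_years <- c(2000, 2012)
for (ref in ref_years) {
  losses[[paste0("loss_", ref)]] <-
    adjust_losses(losses$nominal, losses$year, ref, price_index)
}
print(losses)
```

Keeping the nominal losses and applying the index at analysis time means a new reference year only needs a new index value, not a new download.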
Goal
  • Build an R package which would come with the entire SHELDUS data. 
  • Create functions to display and analyze the data.  
  • [future maybe] Combine this with other disaster damage info (e.g., FEMA’s NFIP: https://github.com/RationShop/nfip).
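As a rough idea of what an analysis function in such a package might look like, here is a small sketch that totals losses by hazard type for the requested years. The data frame, its column names and the function `summarize_losses` are hypothetical stand-ins; the real SHELDUS layout and the eventual package interface may differ.

```r
# Toy stand-in for the packaged data; column names and values are
# hypothetical and do not reflect the real SHELDUS layout.
sheldus_toy <- data.frame(
  year     = c(2011, 2012, 2012, 2012),
  hazard   = c("Flooding", "Hurricane", "Drought", "Flooding"),
  property = c(10, 40, 5, 15),  # property losses, arbitrary units
  crop     = c(1, 2, 30, 3),    # crop losses, arbitrary units
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the requested years and hazards, then
# total the property and crop losses per hazard type.
summarize_losses <- function(data, years, hazards = unique(data$hazard)) {
  keep <- data$year %in% years & data$hazard %in% hazards
  aggregate(cbind(property, crop) ~ hazard, data = data[keep, ], FUN = sum)
}

loss_2012 <- summarize_losses(sheldus_toy, years = 2012)
print(loss_2012)
```

Because `years` and `hazards` are matched with `%in%`, the same call accepts several years or hazard types at once, e.g. `summarize_losses(sheldus_toy, years = 2011:2012, hazards = "Flooding")`.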
Status/TODOs
  • Most of the code for cleaning and formatting the data, IO and graphics is ready.
  • Instead of plain text, I store the data in a binary format (plus a few tricks), which reduces its size from ~200 MB to 30 MB!
  • [TODO] Ability to retrieve data for multiple years/perils at once.
  • [TODO] Code for inflation adjustment.
  • [TODO] Presidential Disaster Declarations data and data prior to 1960 need to be included.
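To illustrate the size point in the list above, the sketch below writes the same toy table as CSV and as a compressed binary file and compares the two on disk. The post does not say which binary format or tricks are used, so `saveRDS()` with xz compression is an assumption here, and the toy data only stands in for the real table.

```r
# Toy table with repetitive values, standing in for the real data.
set.seed(1)
toy <- data.frame(
  year   = sample(1960:2012, 1e4, replace = TRUE),
  hazard = sample(c("Flooding", "Hurricane", "Drought"), 1e4, replace = TRUE),
  loss   = round(runif(1e4, 0, 1e6), 2)
)

csv_file <- tempfile(fileext = ".csv")
rds_file <- tempfile(fileext = ".rds")

# Plain text versus compressed binary serialization.
write.csv(toy, csv_file, row.names = FALSE)
saveRDS(toy, rds_file, compress = "xz")

# Compare the file sizes in bytes.
sizes <- file.size(c(csv_file, rds_file))
names(sizes) <- c("csv", "rds")
print(sizes)

# The binary file restores the exact same object, types included.
restored <- readRDS(rds_file)
stopifnot(identical(toy, restored))
```

How much space is saved depends on the data; tables with many repeated codes and names tend to compress well.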
I will be revising the code towards building a complete R package. Along the way I will run several QA/QC checks and post my analyses.
In this first post, I look at losses in 2012. All the graphics and code are available at my GitHub site: https://github.com/RationShop/sheldus

Any help or comments are appreciated.
