rgdal + raster + RCurl = My next package

[This article was first published on Steven Mosher's Blog, and kindly contributed to R-bloggers.]

This package has been a long time in the making. In the end it’s more of a data package than a functional package, but pulling all the pieces together required me to learn some really cool packages: raster (which I already knew), rgdal, and RCurl. I’ll provide a little bit of an overview of what comes together in this package and why I built it, and then maybe some future directions.

In the course of looking at climate stations and the question of UHI, I got kinda fascinated with the question of metadata: the variables that describe a station’s location and physical features. The goal of the project was to assemble a somewhat complete set of datasets to describe the geographical conditions of any climate station. I set out some requirements. First and foremost, I wanted to use data that was open and not behind registration walls. That way I could write code to simply go get the data, download it, and unpack it for the package user. I failed. In the end I ended up with a few datasets that require minimal registration headaches. I think with some help from RCurl experts I could tackle most of those issues. We will see. Let’s start by canvassing all the datasets I collected.

1. GHRSST distance from coast dataset.  This 1km resolution file gives you the distance from the coast for all bodies of water.

2. NASA distance from coast. This 1km dataset gives you distance from coast for sea pixels only.

Datasets 1 and 2, along with some other files, can be used to create a 1km land mask. Here I use them to tell whether a station is on the coast or not.

3. Nightlights data. A 1km dataset of radiance calibrated nightlights from DMSP. This is the latest and greatest nightlights data, with a much larger dynamic range than other products.

4. Impervious surface data. Another 1km data product, this one of impervious surfaces.

5. Airports. 40K plus airport locations are rasterized into a 1km raster. When I get time I’ll turn this into a distance map (one call in raster) so that each cell contains the distance to the nearest airport.

6. Harmonized Land Use data. The package builds a raster brick from 7 land use files at 5 minute resolution: urban land, cultivated land, irrigated land, rain irrigated land, forest, sparse vegetation, grassland.

7. Bluewater irrigation in 5 minute resolution: The amount of bluewater used for irrigation on the croplands in the grid.

8. Modis urban extent. 500 meter data. This file requires registration and permission from the PI. It’s freely given, and I really wanted to use the new Modis data.

9. Landcover. This is a previous Modis product that also requires registration, but it’s painless. Also cool because the data is in 72 tiles.

10. Hyde population. Historical population density in 5 minute resolution. Also rural population counts and urban population counts. This requires registration.

11. GPW population. Gridded Population of the World in 2.5 minute resolution. Requires registration.

12. Grump urban extent. Urban extent at 1km resolution. Requires registration.
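
As a rough sketch of the distance map mentioned for the airports layer: raster's distance() fills every NA cell with the distance to the nearest non-NA cell, so a raster that is NA everywhere except at airport cells becomes a distance-to-airport map in one call. The coarse 10-degree grid and the three airport coordinates below are made up for illustration; they are not the package's data.

```r
library(raster)

# Coarse global grid (10-degree cells) so the example runs quickly;
# the real layer described above is 1km. With no crs given, raster()
# assumes lon/lat (WGS84) for a global extent like this one.
grid <- raster(nrows = 18, ncols = 36,
               xmn = -180, xmx = 180, ymn = -90, ymx = 90)

# Hypothetical airport locations as a two-column (lon, lat) matrix
airports <- cbind(lon = c(-122.4, 2.5, 151.2),
                  lat = c(37.6, 49.0, -33.9))

# Cells containing an airport get the value 1; all other cells stay NA
airport_raster <- rasterize(airports, grid, field = 1)

# One call: each NA cell receives the distance (in meters for lon/lat data)
# to the nearest airport cell; the airport cells themselves get 0
airport_distance <- distance(airport_raster)
```

At 1km resolution over the whole globe the same call would take far longer and far more memory than this toy grid does, so treat the sketch as a statement of the idea rather than a tuned implementation.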


With the exception of Modis, I’m pretty sure that if I were better with RCurl I could get those to download by filling in the registration forms programmatically. Hyde is also easy, except the server likes to send incomplete files, and I should probably work on that a bit. For now, you have to download a few of the files manually.

The package has just a couple of core functions that are tied to these files. The functions are primarily for doing compilations of datasets. The first function, createRasters(), just does the work of reading in files and creating native raster files. For Hyde, ISA, landcover, airports, and Modis, I found it beneficial to reclass some of the datasets and save them as native rasters. That function takes a while to run, but you only run it once. The next function is collateMetadata(). This function takes a dataframe of lat/lons and attaches metadata for every position. Obviously, if you know raster you don’t need any of this, as you can just use the raster “extract” function to pull metadata from any of the assets.
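
For readers who want to see what attaching metadata to a table of lat/lons looks like with plain raster, here is a minimal sketch. It is not the package's collateMetadata(); the layer names, random values, tiny 1-degree extent, and station coordinates are all invented for illustration. It combines a few layers into a brick and calls extract() once, which returns one column per layer.

```r
library(raster)

# Small template grid at 5 arc-minute resolution (12 x 12 cells over 1 degree)
template <- raster(nrows = 12, ncols = 12, xmn = 0, xmx = 1, ymn = 0, ymx = 1)

# Hypothetical layer names; real layers would be read from files on disk
layer_names <- c("urban", "cultivated", "forest")

set.seed(1)
layers <- lapply(layer_names, function(nm) {
  setValues(template, runif(ncell(template)))  # random stand-in values
})

# Combine the single layers into one multi-layer brick
metadata_brick <- brick(stack(layers))
names(metadata_brick) <- layer_names

# Hypothetical stations; extract() expects x (lon) first, then y (lat)
stations <- data.frame(id  = c("st1", "st2"),
                       lon = c(0.25, 0.80),
                       lat = c(0.50, 0.10))

# extract() on a brick returns a matrix with one column per layer
values_at_stations <- raster::extract(metadata_brick,
                                      as.matrix(stations[, c("lon", "lat")]))

# Attach the extracted columns to the station table
station_metadata <- cbind(stations, values_at_stations)
```

The one thing that bites people here is column order: the coordinates go in as lon, lat (x, y), not lat, lon.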

Test code just finished, and it’s headed to CRAN in the morning.

Must sleep



