Working with hdf files in R – Example: Pathfinder SST data

November 8, 2013
By

(This article was first published on me nugget, and kindly contributed to R-bloggers)



Following  a question that I posted on stackoverflow.com, I recieved the great advice to use the Bioconductor rhdf5 package to work with HDF5 files. The package is not located on CRAN, but can be sourced from the Bioconductor website:

source("http://bioconductor.org/biocLite.R")
biocLite("rhdf5")
Created by Pretty R at inside-R.org

As an example, I use the package to extract Pathfinder sea surface temperature (SST) data, available in netCDF-4 format (the features of netCDF-4 are a subset of the features of HDF5). This type of file is not readable by the netCDF package ncdf.The result is the above plot of a subarea from one of the daily data sets.

To reproduce the figure, you will need the image.scale and val2col functions found on this blog.

To reproduce example:

###required packages
library("rhdf5")
library(maps)
 
###required functions
source("image.scale.R") #http://menugget.blogspot.de/2011/08/adding-scale-to-image-plot.html
source("val2col.R") # http://menugget.blogspot.de/2011/09/converting-values-to-color-levels.html
 
 
###load data
#AVHRR Pathfinder Version 5.2 data (link: "http://www.nodc.noaa.gov/SatelliteData/pathfinder4km/")
fname <- "20120103133752-NODC-L3C_GHRSST-SSTskin-AVHRR_Pathfinder-PFV5.2_NOAA19_G_2012003_day-v02.0-fv01.0.nc" #(from Pathfinder ftp: "ftp://ftp.nodc.noaa.gov/pub/data.nodc/pathfinder/Version5.2/2012/")
 
#List the content of the HDF5 file.
tmp <- h5ls(fname)
tmp
 
lon <- h5read(fname, "lon")
lat <- h5read(fname, "lat")
sst <- h5read(fname, "sea_surface_temperature")
dim(sst)
 
#crop to desired area
lonRan <- c(-140, -60)
latRan <- c(20, 50)
lonKeep <- which(lon > lonRan[1] & lon < lonRan[2])
latKeep <- which(lat> latRan[1] & lat< latRan[2])
 
sst2 <- sst[lonKeep, latKeep, 1]
range(sst2)
sst2 <- replace(sst2, sst2 == -32768, NaN)
range(sst2, na.rm=TRUE)
sst2 <- (sst2 + 273.15) * 0.01 # change from deg Kelvin to deg Celsius and scale by 0.01 (p.45 http://data.nodc.noaa.gov/pathfinder/Version5.2/GDS_TechSpecs_v2.0.pdf)
range(sst2, na.rm=TRUE)
 
lon2 <- lon[lonKeep] # subset of lon values
lat2 <- rev(lat[latKeep]) # subset of lat values , and reverse for increasing values
sst2 <- sst2[,rev(seq(latKeep))] # reverse column order (lat) for increasing values
 
 
###plot
#figure template from "http://menugget.blogspot.de/2013/01/my-template-for-controlling-publication.html"
#Layout of plots
#1 1 3
#1 2 3
LO <- matrix(c(1,2), nrow=1, ncol=2, byrow=TRUE)
LO #double check layout
 
#Resolution, pointsize
RESO <- 400
PS <- 10
 
#Overall units in inches
WIDTHS <- c(5,1) #widths of each figure in layout (i.e. column widths)
HEIGHTS <- c(3.5) #heights of each figure in layout (i.e. row heights)
 
#Outer margins and calculation of full dimensions
OMA <- c(0,0,1,0) #Outer margins c(bottom, left, top, right)
HEIGHT <- sum(HEIGHTS) + OMA[1]*PS*1/72 + OMA[3]*PS*1/72
WIDTH <- sum(WIDTHS) + OMA[2]*PS*1/72 + OMA[4]*PS*1/72
 
#Double check full dimensions
WIDTH; HEIGHT
 
 
#Start plot; open device - run from x11() down to observe behavior; run from pdf() down to produce .pdf
png("sst_usa.png", width=WIDTH, height=HEIGHT, units="in", res=RESO)
#pdf("sst_usa.pdf", width=WIDTH, height=HEIGHT)
#x11(width=WIDTH, height=HEIGHT)
 
par(oma=OMA, ps=PS) #settings before layout
layout(LO, heights=HEIGHTS, widths=WIDTHS)
#layout.show(max(LO)) # run to see layout; comment out to prevent plotting during .pdf
par(cex=1) # layout has the tendency change par()$cex, so this step is important for control
 
#plot 1
par(mar=c(2,2,1,1))
pal <- colorRampPalette(c("blue", "cyan", "yellow", "red"))
image(lon2, lat2, sst2, col=pal(100), xlab="", ylab="")
map("world", add=TRUE, fill=TRUE, col=8)
mtext("Pathfinder SST (2012-01-03)", side=3, line=0.5, font=2)
box()
 
#plot 2
par(mar=c(2,0,1,4))
image.scale(sst2, col=pal(100), xlab="", ylab="", axes=FALSE, horiz=FALSE)
axis(4)
mtext("deg [C°]", side=4, line=2.5)
box()
 
dev.off() # closes device
Created by Pretty R at inside-R.org


To leave a comment for the author, please follow the link and comment on his blog: me nugget.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.