Site icon R-bloggers

Working with hdf files in R – Example: Pathfinder SST data

[This article was first published on me nugget, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.


Following  a question that I posted on stackoverflow.com, I recieved the great advice to use the Bioconductor rhdf5 package to work with HDF5 files. The package is not located on CRAN, but can be sourced from the Bioconductor website:

source("http://bioconductor.org/biocLite.R")
biocLite("rhdf5")


As an example, I use the package to extract Pathfinder sea surface temperature (SST) data, available in netCDF-4 format (the features of netCDF-4 are a subset of the features of HDF5). This type of file is not readable by the netCDF package ncdf.The result is the above plot of a subarea from one of the daily data sets.

To reproduce the figure, you will need the image.scale and val2col functions found on this blog.

To reproduce example:

###required packages
library("rhdf5")
library(maps)
 
###required functions
source("image.scale.R") #http://menugget.blogspot.de/2011/08/adding-scale-to-image-plot.html
source("val2col.R") # http://menugget.blogspot.de/2011/09/converting-values-to-color-levels.html
 
 
###load data
#AVHRR Pathfinder Version 5.2 data (link: "http://www.nodc.noaa.gov/SatelliteData/pathfinder4km/")
fname <- "20120103133752-NODC-L3C_GHRSST-SSTskin-AVHRR_Pathfinder-PFV5.2_NOAA19_G_2012003_day-v02.0-fv01.0.nc" #(from Pathfinder ftp: "ftp://ftp.nodc.noaa.gov/pub/data.nodc/pathfinder/Version5.2/2012/")
 
#List the content of the HDF5 file.
tmp <- h5ls(fname)
tmp
 
lon <- h5read(fname, "lon")
lat <- h5read(fname, "lat")
sst <- h5read(fname, "sea_surface_temperature")
dim(sst)
 
#crop to desired area
lonRan <- c(-140, -60)
latRan <- c(20, 50)
lonKeep <- which(lon > lonRan[1] & lon < lonRan[2])
latKeep <- which(lat> latRan[1] & lat< latRan[2])
 
sst2 <- sst[lonKeep, latKeep, 1]
range(sst2)
sst2 <- replace(sst2, sst2 == -32768, NaN)
range(sst2, na.rm=TRUE)
sst2 <- (sst2 + 273.15) * 0.01 # change from deg Kelvin to deg Celsius and scale by 0.01 (p.45 http://data.nodc.noaa.gov/pathfinder/Version5.2/GDS_TechSpecs_v2.0.pdf)
range(sst2, na.rm=TRUE)
 
lon2 <- lon[lonKeep] # subset of lon values
lat2 <- rev(lat[latKeep]) # subset of lat values , and reverse for increasing values
sst2 <- sst2[,rev(seq(latKeep))] # reverse column order (lat) for increasing values
 
 
###plot
#figure template from "http://menugget.blogspot.de/2013/01/my-template-for-controlling-publication.html"
#Layout of plots
#1 1 3
#1 2 3
LO <- matrix(c(1,2), nrow=1, ncol=2, byrow=TRUE)
LO #double check layout
 
#Resolution, pointsize
RESO <- 400
PS <- 10
 
#Overall units in inches
WIDTHS <- c(5,1) #widths of each figure in layout (i.e. column widths)
HEIGHTS <- c(3.5)  #heights of each figure in layout (i.e. row heights)
 
#Outer margins and calculation of full dimensions
OMA <- c(0,0,1,0)  #Outer margins c(bottom, left, top, right)
HEIGHT <- sum(HEIGHTS) + OMA[1]*PS*1/72 + OMA[3]*PS*1/72
WIDTH <- sum(WIDTHS) + OMA[2]*PS*1/72 + OMA[4]*PS*1/72
 
#Double check full dimensions
WIDTH; HEIGHT 
 
 
#Start plot; open device - run from x11() down to observe behavior; run from pdf() down to produce .pdf
png("sst_usa.png", width=WIDTH, height=HEIGHT, units="in", res=RESO)
#pdf("sst_usa.pdf", width=WIDTH, height=HEIGHT)
#x11(width=WIDTH, height=HEIGHT)
 
par(oma=OMA, ps=PS) #settings before layout
layout(LO, heights=HEIGHTS, widths=WIDTHS)
#layout.show(max(LO)) # run to see layout; comment out to prevent plotting during .pdf
par(cex=1) # layout has the tendency change par()$cex, so this step is important for control
 
#plot 1
par(mar=c(2,2,1,1))
pal <- colorRampPalette(c("blue", "cyan", "yellow", "red"))
image(lon2, lat2, sst2, col=pal(100), xlab="", ylab="")
map("world", add=TRUE, fill=TRUE, col=8)
mtext("Pathfinder SST (2012-01-03)", side=3, line=0.5, =2)
box()
 
#plot 2
par(mar=c(2,0,1,4))
image.scale(sst2, col=pal(100), xlab="", ylab="", axes=FALSE, horiz=FALSE)
axis(4)
mtext("deg [C°]", side=4, line=2.5)
box()
 
dev.off() # closes device



To leave a comment for the author, please follow the link and comment on their blog: me nugget.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.