Working with hdf files in R – Example: Pathfinder SST data

November 8, 2013

(This article was first published on me nugget, and kindly contributed to R-bloggers)

Following  a question that I posted on, I recieved the great advice to use the Bioconductor rhdf5 package to work with HDF5 files. The package is not located on CRAN, but can be sourced from the Bioconductor website:


As an example, I use the package to extract Pathfinder sea surface temperature (SST) data, available in netCDF-4 format (the features of netCDF-4 are a subset of the features of HDF5). This type of file is not readable by the netCDF package ncdf.The result is the above plot of a subarea from one of the daily data sets.

To reproduce the figure, you will need the image.scale and val2col functions found on this blog.

To reproduce example:

###required packages
###required functions
source("image.scale.R") #
source("val2col.R") #
###load data
#AVHRR Pathfinder Version 5.2 data (link: "")
fname <- "" #(from Pathfinder ftp: "")
#List the content of the HDF5 file.
tmp <- h5ls(fname)
lon <- h5read(fname, "lon")
lat <- h5read(fname, "lat")
sst <- h5read(fname, "sea_surface_temperature")
#crop to desired area
lonRan <- c(-140, -60)
latRan <- c(20, 50)
lonKeep <- which(lon > lonRan[1] & lon < lonRan[2])
latKeep <- which(lat> latRan[1] & lat< latRan[2])
sst2 <- sst[lonKeep, latKeep, 1]
sst2 <- replace(sst2, sst2 == -32768, NaN)
range(sst2, na.rm=TRUE)
sst2 <- (sst2 + 273.15) * 0.01 # change from deg Kelvin to deg Celsius and scale by 0.01 (p.45
range(sst2, na.rm=TRUE)
lon2 <- lon[lonKeep] # subset of lon values
lat2 <- rev(lat[latKeep]) # subset of lat values , and reverse for increasing values
sst2 <- sst2[,rev(seq(latKeep))] # reverse column order (lat) for increasing values
#figure template from ""
#Layout of plots
#1 1 3
#1 2 3
LO <- matrix(c(1,2), nrow=1, ncol=2, byrow=TRUE)
LO #double check layout
#Resolution, pointsize
RESO <- 400
PS <- 10
#Overall units in inches
WIDTHS <- c(5,1) #widths of each figure in layout (i.e. column widths)
HEIGHTS <- c(3.5) #heights of each figure in layout (i.e. row heights)
#Outer margins and calculation of full dimensions
OMA <- c(0,0,1,0) #Outer margins c(bottom, left, top, right)
HEIGHT <- sum(HEIGHTS) + OMA[1]*PS*1/72 + OMA[3]*PS*1/72
WIDTH <- sum(WIDTHS) + OMA[2]*PS*1/72 + OMA[4]*PS*1/72
#Double check full dimensions
#Start plot; open device - run from x11() down to observe behavior; run from pdf() down to produce .pdf
png("sst_usa.png", width=WIDTH, height=HEIGHT, units="in", res=RESO)
#pdf("sst_usa.pdf", width=WIDTH, height=HEIGHT)
#x11(width=WIDTH, height=HEIGHT)
par(oma=OMA, ps=PS) #settings before layout
layout(LO, heights=HEIGHTS, widths=WIDTHS) # run to see layout; comment out to prevent plotting during .pdf
par(cex=1) # layout has the tendency change par()$cex, so this step is important for control
#plot 1
pal <- colorRampPalette(c("blue", "cyan", "yellow", "red"))
image(lon2, lat2, sst2, col=pal(100), xlab="", ylab="")
map("world", add=TRUE, fill=TRUE, col=8)
mtext("Pathfinder SST (2012-01-03)", side=3, line=0.5, font=2)
#plot 2
image.scale(sst2, col=pal(100), xlab="", ylab="", axes=FALSE, horiz=FALSE)
mtext("deg [C°]", side=4, line=2.5)
box() # closes device

To leave a comment for the author, please follow the link and comment on their blog: me nugget. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)