CRU Data in RghcnV3

July 31, 2011

(This article was first published on Steven Mosher's Blog, and kindly contributed to R-bloggers)

As many have noted CRU have posted their data a bit ago and the usual gang has started to put it through the various “engines” for calculating a global temperature index. Thanks to others who worked this problem ahead of me getting the data in Rghcnv3 was not that hard, but it was not without it’s challenges given the format. On the CRU site the format consists of a file with two types of  ”lines” in it. Header lines that give a station metadata, followed by data “lines” that contain the data for the station.  That entailed doing some ugly things to read the data in, so I tried to make the ugliness somewhat clearer.  The code:

readCruStations <- function(filename=”stations-data.txt”, output = c(“Array”,”Zoo”,”Mts”)){

if (length(output) > 1) {
warning(“Select One of either V3 or ARRAY or Mts. Using first element by default”)
returnType <- output[1]
} else {
returnType <- output
X <- readLines(“station-data.txt”)

# suck out the headers and work on them to make an inventory
dex <- which(nchar(X[]) == 72)
headers <- X[dex]

#  build an inventory that RghcnV3 likes!

Inv <- data.frame(Id = as.numeric(substr(headers,1,6)),
Lat = as.numeric(substr(headers,7,10)),
Lon = as.numeric(substr(headers,11,15)),
Altitude = as.numeric(substr(headers,16,20)),
Name = substr(headers,22,42),
Country = substr(headers,43,56),
StartYear = as.numeric(substr(headers,57,60)),
EndYear = as.numeric(substr(headers,61,64)),
Source = as.numeric(substr(headers,67,68)),
GoodFirst = as.numeric(substr(headers,69,72)))

Inv$Lat[Inv$Lat == -999] <- NA
Inv$Lon[Inv$Lon == -1999] <- NA
Inv$Lat <- Inv$Lat / 10
Inv$Lon <- Inv$Lon / 10

Inv$Altitude[Inv$Altitude == -999] <- NA

begin <- min(Inv$StartYear)
end <- max(Inv$EndYear)
Years <- begin:end
stations <- nrow(Inv)

##  build a data array that RghcnV3 likes !
DATA <- array(NA,dim=c(stations,12,length(Years)))

stationCount <- 0
for(line in 1:length(X)){
if (nchar(X[line]) == 72){
# we have a header. increment station counter ( its an array index)
stationCount <- stationCount + 1
} else {
# we have a data record.  use the regular expression  (” +”) to split the string into numbers
thisLine <- as.numeric(unlist(strsplit(X[line],” +”)))
thisLine[thisLine == -999] <-NA
thisYear <- thisLine[1] – begin + 1
DATA[stationCount,,thisYear] <- thisLine[2:13]/10

## Prepare for output
if (returnType == “Mts” | returnType == “Zoo”){
DATA <- apply(DATA,MARGIN = 1, FUN = c)
DATA <- ts(DATA, start = minYear, frequency = 12)
if (returnType == “Zoo”) DATA <- asZoo(DATA)
if (returnType == “Mts”){
return(list(Inventory = Inv,Mts = DATA))

if (returnType == “Zoo”){
return(list(Inventory = Inv,Zoo = DATA))
if (returnType == “Array”){
return(list(Inventory = Inv,Array = DATA))



Of course, then it is a simple matter to load that data into a raster and compute the area averaged global temperature:  What’s it show? Given CRU’s source data and given an algorithm that is patterned after their’s, They are cooler. Perhaps, they posted only the source data and not any adjustments they do.  These results match Nick Stokes who is also a bit warmer than CRU.

To leave a comment for the author, please follow the link and comment on their blog: Steven Mosher's Blog. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...


Comments are closed.


Mango solutions

RStudio homepage

Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training


CRC R books series

Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)