Cherry Picking to Generalize ~ NASA Global Temperature Trends ~ enhanced w/ ggplot2

April 12, 2010
By

(This article was first published on mind of a Markov chain » R, and kindly contributed to R-bloggers)

In a prior article, I tried to visualize the linear global temperatures trends for a grid of start and end years. The visual I created was confusing in that the specification of color scale was interdependent with the data values. I wanted a blue -> white -> red scale of the temperatures indicating cool -> neutral -> warm temperatures, with the extremums using the same “darkness” (e.g. +3C will be rgb(1,0,0) and -3C will be rgb(0,0,1)). But the maximum temperatures were higher than the minimum, so I had to artificially include the negative of the maximum temp. in the lower triangle of the data (or else 0C won’t be white).

I have recreated this plot in the ggplot2 package because of the handy scale_colour_gradient2 function to set the color scale, and the ability to elegantly extend the features of the graph by calling the existing data frame.

The graph now adds identification of statistically insignificant temperature slopes marked as “x”. After setting the color scale, getting rid of the extra space in the panel, adding a stair graph (geom_step) and changing colors of the panel and background, I have the following:

None of the blue colored boxes are statistically significant at the 5% level.

code:

 # from NASA GISS
# http://data.giss.nasa.gov/gistemp/graphs/Fig.A2.txt
GISTEMP <- read.csv("Global Annual Mean Surface Air Temperature Change.csv")

head(GISTEMP, 2)
#   Year Annual_Mean X5_Year_Mean
# 1 1880       -0.28           NA
# 2 1881       -0.21           NA

# take lm() of starting point vs. end point back to 1950
begin = 1970
finish = 2009
min.lag = 4 # minimum lag years between begin and finish

lm.len <- length(begin:finish) - min.lag
lm.yr <- subset(GISTEMP, (Year >= begin) & (Year <= finish), select = Year)
lm.dex <- which(GISTEMP[,1] == begin):which(GISTEMP[,1] == finish)
lm.temp <- matrix(NA, lm.len, lm.len) # slope of temp
lm.p <- matrix(NA, lm.len, lm.len) # p-value of slope
rownames(lm.temp) <- begin:(finish-min.lag)
colnames(lm.temp) <- (begin+min.lag):finish

for (i in 1:lm.len) {
	for (j in (i + min.lag):(lm.len+min.lag)) {
		lm.dat <- lm(GISTEMP[lm.dex[i:j],2] ~ GISTEMP[lm.dex[i:j],1])
		lm.temp[i,(j-min.lag)] <- lm.dat$coeff[2] # slope parameter!
		lm.p[i,(j-min.lag)] <- summary(lm.dat)$coeff[2,4] # p-value of slope!
	}
}

library(ggplot2)

lm.sig <- lm.p < .05 # which are significant

lm.ggframe <- data.frame(expand.grid(Start = as.numeric(rownames(lm.temp)),
			End = as.numeric(colnames(lm.temp))), Temp = as.vector(lm.temp), Sig = as.vector(lm.sig))

max.temp <- max(as.vector(lm.temp), na.rm = T)
min.temp <- abs(min(as.vector(lm.temp), na.rm = T))

# set data, x-y axis
p <- ggplot(lm.ggframe, aes(x=Start, y=End))
# set temps
p <- p + geom_tile(aes(fill = Temp), colour = "black")
# set colors of gradient
p <- p + scale_fill_gradient2(low = rgb(0,0,min.temp/max.temp), mid = "white", high = rgb(1,0,0),
					name = "TemperaturenSlope")
# get rid of extra space and populate axis labels
p <- p + scale_x_continuous(expand = c(0,.2), breaks = begin:(finish-min.lag)) +
	scale_y_continuous(expand = c(0,.2), breaks = (begin+min.lag):finish)
# rotate appropriately
p <- p + opts(axis.text.x=theme_text(angle=-90, hjust=0), axis.text.y = theme_text(angle=0, hjust = 1),
	title = paste("NASA Global Temp Slope (Celsius/year) with Lag =", min.lag, "years"))
# change color of grid and panel background
p <- p + opts(panel.background = theme_rect(fill = "white", size = 2),
	panel.grid.major = theme_line(colour = "grey80", linetype = "dotted", size = .5))
# add stairs to make white clearer along diagonal
p <- p + geom_step(aes(x = begin:(finish-min.lag)-.5, y = (begin+min.lag):(finish)-.5))
# overlay statistical significance
p <- p + geom_point(aes(x = Start, y = End, shape = Sig), data = subset(lm.ggframe, Sig == F)) +
	scale_shape(name = "Sig. Levelnp < .05") + scale_shape_manual(value=c(4))

p


Filed under: Climate Change, ggplot2, R

To leave a comment for the author, please follow the link and comment on his blog: mind of a Markov chain » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , ,

Comments are closed.