Cherry Picking to Generalize ~ NASA Global Temperature Trends ~ enhanced w/ ggplot2

[This article was first published on mind of a Markov chain » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In a prior article, I tried to visualize the linear global temperatures trends for a grid of start and end years. The visual I created was confusing in that the specification of color scale was interdependent with the data values. I wanted a blue -> white -> red scale of the temperatures indicating cool -> neutral -> warm temperatures, with the extremums using the same “darkness” (e.g. +3C will be rgb(1,0,0) and -3C will be rgb(0,0,1)). But the maximum temperatures were higher than the minimum, so I had to artificially include the negative of the maximum temp. in the lower triangle of the data (or else 0C won’t be white).

I have recreated this plot in the ggplot2 package because of the handy scale_colour_gradient2 function to set the color scale, and the ability to elegantly extend the features of the graph by calling the existing data frame.

The graph now adds identification of statistically insignificant temperature slopes marked as “x”. After setting the color scale, getting rid of the extra space in the panel, adding a stair graph (geom_step) and changing colors of the panel and background, I have the following:

None of the blue colored boxes are statistically significant at the 5% level.

code:

 # from NASA GISS
# http://data.giss.nasa.gov/gistemp/graphs/Fig.A2.txt
GISTEMP <- read.csv("Global Annual Mean Surface Air Temperature Change.csv")

head(GISTEMP, 2)
#   Year Annual_Mean X5_Year_Mean
# 1 1880       -0.28           NA
# 2 1881       -0.21           NA

# take lm() of starting point vs. end point back to 1950
begin = 1970
finish = 2009
min.lag = 4 # minimum lag years between begin and finish

lm.len <- length(begin:finish) - min.lag
lm.yr <- subset(GISTEMP, (Year >= begin) & (Year <= finish), select = Year)
lm.dex <- which(GISTEMP[,1] == begin):which(GISTEMP[,1] == finish)
lm.temp <- matrix(NA, lm.len, lm.len) # slope of temp
lm.p <- matrix(NA, lm.len, lm.len) # p-value of slope
rownames(lm.temp) <- begin:(finish-min.lag)
colnames(lm.temp) <- (begin+min.lag):finish

for (i in 1:lm.len) {
	for (j in (i + min.lag):(lm.len+min.lag)) {
		lm.dat <- lm(GISTEMP[lm.dex[i:j],2] ~ GISTEMP[lm.dex[i:j],1])
		lm.temp[i,(j-min.lag)] <- lm.dat$coeff[2] # slope parameter!
		lm.p[i,(j-min.lag)] <- summary(lm.dat)$coeff[2,4] # p-value of slope!
	}
}

library(ggplot2)

lm.sig <- lm.p < .05 # which are significant

lm.ggframe <- data.frame(expand.grid(Start = as.numeric(rownames(lm.temp)),
			End = as.numeric(colnames(lm.temp))), Temp = as.vector(lm.temp), Sig = as.vector(lm.sig))

max.temp <- max(as.vector(lm.temp), na.rm = T)
min.temp <- abs(min(as.vector(lm.temp), na.rm = T))

# set data, x-y axis
p <- ggplot(lm.ggframe, aes(x=Start, y=End))
# set temps
p <- p + geom_tile(aes(fill = Temp), colour = "black")
# set colors of gradient
p <- p + scale_fill_gradient2(low = rgb(0,0,min.temp/max.temp), mid = "white", high = rgb(1,0,0),
					name = "TemperaturenSlope")
# get rid of extra space and populate axis labels
p <- p + scale_x_continuous(expand = c(0,.2), breaks = begin:(finish-min.lag)) +
	scale_y_continuous(expand = c(0,.2), breaks = (begin+min.lag):finish)
# rotate appropriately
p <- p + opts(axis.text.x=theme_text(angle=-90, hjust=0), axis.text.y = theme_text(angle=0, hjust = 1),
	title = paste("NASA Global Temp Slope (Celsius/year) with Lag =", min.lag, "years"))
# change color of grid and panel background
p <- p + opts(panel.background = theme_rect(fill = "white", size = 2),
	panel.grid.major = theme_line(colour = "grey80", linetype = "dotted", size = .5))
# add stairs to make white clearer along diagonal
p <- p + geom_step(aes(x = begin:(finish-min.lag)-.5, y = (begin+min.lag):(finish)-.5))
# overlay statistical significance
p <- p + geom_point(aes(x = Start, y = End, shape = Sig), data = subset(lm.ggframe, Sig == F)) +
	scale_shape(name = "Sig. Levelnp < .05") + scale_shape_manual(value=c(4))

p


Filed under: Climate Change, ggplot2, R

To leave a comment for the author, please follow the link and comment on their blog: mind of a Markov chain » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)