UPDATE: Ron Broberg has a more definitive explanation of the difference, which indicates that the 5 sigma issue is not the main cause. See his exposition here
A short update. I’m in the process of integrating the Land analysis and the SST analysis into one application. The principal task in front of me is integrating some new capability in the ‘raster’ package. As that effort proceeds I continue to check against prior work and against the accepted ‘standards’. So, I reran the Land analysis and benchmarked against CRU, using the same database, the same anomaly period, and the same CAM criteria. That produced the following:
My approach shows a lot more noise, something not seen in the SST analysis, which matched nicely. Wondering if CRU had done anything else, I reread the paper.
“Each grid-box value is the mean of all available station anomaly values, except that station outliers in excess of five standard deviations are omitted.”
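To make the quoted rule concrete, here is a minimal sketch in R of a grid-box mean with and without the outlier cut. It is not CRU's code and not mine: the function name `gridbox_mean`, the argument `station_sd`, and the numbers are made up for illustration, and I am assuming the cut is applied against each station's own standard deviation for that calendar month.

```r
# Hypothetical helper (not CRU's code): mean of the station anomalies in one
# grid box, optionally omitting values beyond k station standard deviations.
# ASSUMPTION: station_sd holds each station's own standard deviation for the
# calendar month in question.
gridbox_mean <- function(anom, station_sd, k = 5, drop_outliers = TRUE) {
  keep <- !is.na(anom)
  if (drop_outliers) {
    keep <- keep & !is.na(station_sd) & abs(anom) <= k * station_sd
  }
  if (!any(keep)) return(NA_real_)  # no usable stations in this box
  mean(anom[keep])
}

# Made-up values for illustration only (not real station data)
anom       <- c(-1.2, -0.8, -12.5, NA, -1.0)
station_sd <- c( 2.0,  2.1,   2.2, 1.9,  2.0)

mean_all     <- gridbox_mean(anom, station_sd, drop_outliers = FALSE)
mean_trimmed <- gridbox_mean(anom, station_sd, drop_outliers = TRUE)
print(c(all = mean_all, trimmed = mean_trimmed))
```

With these toy numbers a single very cold station pulls the untrimmed box mean well below the trimmed one, which is the kind of gap the rest of this post is about.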
I don’t do that! Curious, I looked at the monthly data:
The month where CRU and I differ the most is Feb 1936.
Let’s look at the whole year of 1936:
 -0.708 -0.303 -0.330 -0.168 -0.082 0.292 0.068 -0.095 0.009 0.032 0.128 -0.296
 -0.328 -2.575 0.136 -0.55 0.612 0.306 1.088 0.74 0.291 -0.252 0.091 0.667
So Feb 1936 sticks out as a big issue.
Turning to the anomaly data for 1936, here is what we see in the UNWEIGHTED anomalies for the entire year:
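If you want to check that claim yourself, a quick sketch in R using the two rows of 1936 values above will do it. The names `row_a` and `row_b` are just placeholders for the first and second row, and I am assuming the twelve values run January through December.

```r
# The two rows of 1936 monthly values, copied from the post.
# ASSUMPTION: the twelve values are in calendar order, Jan through Dec.
row_a <- c(-0.708, -0.303, -0.330, -0.168, -0.082, 0.292,
            0.068, -0.095,  0.009,  0.032,  0.128, -0.296)
row_b <- c(-0.328, -2.575,  0.136, -0.55,   0.612, 0.306,
            1.088,  0.74,   0.291, -0.252,  0.091, 0.667)

# Absolute month-by-month difference between the two series
abs_diff <- abs(row_a - row_b)
names(abs_diff) <- month.abb

# Month with the largest disagreement
worst <- which.max(abs_diff)
print(round(abs_diff, 3))
print(names(worst))
```

The largest gap lands on Feb, at a bit over 2.2, more than double the next largest month.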
Min. 1st Qu. Median Mean 3rd Qu. Max. NA’s
-21.04000 -1.04100 0.22900 0.07023 1.57200 13.75000 31386.00000
When you look at the detailed data, the issue is, for example, some record cold in the US: 5 sigma type weather.
Looking through the data you will find that in the US you have Feb anomalies beyond the 5 sigma mark with some regularity. And if you check Google, of course it was a bitter winter. Just an example below. Much more digging is required here and in other places where the method of tossing out 5 sigma events appears to cause differences (apparently in both directions). So, no conclusions yet, just a curious place to look. More later as time permits. If you’re interested, double check these results.
To leave a comment for the author, please follow the link and comment on their blog: Steven Mosher's Blog