[This article was first published on Steven Mosher's Blog, and kindly contributed to R-bloggers].

UPDATE: Ron Broberg has a more definitive explanation of the difference, which indicates that the 5 sigma issue is not the main cause of the difference. See his exposition here.

CRU, it appears, trims out station data when it lies outside 5 sigma. For certain years in which there was actually record cold weather, that leads to discrepancies between CRU and me. It probably happens in warm years as well. Overall, this trimming of data amounts to around 0.1C (the mean of all differences).

Below, see what 1936 looked like: the average for every month, the max anomaly, the min anomaly, and the 95% CI (orange). Note that these are actual anomalies from the 1961-90 baseline, so that's a -21C departure from the average. With an sd around 2.5, that means CRU is trimming departures greater than 13C or so. A simple look at the data showed bitterly cold weather in the US, weather that gets snipped by a 5 sigma trim.
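The arithmetic behind that cutoff can be sketched in a few lines. This is only an illustration, not CRU's actual code: the function name `trim_sigma` is hypothetical, the sd is passed in rather than estimated from the data, and every sample value except the -21C anomaly and the sd of 2.5 is made up.

```python
def trim_sigma(anomalies, sd, k=5.0):
    """Split anomalies into kept and trimmed values using a k-sigma cutoff.

    Assumption: `sd` is supplied by the caller (for example a per-station,
    per-month standard deviation); it is not recomputed from `anomalies`.
    """
    cutoff = k * sd
    kept = [a for a in anomalies if abs(a) <= cutoff]
    trimmed = [a for a in anomalies if abs(a) > cutoff]
    return kept, trimmed, cutoff


# Made-up anomalies in deg C; only -21.0 and sd = 2.5 come from the post.
sample = [-21.0, -9.5, -3.2, 0.4, 2.8]
kept, trimmed, cutoff = trim_sigma(sample, sd=2.5)

print(cutoff)   # 12.5, i.e. "13C or so"
print(trimmed)  # [-21.0]
```

With an sd of 2.5 the cutoff is 12.5C, so a genuine -21C month is dropped while a -9.5C month survives.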

And more interesting facts: if data are thrown out because of outlier status, one can expect the outliers to be uniformly distributed over the months. In other words, bad data has no season. So I sorted the 'error' between CRU and Moshtemp. Where do we differ? Uniformly over the months, or does the dropping of 5 sigma events happen in certain seasons? First, let's look at when CRU is warmer than Moshtemp. I take the top 100 months in terms of positive error. Months here are expressed as fractions (0 = Jan).

Next, we take the top 100 months in terms of negative error. Is that uniformly distributed?
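One way to put a number on "is that uniformly distributed?" is a chi-square goodness-of-fit check on the calendar months of the top 100 errors. The sketch below is an assumption about how such a check could be done, not the method used for the plots: the helper names are hypothetical, months are coded 1 to 12 rather than as fractions, and the error series is synthetic, built so that the large errors all fall in Dec/Jan.

```python
from collections import Counter

# Chi-square critical value for 11 degrees of freedom at alpha = 0.05.
CHI2_CRIT_DF11_05 = 19.675


def top_months(errors, n=100, positive=True):
    """Return the calendar months of the n largest errors.

    errors: list of (year, month, error) tuples with month in 1..12.
    positive=False selects the n most negative errors instead.
    """
    ranked = sorted(errors, key=lambda r: r[2], reverse=positive)
    return [month for _, month, _ in ranked[:n]]


def chi_square_uniform(months):
    """Chi-square statistic against equal counts in all 12 months."""
    counts = Counter(months)
    expected = len(months) / 12.0
    return sum((counts.get(m, 0) - expected) ** 2 / expected
               for m in range(1, 13))


# Synthetic series: 50 years, with large errors only in Dec and Jan.
errors = [(year, month, 1.0 if month in (12, 1) else 0.1)
          for year in range(1900, 1950)
          for month in range(1, 13)]

winter_heavy = top_months(errors, n=100, positive=True)
stat = chi_square_uniform(winter_heavy)
print(stat > CHI2_CRIT_DF11_05)  # True: far from uniform

even = [m for m in range(1, 13) for _ in range(8)]
print(chi_square_uniform(even))  # 0.0: perfectly uniform
```

A statistic above the critical value says the months are not spread evenly. The same call with `positive=False` covers the negative-error case.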

If this data holds up upon further examination, it would appear that CRU processing has a seasonal bias: really cold winters and really warm winters (5 sigma events) get tossed. Hmm.

The "delta" between Moshtemp and CRU varies with the season. The worst months on average are Dec/Jan. The sd of the delta for the winter months is twice that of other months. Again, if these 5 sigma events were just bad data, we would not expect this. Overall, Moshtemp is warmer than CRU, but when we look at TRENDS it matters where these events happen.
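The per-month comparison amounts to grouping the delta by calendar month and taking the mean and sd of each group. Here is a minimal sketch, assuming the delta is stored as (month, value) pairs. The function name `delta_by_month` is hypothetical, and the numbers are invented so that the January sd comes out at twice the July sd; they are not the real Moshtemp/CRU differences.

```python
from collections import defaultdict
from statistics import mean, stdev


def delta_by_month(records):
    """Group deltas by calendar month and return {month: (mean, sd)}.

    records: iterable of (month, delta) pairs with month in 1..12.
    """
    groups = defaultdict(list)
    for month, delta in records:
        groups[month].append(delta)
    return {m: (mean(v), stdev(v)) for m, v in sorted(groups.items())}


# Invented deltas in deg C, for illustration only.
records = [(1, -0.4), (1, 0.0), (1, 0.4),
           (7, -0.2), (7, 0.0), (7, 0.2)]

stats = delta_by_month(records)
for month, (avg, sd) in stats.items():
    print(month, round(avg, 3), round(sd, 3))
```

Plotting the sd column against the month is a quick way to see whether the winter months stand out.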
