(This article was first published on **Peter's stats stuff - R**, and kindly contributed to R-bloggers)

A number of data visualisations are circulating showing the disturbing rise in temperature at the North Pole and drop in coverage of Arctic sea ice. The current level of interest is credited to a tweet from Zack Labe, whose Twitter page is a great source of interesting visualisations on sea ice. Secondary examples chosen more or less at random include James Renwick’s daily tweet which also combines the Arctic and Antarctic extent, the Washington Post, and the Economist. We even have ‘meta’ articles like this really useful one in The Verge.

I wanted to familiarise myself with the data – published by the USA’s National Snow and Ice Data Center – to see if the presentations are reasonable and faithful (spoiler – they are). As there’s some mild controversy about adding together the Antarctic and Arctic sea ice levels I focused on just one and chose the Arctic. I think “sea ice” is an easier concept to deal with in the Arctic, without the complication of ice-covered polar land mass in the south.

## Improving a graphic

Here’s my own graph of the data:

Some of the improvements I’ve attempted from the many – all of them excellent – prior versions include:

- Include each individual year, not just a summary of +/- two standard deviations with 2012 picked out. Many of the graphics have a grey area showing the “usual” past distribution, but miss the chance to show the steady decline over the years.
- Colour code years in a sequential fashion. The versions I’ve seen that colour code the years do so in a way that makes the progress of time hard to follow (eg rainbow colours); I chose instead the carefully crafted viridis colour scheme.
- Carefully design the legend to make it easy for the eye to follow the downwards pattern in time – there’s no point in having each individual colour for 30+ years listed in the legend.
- Include all the data for each year, not just five or so months (a problem with a couple of the circulating examples).
- Include as many years’ data as possible, going back to when regular estimates begin (every second day at first, then daily).
- Careful use of transparency on all the years other than the latest year, to make it stand out.
- A bit more detail on the source of the data on the graphic itself.
- Minimal use of tick marks, axes, etc.
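
As a rough illustration of how a few of these choices (one line per year, a sequential viridis scale, transparency on everything but the latest year) fit together in `ggplot2`, here is a minimal sketch. It is not the code behind the graphic above: the data are simulated rather than downloaded from NSIDC, and the names `sim` and `latest` are made up for the example.

```r
library(ggplot2)

# Simulated stand-in for daily sea ice extent (NOT the NSIDC data):
# a seasonal cycle peaking around March plus a slow year-on-year decline.
set.seed(42)
sim <- expand.grid(doy = 1:365, year = 1979:2016)
sim$extent <- 11.5 +
  4.5 * cos(2 * pi * (sim$doy - 70) / 365) -
  0.05 * (sim$year - 1979) +
  rnorm(nrow(sim), sd = 0.1)

latest <- max(sim$year)

p <- ggplot(sim, aes(x = doy, y = extent, group = year, colour = year)) +
  # earlier years are semi-transparent so the latest year stands out
  geom_line(data = subset(sim, year < latest), alpha = 0.4) +
  # latest year is opaque and thicker (linewidth needs ggplot2 >= 3.4.0)
  geom_line(data = subset(sim, year == latest), linewidth = 1.1) +
  # a continuous viridis scale gives one colour bar rather than 38 legend keys
  scale_colour_viridis_c("Year") +
  labs(x = "Day of year", y = "Extent (million square km, simulated)") +
  theme_minimal()

print(p)
```

Mapping `year` as a continuous variable is what lets the legend collapse to a single colour bar; treating it as a factor would bring back one legend entry per year.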

Not big things, and many of the graphics out there already have some of these, but the careful combination of them all makes a nice graphic and did help me get my head around the data. The data are well curated, uncontroversial in themselves, and readily available. The R code at the bottom of this post should work end to end for downloading the data and reproducing the image.

## Time series analysis

Presented with decades’ worth of daily data, I couldn’t resist pulling out the time series analytical toolkit. To start with, here’s a seasonal decomposition done with the help of Cleveland’s `stl` loess-based seasonal decomposition and my `ggseas` package, which integrates it into a `ggplot2` graphic workflow. The `stl` algorithm struggles to distinguish between trend and randomness for the first year of the data, then picks up a compelling story:

The downwards trend in the last few decades is if anything even more prominent in this graphic.
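
For readers who haven’t used `stl` before, here is a minimal sketch of the decomposition step using only base R. It runs on a simulated daily series rather than the sea ice data, it skips `ggseas` and uses the default `plot` method instead, and it assumes a fixed 365-day year (ignoring leap days), which is a simplification of my own for the example.

```r
# Simulated daily series (NOT the NSIDC data):
# annual cycle + slow linear decline + noise.
set.seed(123)
n_years <- 10
day <- seq_len(365 * n_years)
extent <- 12 - 0.0003 * day + 4 * cos(2 * pi * day / 365) +
  rnorm(length(day), sd = 0.2)

# stl() needs a ts object with a known seasonal period;
# frequency = 365 ignores leap days.
extent_ts <- ts(extent, start = c(2007, 1), frequency = 365)

# s.window = "periodic" forces the same seasonal shape in every year.
decomp <- stl(extent_ts, s.window = "periodic")

# The result holds three columns: seasonal, trend and remainder.
components <- decomp$time.series
head(components)

# The three components add back up to the original series.
max(abs(rowSums(components) - extent))

# One panel each for data, seasonal, trend and remainder.
plot(decomp)
```

Setting `s.window` to an odd number instead of `"periodic"` would let the seasonal pattern evolve over time, at the cost of blurring the line between seasonality and trend.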

I also did a two year forecast using the seasonal auto-regressive integrated moving average (ARIMA) modelling approach. I adapted the approach set out by Rob Hyndman in a post on forecasting daily data: the annual cycle is represented by regular Fourier terms (sine and cosine variables used as regressors) rather than by lagging variables by 365.25 days. This seems to work ok and gets plausible if dispiriting results:

- a likely record low peak Arctic sea ice in March 2017 after the coming cold season
- a non-zero chance that for a brief moment in 2018 there will be zero Arctic sea ice at all.
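
To make the Fourier-regressor idea concrete, here is a minimal base R sketch on simulated data. It is a simplified stand-in, not the model behind the results above: the ARIMA order is fixed at (1,0,0) rather than selected automatically, the helper `fourier_terms` and the choice of `K = 2` are invented for the example, and a linear trend regressor is my own addition. The `forecast` package offers `fourier()` and `auto.arima()` for doing this more conveniently.

```r
# Simulated daily series (NOT the NSIDC data):
# annual cycle + slow decline + AR(1) noise.
set.seed(2016)
n <- 365 * 6
day <- seq_len(n)
noise <- as.numeric(arima.sim(list(ar = 0.8), n = n, sd = 0.1))
extent <- 12 - 0.0003 * day + 4 * cos(2 * pi * day / 365.25) + noise

# Hypothetical helper: K pairs of sin/cos terms with a 365.25-day period,
# plus a linear trend measured in years.
fourier_terms <- function(day, K = 2, period = 365.25) {
  out <- lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * day / period), cos(2 * pi * k * day / period))
  })
  out <- do.call(cbind, out)
  colnames(out) <- paste0(rep(c("S", "C"), K), rep(seq_len(K), each = 2))
  cbind(trend = day / period, out)
}

# Regression with AR(1) errors; seasonality is carried by the regressors,
# so no seasonal lag of 365.25 days is needed.
fit <- arima(extent, order = c(1, 0, 0), xreg = fourier_terms(day))

# Two-year forecast: the future regressors are known in advance
# because they depend only on the date.
h <- 730
future_day <- n + seq_len(h)
fc <- predict(fit, n.ahead = h, newxreg = fourier_terms(future_day))

# Approximate 95% prediction interval.
lower <- fc$pred - 1.96 * fc$se
upper <- fc$pred + 1.96 * fc$se
range(fc$pred)
```

A statement like “a non-zero chance of zero sea ice” corresponds to the lower edge of a prediction interval such as `lower` touching zero at some point in the forecast period.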

Ho hum. Melting sea ice doesn’t directly affect sea level (because the ice is already floating on the water before it melts), but it is a huge warning sign of serious climate change happening as we speak. Losing the sea ice’s reflection of the sun’s rays is also a warming factor in itself (so I understand – I’m an amateur here). Plus it’s sad for the Arctic eco-system.

This blog post was only possible because of open data from the amazing USA National Snow and Ice Data Center, with extensive datasets from NASA and others.

Here’s the R code that did all the above (minus some fonts); it should work out of the box:
