# Visualize climate anomalies

**R on Dominic Royé**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

When we visualize precipitation and temperature anomalies, we simply use time series as bar graph indicating negative and positive values in red and blue. However, in order to have a better overview we need both anomalies in a single graph. In this way we could more easly answer the question of whether a particular season or month was dry-warm or wet-cold, and even compare these anomalies in the context of previous years.

## Packages

In this post we will use the following packages:

Package | Description |
---|---|

tidyverse | Collection of packages (visualization, manipulation): ggplot2, dplyr, purrr, etc. |

lubridate | Easy manipulation of dates and times |

ggrepel | Repel overlapping text labels in ggplot2 |

#we install the packages if necessary if(!require("tidyverse")) install.packages("tidyverse") if(!require("ggrepel")) install.packages("ggrepel") if(!require("lubridate")) install.packages("lubridate") #packages library(tidyverse) library(lubridate) library(ggrepel)

## Preparing the data

First we import the daily precipitation and temperature data from the selected weather station (download). We will use the data from Tenerife South (Spain) [1981-2020] accessible through Open Data AEMET. In R there is a package called `meteoland`

that facilitates the download with specific functions to access data from AEMET (Spanish State Meteorological Agency), Meteogalicia (Galician Meteorological Service) and Meteocat (Catalan Meteorological Service).

### Step 1: import the data

We import the data in *csv* format, the first column is the date, the second column the precipitation (pr) and the last column the average daily temperature (ta).

data <- read_csv("meteo_tenerife.csv") ## Parsed with column specification: ## cols( ## date = col_date(format = ""), ## pr = col_double(), ## ta = col_double() ## ) data ## # A tibble: 14,303 x 3 ## date pr ta #### 1 1981-01-02 0 17.6 ## 2 1981-01-03 0 16.8 ## 3 1981-01-04 0 17.4 ## 4 1981-01-05 0 17.6 ## 5 1981-01-06 0 17 ## 6 1981-01-07 0 17.6 ## 7 1981-01-08 0 18.6 ## 8 1981-01-09 0 19.8 ## 9 1981-01-10 0 21.5 ## 10 1981-01-11 3.8 17.6 ## # ... with 14,293 more rows

### Step 2: preparing the data

In the second step we prepare the data to calculate the anomalies. To do this, we create three new columns: the month, the year, and the season of the year. Since our objective is to analyse winter anomalies, we cannot use the calendar year, since winter includes the month of December of one year and the months of January and February of the following. The custom function `meteo_yr()`

extracts the year from a date indicating the starting month. The concept is similar to the hydrological year in which it starts on October 1.

meteo_yr <- function(dates, start_month = NULL) { # convert to POSIXlt dates.posix <- as.POSIXlt(dates) # year offset offset <- ifelse(dates.posix$mon >= start_month - 1, 1, 0) # new year adj.year = dates.posix$year + 1900 + offset return(adj.year) }

We will use many functions of the package collection `tidyverse`

(https://www.tidyverse.org/). The `mutate()`

function helps to add new columns or change existing ones. To define the seasons, we use the `case_when()`

function from the `dplyr`

package, which has many advantages compared to a chain of `ifelse()`

. In `case_when()`

we use two-side formulas, on the one hand the condition and on the other the action when that condition is met. A two-sided formula in R consists of the operator `~`

. The binary operator `%in%`

allows us to filter several values in a greater set.

data <- mutate(data, winter_yr = meteo_yr(date, 12), month = month(date), season = case_when(month %in% c(12,1:2) ~ "Winter", month %in% 3:5 ~ "Spring", month %in% 6:8 ~ "Summer", month %in% 9:11 ~ "Autum")) data ## # A tibble: 14,303 x 6 ## date pr ta winter_yr month season #### 1 1981-01-02 0 17.6 1981 1 Winter ## 2 1981-01-03 0 16.8 1981 1 Winter ## 3 1981-01-04 0 17.4 1981 1 Winter ## 4 1981-01-05 0 17.6 1981 1 Winter ## 5 1981-01-06 0 17 1981 1 Winter ## 6 1981-01-07 0 17.6 1981 1 Winter ## 7 1981-01-08 0 18.6 1981 1 Winter ## 8 1981-01-09 0 19.8 1981 1 Winter ## 9 1981-01-10 0 21.5 1981 1 Winter ## 10 1981-01-11 3.8 17.6 1981 1 Winter ## # ... with 14,293 more rows

### Step 3: estimate winter anomalies

In the next step we create a subset of the winter months. Then we group by the defined meteorological year and calculate the sum and average for precipitation and temperature, respectively. To facilitate the work, the `magrittr`

package introduces the operator called *pipe* in the form `%>%`

with the aim of combining several functions without the need to assign the result to a new object. The *pipe* operator passes the output of a function applied to the first argument of the next function. This way of combining functions allows you to chain several steps simultaneously. The `%>%`

must be understood and pronounced as *then*.

data_inv <- filter(data, season == "Winter") %>% group_by(winter_yr) %>% summarise(pr = sum(pr, na.rm = TRUE), ta = mean(ta, na.rm = TRUE))

Now we only have to calculate the anomalies of precipitation and temperature. The columns `pr_mean`

and `ta_mean`

will contain the climate average, the reference for the anomalies with respect to the normal period 1981-2010. Therefore, we need to filter the values to the period before 2010, which we will do in the usual way of filtering vectors in R. Once we have the references we estimate the anomalies `pr_anom`

and `ta_anom`

. To facilitate the interpretation, in the case of precipitation we express the anomalies as percentage, with the average set at 0% instead of 100%.

In addition, we add three required columns with information for the creation of the graph: 1) `labyr`

contains the year of each anomaly as long as it has been greater/less than -+10% or -+0.5ºC, respectively (this is for reducing the number of labels), 2) `symb_point`

is a dummy variable in order to be able to create diferent symbols between the cases of (1), and 3) `lab_font`

for highlighting in bold the year 2020.

data_inv <- mutate(data_inv, pr_mean = mean(pr[winter_yr <= 2010]), ta_mean = mean(ta[winter_yr <= 2010]), pr_anom = (pr*100/pr_mean)-100, ta_anom = ta-ta_mean, labyr = case_when(pr_anom < -10 & ta_anom < -.5 ~ winter_yr, pr_anom < -10 & ta_anom > .5 ~ winter_yr, pr_anom > 10 & ta_anom < -.5 ~ winter_yr, pr_anom > 10 & ta_anom > .5 ~ winter_yr), symb_point = ifelse(!is.na(labyr), "yes", "no"), lab_font = ifelse(labyr == 2020, "bold", "plain") )

## Creating the graph

We will build the chart adding layer by layer the distinctive elements: 1) the background with the different grids (Dry-Warm, Dry-Cold, etc.), 2) the points and labels, and 3) the style adjustments.

### Part 1

The idea is that the points with dry-warm anomalies are located in quadrant I (top-right) and those with wet-cold in quadrant III (bottom-left). Therefore, we must invert the sign in the precipitation anomalies. Then we create a `data.frame`

with the label positions of the four quadrants. For the positions in *x* and *y* `Inf`

and `-Inf`

are used, which is equivalent to the maximum panel sides with respect to the data. However, it is necessary to adjust the position towards the extreme points within the panel with the known arguments of `ggplot2`

: *hjust* and *vjust*.

data_inv_p <- mutate(data_inv, pr_anom = pr_anom * -1) bglab <- data.frame(x = c(-Inf, Inf, -Inf, Inf), y = c(Inf, Inf, -Inf, -Inf), hjust = c(1, 1, 0, 0), vjust = c(1, 0, 1, 0), lab = c("Wet-Warm", "Dry-Warm", "Wet-Cold", "Dry-Cold")) bglab ## x y hjust vjust lab ## 1 -Inf Inf 1 1 Wet-Warm ## 2 Inf Inf 1 0 Dry-Warm ## 3 -Inf -Inf 0 1 Wet-Cold ## 4 Inf -Inf 0 0 Dry-Cold

### Part 2

In the second part we can start building the chart by adding all graphical elements. First we create the background with different colors of each quadrant. The function `annotate()`

allows adding geometry layers without the use of variables within `data.frames`

. With the `geom_hline()`

and `geom_vline()`

function we mark the quadrants horizontally and vertically using a dashed line. Finally, we draw the labels of each quadrant, using the function `geom_text()`

. When we use other data sources than the main one used in `ggplot()`

, we must indicate it with the argument `data`

in the corresponding geometry function.

g1 <- ggplot(data_inv_p, aes(pr_anom, ta_anom)) + annotate("rect", xmin = -Inf, xmax = 0, ymin = 0, ymax = Inf, fill = "#fc9272", alpha = .6) + #wet-warm annotate("rect", xmin = 0, xmax = Inf, ymin = 0, ymax = Inf, fill = "#cb181d", alpha = .6) + #dry-warm annotate("rect", xmin = -Inf, xmax = 0, ymin = -Inf, ymax = 0, fill = "#2171b5", alpha = .6) + #wet-cold annotate("rect", xmin = 0, xmax = Inf, ymin = -Inf, ymax = 0, fill = "#c6dbef", alpha = .6) + #dry-cold geom_hline(yintercept = 0, linetype = "dashed") + geom_vline(xintercept = 0, linetype = "dashed") + geom_text(data = bglab, aes(x, y, label = lab, hjust = hjust, vjust = vjust), fontface = "italic", size = 5, angle = 90, colour = "white") g1

### Part 3

In the third part we simply add the points of the anomalies and the labels of the years. The `geom_text_repel()`

function is similar to the one known by default in `ggplot2`

, `geom_text()`

, but it repels overlapping text labels away from each other.

g2 <- g1 + geom_point(aes(fill = symb_point, colour = symb_point), size = 2.8, shape = 21, show.legend = FALSE) + geom_text_repel(aes(label = labyr, fontface = lab_font), max.iter = 5000, size = 3.5) g2 ## Warning: Removed 25 rows containing missing values (geom_text_repel).

### Part 4

In the last part we adjust, in addition to the general style, the axes, the color type and the (sub)title. Remember that we changed the sign on precipitation anomalies. Hence, we must use the arguments `breaks`

and `labels`

in the function `scale_x_continouous()`

to reverse the sign in the labels corresponding to the breaks.

g3 <- g2 + scale_x_continuous("Precipitation anomaly in %", breaks = seq(-100, 250, 10) * -1, labels = seq(-100, 250, 10), limits = c(min(data_inv_p$pr_anom), 100)) + scale_y_continuous("Mean temperature anomaly in ºC", breaks = seq(-2, 2, 0.5)) + scale_fill_manual(values = c("black", "white")) + scale_colour_manual(values = rev(c("black", "white"))) + labs(title = "Winter anomalies in Tenerife South", caption = "Data: AEMET\nNormal period 1981-2010") + theme_bw() g3 ## Warning: Removed 25 rows containing missing values (geom_text_repel).

**leave a comment**for the author, please follow the link and comment on their blog:

**R on Dominic Royé**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.