(This article was first published on **Biospherica » R**, and kindly contributed to R-bloggers)

The impacts of drought depend on time-scale. On short time-scales, drought means dry soil. On long time-scales, it means dry rivers and empty reservoirs. A region may simultaneously experience dry conditions on one time-scale and wet conditions on another, e.g. wet soil but low streamflow, or vice versa.

**Standardized Precipitation Index** (SPI) is a widely used measure of drought which can be defined for any time-scale of interest. For any location, SPI is normally distributed with zero mean and unit standard deviation. Index values > 2 indicate exceptionally wet conditions for that location, values < -2 indicate exceptionally dry conditions for that location, etc. Historical precipitation is the only input needed to compute SPI.
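To put the thresholds in perspective, the standard normal distribution gives the share of months expected below any SPI value. The snippet below is only an illustration; the variable names are not from the original script.

```r
# Fraction of months expected below each SPI threshold, assuming SPI ~ N(0, 1)
thresholds <- c(-2, -1.5, -1, 1, 1.5, 2)
p_below <- pnorm(thresholds)

# Probability of an "exceptionally dry" month (SPI < -2)
p_extreme_dry <- pnorm(-2)

# Average spacing, in months, between such events at a given time-scale
months_between <- 1 / p_extreme_dry

print(round(p_below, 4))
print(round(months_between))
```

By symmetry, the same probability applies to exceptionally wet months (SPI > 2).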

Australia experienced drought between 2002 and 2007. The image below shows SPI computed for a location in the drought-prone Murray-Darling basin of New South Wales. The time-series run from Jan 1948 to Jan 2010 and the index was calculated for time-scales from 1 to 12 months. Precipitation data is from **NCEP Reanalysis** [1] in a 1.875° × 1.875° grid cell centred at 30°S 145°E.

The drought of 2002 to 2007 shows up very clearly. It was preceded by a wet period that lasted until 2001. While 2009 showed an episode of severe drought at short time-scales, SPI was normal to wet at longer time-scales during 2009. Agricultural yields recovered.

## Calculating SPI-*M*

Empirical rainfall probability distributions are far from normal (Gaussian) and often approximate a shifted gamma distribution. The empirical cumulative distribution is used to transform the rainfall time-series into a time-series of cumulative probabilities (percentiles). A normally distributed precipitation index is then found by treating these probabilities as if they came from a standard normal cumulative distribution and inverting it to find the index values.

This is simple in *R*. If the vector *data* contains rainfall data, then:

`fit.cdf <- ecdf(data)`

`cdfs <- sapply(data, fit.cdf)`

`SPI <- qnorm(cdfs)`
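A small worked example on synthetic data may help. The gamma parameters and the `rank / (n + 1)` adjustment below are assumptions for illustration and are not taken from the original script. The adjustment addresses one practical detail: the largest observation has an empirical CDF of exactly 1, which `qnorm` maps to `Inf`.

```r
set.seed(42)                                  # reproducible synthetic data
data <- rgamma(600, shape = 2, rate = 0.05)   # hypothetical rainfall values (mm)

fit.cdf <- ecdf(data)
cdfs <- fit.cdf(data)    # ecdf() returns a vectorised function, so sapply() is optional
SPI <- qnorm(cdfs)

# The wettest value has empirical CDF 1, so its index is infinite
print(max(SPI))

# Assumed adjustment (not from the original post): rank / (n + 1)
# keeps every probability strictly between 0 and 1
cdfs.adj <- rank(data) / (length(data) + 1)
SPI.adj <- qnorm(cdfs.adj)

print(range(SPI.adj))
```

With the adjusted probabilities the index stays finite, has zero mean, and a standard deviation close to one.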

The rainfall data are M-month moving averages (the current month and the previous M-1 months). A separate index is calculated for each calendar month to remove seasonality. The R code used to compute SPI values (based on NCEP Reanalysis or other data sets such as **GCPC**) is **here**.
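The two steps just described can be sketched as follows. This is a minimal illustration, not the linked script: the helper names `trailing_mean` and `spi_by_month`, the synthetic rainfall series, and the `rank / (n + 1)` plotting position are all assumptions.

```r
# Hypothetical helper: k-month trailing mean (current month and previous k - 1 months)
trailing_mean <- function(precip, k) {
  as.numeric(stats::filter(precip, rep(1 / k, k), sides = 1))
}

# Hypothetical helper: SPI computed separately for each calendar month
spi_by_month <- function(precip.k, month) {
  spi <- rep(NA_real_, length(precip.k))
  for (m in 1:12) {
    idx <- which(month == m & !is.na(precip.k))
    # rank / (n + 1) keeps probabilities inside (0, 1); an assumed plotting position
    spi[idx] <- qnorm(rank(precip.k[idx]) / (length(idx) + 1))
  }
  spi
}

set.seed(1)
n <- 12 * 50                          # 50 years of synthetic monthly data
month <- rep(1:12, times = 50)
# Synthetic rainfall with a seasonal cycle (illustrative only)
precip <- rgamma(n, shape = 2, rate = 2 / (50 + 30 * sin(2 * pi * month / 12)))

precip.3 <- trailing_mean(precip, 3)  # first two values are NA (incomplete window)
spi.3 <- spi_by_month(precip.3, month)
```

Ranking within each calendar month means that, for example, a January value is compared only with other Januaries, so a dry season does not register as a drought merely because it is a dry season.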

[1] The NCEP/NCAR 40-year reanalysis project, Bull. Amer. Meteor. Soc., 77, 437-470, 1996

**Note Added 11 October 2011:** I have uploaded a slightly improved SPI R script **here**. The function *getPrecOnTimescale(precipitation,k)* takes a vector of monthly precipitation values and returns a k-month average (i.e. the current month and the prior k-1 months). *getSPIfromPrec(precip.k)* takes k-month precipitation values and returns the corresponding vector of SPI values.
