
Isotonic regression is a method for obtaining a monotonic fit for 1-dimensional data. Let’s say we have data $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^2$ such that $x_1 < x_2 < \dots < x_n$. (We assume no ties among the $x_i$’s for simplicity.) Informally, isotonic regression looks for $\beta_1, \dots, \beta_n \in \mathbb{R}$ such that the $\beta_i$’s approximate the $y_i$’s well while being monotonically non-decreasing. Formally, the $\beta_i$’s are the solution to the optimization problem

\begin{aligned} \text{minimize}_{\beta_1, \dots, \beta_n} \quad& \sum_{i=1}^n (y_i - \beta_i)^2 \\ \text{subject to} \quad& \beta_1 \leq \dots \leq \beta_n. \end{aligned}

(Note: There is a corresponding solution for a monotonically non-increasing fit. Sometimes, the above is referred to as linear ordering isotonic regression, with isotonic regression referring to a more general version of the problem above. For more general versions, see References 1 and 2.)

Isotonic regression is useful when you know that the underlying trend is monotonic but your data or model output is not. In that situation, you can use isotonic regression either as a smoother for the data or as a post-processing step that forces your model’s predictions to be monotonic.

A commonly used algorithm for obtaining the isotonic regression solution is the pool-adjacent-violators algorithm (PAVA). It runs in linear time and linear memory. At a high level it works like this: go from left to right and set $\beta_i = y_i$. If setting $\beta_i$ this way violates monotonicity (i.e. $\beta_i = y_i < y_{i-1} = \beta_{i-1}$), replace both $\beta_{i-1}$ and $\beta_i$ with the mean $(y_{i-1} + y_i)/2$, pooling the two points into one block. This merge may create an earlier violation (the new pooled value may be less than $\beta_{i-2}$): if that happens, we go back and pool again, replacing the offending blocks with their weighted mean (weighted by block size), and repeat until monotonicity is restored.
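To make the pooling idea concrete, here is a minimal PAVA sketch in R. (This is an illustrative implementation written for this summary, not the homebrew version mentioned below; `pava` is a hypothetical function name.) It keeps a stack of blocks, each with a mean and a size, and merges backwards whenever a new point violates monotonicity:

```r
# Minimal PAVA sketch: returns the monotonically non-decreasing fit for y.
pava <- function(y) {
  n <- length(y)
  vals <- numeric(n)  # block means
  wts  <- numeric(n)  # block sizes
  k <- 0              # number of blocks currently on the stack
  for (i in seq_len(n)) {
    # Start a new block containing just y[i]
    k <- k + 1
    vals[k] <- y[i]
    wts[k]  <- 1
    # Merge backwards while adjacent blocks violate monotonicity,
    # replacing the pair with their size-weighted mean
    while (k > 1 && vals[k - 1] > vals[k]) {
      vals[k - 1] <- (wts[k - 1] * vals[k - 1] + wts[k] * vals[k]) /
                     (wts[k - 1] + wts[k])
      wts[k - 1]  <- wts[k - 1] + wts[k]
      k <- k - 1
    }
  }
  # Expand block means back out to one fitted value per observation
  rep(vals[seq_len(k)], times = wts[seq_len(k)])
}

pava(c(1, 3, 2, 4))  # 1.0 2.5 2.5 4.0
```

In the example, $y_3 = 2$ violates monotonicity against $y_2 = 3$, so the two are pooled into a block with mean 2.5; no earlier violation results, so the fit is $(1, 2.5, 2.5, 4)$.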

The animation below works through an example of PAVA in action. (Essentially I wrote a homebrew version of PAVA but kept a record of the intermediate fits.) Note that there are a number of ways to implement PAVA: what you see below may not be the most efficient. Click here for the R code; you can amend the dataset there to get your own version of the animation below.

If you want to do isotonic regression in R, DON’T use my homebrew version; use the gpava function in the isotone package instead.
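For reference, a minimal usage sketch, assuming the isotone package is installed (e.g. via `install.packages("isotone")`):

```r
library(isotone)

y <- c(1, 3, 2, 4, 5, 4)
# gpava takes the predictor values and the response;
# the fitted (monotonically non-decreasing) values are in $x
fit <- gpava(z = seq_along(y), y = y)
fit$x
```

With no ties in the predictor, this should match the plain PAVA solution described above.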

References:

1. Stat Wiki. Isotonic regression.
2. Mair, P., Hornik, K., and de Leeuw, J. (2009). Isotone optimization in R: Pool-adjacent-violators algorithm (PAVA) and active set methods.