# Drawing rarefaction curves with custom colours

April 16, 2015
By

[This article was first published on From the Bottom of the Heap - R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I was sent an email this week by a vegan user who wanted to draw rarefaction curves using `rarecurve()` but with different colours for each curve. The solution to this one is quite easy as `rarecurve()` has argument `col` so the user could supply the appropriate vector of colours to use when plotting. However, they wanted to distinguish all 26 of their samples, which is certainly stretching the limits of perception if we only used colour. Instead we can vary other parameters of the plotted curves to help with identifying individual samples.

To illustrate, I’ll use the Barro Colorado Island data set `BIC` that comes with vegan. I just take the first 26 samples as this was the data set size my correspondent indicated they had available.

``````library("vegan")
``````
``````Loading required package: permute
This is vegan 2.2-1
``````
``````data(BCI, package = "vegan")
BCI2 <- BCI[1:26, ]
raremax <- min(rowSums(BCI2))
raremax
``````
``````[1] 340
``````

`raremax` is the minimum sample count achieved over the 26 samples. We will rarefy the sample counts to this value.

To set up the parameters we might use for plotting, `expand.grid()` is a useful helper function

``````col <- c("black", "darkred", "forestgreen", "orange", "blue", "yellow", "hotpink")
lty <- c("solid", "dashed", "longdash", "dotdash")
pars <- expand.grid(col = col, lty = lty, stringsAsFactors = FALSE)
``````
``````          col   lty
1       black solid
2     darkred solid
3 forestgreen solid
4      orange solid
5        blue solid
6      yellow solid
``````

Then we can call `rarecurve()` as follows with the new graphical parameters

``````out <- with(pars[1:26, ],
rarecurve(BCI2, step = 20, sample = raremax, col = col,
lty = lty, label = FALSE))
``````

Note that I saved the output from `rarecurve()` in object `out`. This object contains everything we need to draw our own version of the plot if we wish. For example, we could use fewer colours and alter the line thickness1 instead to make up the required number of combinations.

``````col <- c("black", "darkred", "forestgreen", "hotpink", "blue")
lty <- c("solid", "dashed", "dotdash")
lwd <- c(1, 2)
pars <- expand.grid(col = col, lty = lty, lwd = lwd,
stringsAsFactors = FALSE)
``````
``````          col    lty lwd
1       black  solid   1
2     darkred  solid   1
3 forestgreen  solid   1
4     hotpink  solid   1
5        blue  solid   1
6       black dashed   1
``````

Using the information in `out` returned by `rarecurve()` we can get almost the same plot using the following code to draw the elements by hand

``````Nmax <- sapply(out, function(x) max(attr(x, "Subsample")))
Smax <- sapply(out, max)
plot(c(1, max(Nmax)), c(1, max(Smax)), xlab = "Sample Size",
ylab = "Species", type = "n")
abline(v = raremax)
for (i in seq_along(out)) {
N <- attr(out[[i]], "Subsample")
with(pars, lines(N, out[[i]], col = col[i], lty = lty[i], lwd = lwd[i]))
}
``````

Having done this, I don’t believe this is a useful graphic because we’re trying to distinguish between too many samples using graphical parameters. Where I do think this sort of approach might work is if the samples in the data set come from a few different groups and we want to colour the curves by group.

``````col <- c("darkred", "forestgreen", "hotpink", "blue")
set.seed(3)
grp <- factor(sample(seq_along(col), nrow(BCI2), replace = TRUE))
cols <- col[grp]
``````

The code above creates a grouping factor `grp` for illustration purposes; in real analyses you’d have this already as a factor variable in you data somewhere. We also have to expand the `col` vector because we are plotting each line in a loop. The plot code, reusing elements from the previous plot, is shown below:

``````plot(c(1, max(Nmax)), c(1, max(Smax)), xlab = "Sample Size",
ylab = "Species", type = "n")
abline(v = raremax)
for (i in seq_along(out)) {
N <- attr(out[[i]], "Subsample")
lines(N, out[[i]], col = cols[i])
}
``````

1. We can’t use the approach outlined in this example to vary `lwd` because of the way `rarecurve()` draws the individual curves, in a loop. We have no way to tell `rarecurve()` to use the ith element of a vector of `lwd` values.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.