I am continuing to learn
ggplot2 for elegant graphics. I often make a plot to illustrate the fit of a von Bertalanffy growth function to data. In general, I want this plot to have:
- Transparent points to address over-plotting of fish with the same length and age.
- A fitted curve with a confidence polygon over the range of observed ages.
- A fitted curve (without a confidence polygon) over a larger range than the observed ages (this often helps identify problematic fits).
Here I demonstrate how to produce such plots with lengths and ages of Lake Erie Walleye (Sander vitreus) captured during October-November, 2003-2014. These data are available in my
FSAdata package and formed many of the examples in Chapter 12 of the Age and Growth of Fishes: Principles and Techniques book. My primary interest here is in the
tl (total length in mm) and
age variables (see here for more details about the data). I focus on female Walleye from location “1” captured in 2014 in this example.
The workflow below requires understanding the minimum and maximum observed ages.
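The observed age range can be found with base R. The sketch below is only an illustration: `wf_demo` is a small made-up data.frame standing in for the filtered Walleye data, and `agesum` is a hypothetical name for the stored result.

```r
# Hypothetical stand-in for the filtered Walleye data; only tl and age are used.
wf_demo <- data.frame(
  age = c(1, 1, 2, 3, 3, 4, 5, 7, 9, 11),
  tl  = c(310, 325, 410, 455, 470, 505, 540, 575, 600, 610)
)

# Minimum and maximum observed ages, stored for later use.
agesum <- data.frame(minage = min(wf_demo$age), maxage = max(wf_demo$age))
agesum
```

Storing the two values in one object makes it easy to filter predicted values to the observed range later.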
Fitting a von Bertalanffy Growth Function
Methods for fitting a von Bertalanffy growth function (VBGF) are detailed in my Introductory Fisheries Analyses with R book and in Chapter 12 of Age and Growth of Fishes: Principles and Techniques book. Briefly, a function for the typical VBGF is constructed with vbFuns().
Reasonable starting values for the optimization algorithm may be obtained with vbStarts(), where the first argument is a formula of the form lengths~ages, where lengths and ages are replaced with the actual variable names containing the observed lengths and ages, respectively, and data= is set to the data.frame containing those variables.
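For reference, the typical parameterization of the VBGF is usually written as below, where E[L|t] is the mean length at age t, L∞ is the asymptotic mean length, K is the Brody growth rate coefficient, and t0 is the theoretical age at which mean length is zero.

```latex
E[L \mid t] = L_\infty \left(1 - e^{-K (t - t_0)}\right)
```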
The nls() function is typically used to estimate parameters of the VBGF from the observed data. The first argument is a formula that has lengths on the left-hand-side and the VBGF function created above on the right-hand-side. The VBGF function has the ages variable as its first argument and then Linf, K, and t0 as the remaining arguments (just as they appear here). Again, the data.frame with the observed lengths and ages is given to data= and the starting values derived above are given to start=.
The parameter estimates are extracted from the saved nls() object with coef().
Bootstrapped confidence intervals for the parameter estimates are computed by giving the saved nls() object to Boot() and giving the saved Boot() object to confint().
Preparing Predicted Values for Plotting
Predicted lengths-at-age from the fitted VBGF are needed to plot the fitted VBGF curve. The predict() function may be used to predict mean lengths at ages from the saved nls() object. What is needed, however, is the predicted mean length at each age for each bootstrap sample, so that bootstrapped confidence intervals for each mean length-at-age can be derived. To do this with Boot(), predict() needs to be embedded into another function. For example, the function below does the same as predict() but is in a form that will work with Boot().
Predicted mean lengths-at-age, with bootstrapped confidence intervals, can then be constructed by giving Boot() the saved nls() object AND the new prediction function in f=. With the ages defined by seq(), the Boot() code will thus compute the predicted mean length at all ages between -1 and 12 in increments of 0.2. I extended the age range outside the observed range of ages as I want to see the shape of the curve nearer t0 and at older ages (to better see L∞).
The vector of ages, the predicted mean lengths-at-age (from
predict()), and the associated bootstrapped confidence intervals (from
confint()) are placed into a data.frame for later use.
For my purposes below, I also want predicted mean lengths only for observed ages. To make the code below cleaner, a new data.frame restricted to the observed ages is made here.
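The fitting and prediction steps above can be sketched in base R. This is a rough illustration under stated assumptions, not the code from this post: the data are simulated, `vb_typ()` is a hand-written stand-in for the function made by vbFuns(), the starting values are hand-picked rather than taken from vbStarts(), and a hand-rolled case-resampling percentile bootstrap is used in place of Boot() and confint().

```r
set.seed(2014)  # reproducible simulation and resampling

# Typical VBGF: mean length at age t given Linf, K, and t0.
vb_typ <- function(t, Linf, K, t0) Linf * (1 - exp(-K * (t - t0)))

# Simulated (NOT real) lengths at ages 1-10 with additive normal error.
sim <- data.frame(age = sample(1:10, 300, replace = TRUE))
sim$tl <- vb_typ(sim$age, Linf = 650, K = 0.3, t0 = -0.5) + rnorm(300, sd = 25)

# Hand-picked starting values (an assumption; stands in for vbStarts()).
sv <- list(Linf = 600, K = 0.25, t0 = 0)
fit <- nls(tl ~ vb_typ(age, Linf, K, t0), data = sim, start = sv)

# Ages at which to predict, extending beyond the observed range.
ages <- seq(-1, 12, by = 0.2)
predict2 <- function(x) predict(x, data.frame(age = ages))

# Case-resampling bootstrap: refit on resampled rows, predict at every age.
# A refit that fails to converge contributes NAs instead of stopping the loop.
boot_one <- function() {
  d <- sim[sample(nrow(sim), replace = TRUE), ]
  tryCatch(predict2(nls(tl ~ vb_typ(age, Linf, K, t0), data = d, start = sv)),
           error = function(e) rep(NA_real_, length(ages)))
}
boots <- replicate(200, boot_one())  # one column per bootstrap sample

# Percentile 95% intervals for mean length at each age.
ci <- t(apply(boots, 1, quantile, probs = c(0.025, 0.975), na.rm = TRUE))

# All ages (for the extended curve) ...
preds1 <- data.frame(age = ages, fit = predict2(fit), LCI = ci[, 1], UCI = ci[, 2])
# ... and only the observed ages (for the confidence polygon).
preds2 <- preds1[preds1$age >= min(sim$age) & preds1$age <= max(sim$age), ]
```

The column names `age`, `fit`, `LCI`, and `UCI` match how the two data.frames are used in the plotting section.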
Constructing the Plot
A ggplot2 plot often starts by defining the data and aes()thetic mappings in ggplot(). However, the data and aesthetics should not be set in ggplot() in this application because information will be drawn from three data.frames: wf14T, preds1, and preds2. Thus, the data and aesthetics will be set within specific geoms.
The plot begins with a polygon that encases the lower and upper confidence interval values for mean length at each age. This polygon is constructed with geom_ribbon() using preds2 (the confidence polygon will only cover observed ages), where the x-axis will be age, the minimum part of the y-axis will be LCI, and the maximum part of the y-axis will be UCI. The fill color of the polygon is set with fill=.
Observed lengths and ages in the wf14T data.frame are then added to this plot with geom_point(). The points are slightly larger than the default (with size=) and are drawn with a fairly low alpha= value (i.e., high transparency) to handle considerable over-plotting.
The fitted curve over the entire range of ages used above (i.e., using preds1) is added with geom_line(). A slightly thicker than default (size=) dashed (linetype=) line is used.
The fitted curve for just the observed range of ages (i.e., using
preds2) is added using a solid line so that the dashed line for the observed ages is covered.
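The layers described so far can be sketched as follows. This is an illustration only: `obs`, `preds1`, and `preds2` are made-up stand-ins (the band is a placeholder, not a real confidence interval), the particular values given to `fill=`, `size=`, and `alpha=` are my own choices, and `linewidth=` (ggplot2 3.4.0 and later) is used for line thickness where the text refers to `size=`.

```r
library(ggplot2)

# Hypothetical stand-ins for wf14T, preds1, and preds2 (values are made up).
vb_typ <- function(t, Linf = 650, K = 0.3, t0 = -0.5) Linf * (1 - exp(-K * (t - t0)))
set.seed(1)
obs <- data.frame(age = sample(1:10, 200, replace = TRUE))
obs$tl <- vb_typ(obs$age) + rnorm(200, sd = 25)
preds1 <- data.frame(age = seq(-1, 12, by = 0.2))
preds1$fit <- vb_typ(preds1$age)
preds1$LCI <- preds1$fit - 15   # placeholder band, not a real interval
preds1$UCI <- preds1$fit + 15
preds2 <- subset(preds1, age >= min(obs$age) & age <= max(obs$age))

# No data or aesthetics in ggplot(); each geom gets its own.
p <- ggplot() +
  # confidence polygon over observed ages only
  geom_ribbon(data = preds2, aes(x = age, ymin = LCI, ymax = UCI), fill = "gray90") +
  # transparent points to handle over-plotting
  geom_point(data = obs, aes(x = age, y = tl), size = 2, alpha = 0.1) +
  # dashed curve over the extended age range
  geom_line(data = preds1, aes(x = age, y = fit), linewidth = 1, linetype = "dashed") +
  # solid curve over observed ages, drawn last so it covers the dashed line
  geom_line(data = preds2, aes(x = age, y = fit), linewidth = 1)
p
```

Layer order matters here because later layers are drawn on top of earlier ones.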
The y- and x-axes are labelled (name=), the expansion factor for the axis limits is removed (expand=c(0,0)) so that the point (0,0) is in the corner of the plot, and the axis limits (limits=) and breaks (breaks=) are controlled using scale_y_continuous() and scale_x_continuous().
Finally, the classic black-and-white theme (primarily to remove the gray background) was used (theme_bw()) and the grid lines were removed with theme().
BONUS – Equation on Plot
Below is an undocumented bonus for how to put the equation of the best-fit VBGF on the plot. This is hacky so I would not expect it to be very general (e.g., it likely will not work across facets).
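One possible way to do this, which is my own sketch and not necessarily the hack used for the original plot, is to build a plotmath string from the parameter estimates and pass it to annotate() with parse=TRUE. The estimates in `cf`, the label position, and the axis settings are all hypothetical.

```r
library(ggplot2)

# Hypothetical parameter estimates; in practice these would come from the fit.
cf <- c(Linf = 650, K = 0.30, t0 = -0.50)
curve_df <- data.frame(age = seq(-1, 12, by = 0.2))
curve_df$fit <- cf[["Linf"]] * (1 - exp(-cf[["K"]] * (curve_df$age - cf[["t0"]])))

# Build a plotmath string; parse = TRUE makes ggplot2 render it as math.
# A negative t0 is written as "age + |t0|" rather than "age - -|t0|".
sgn <- ifelse(cf[["t0"]] < 0, "+", "-")
eqn <- paste0("TL == ", formatC(cf[["Linf"]], format = "f", digits = 0),
              " * bgroup('(', 1 - e^{-", formatC(cf[["K"]], format = "f", digits = 2),
              " * (age ", sgn, " ", formatC(abs(cf[["t0"]]), format = "f", digits = 2),
              ")}, ')')")

p <- ggplot(curve_df, aes(x = age, y = fit)) +
  geom_line(linewidth = 1) +
  # x and y give the label position in data units
  annotate(geom = "text", x = 8, y = 150, label = eqn, parse = TRUE, size = 4) +
  # values outside limits= are dropped when the plot is drawn
  scale_y_continuous(name = "Total Length (mm)", limits = c(0, 700), expand = c(0, 0)) +
  scale_x_continuous(name = "Age (years)", limits = c(-1, 12),
                     breaks = seq(0, 12, 2), expand = c(0, 0)) +
  theme_bw() +
  theme(panel.grid = element_blank())
p
```

Because annotate() places the label at fixed data coordinates, the position would need adjusting for each data set, which is part of why an approach like this is unlikely to work across facets.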
This post is likely not news to those of you that are familiar with
ggplot2. However, I am trying to post some examples here as I learn
ggplot2 in hopes that it will help others. My first post was here. In my next post I will demonstrate how to show von Bertalanffy curves for two or more groups.
Other parameterizations of the VBGF can be used with vbFuns(). Parameterizations of the Gompertz, Richards, and Logistic growth functions are available in the FSA package. See here for documentation. The Schnute four-parameter growth model is available in Schnute() and the Schnute-Richards five-parameter growth model is available in SchnuteRichards().
Reduce the value of by= in seq() to make for a smoother VBGF curve when plotting later.
This polygon will look better in the final plot when the gray background is removed. Also note that the polygon could be outlined by setting color= to a color other than what is given in fill=.