This short post explores a fun dataset that ships with R’s datasets library. I load it both out of interest in its content and as a way to engage with Hadley Wickham’s ggplot2 package. The dataset contains two time series of beaver body temperatures. The data was first presented in Reynolds (1994).
Reynolds (1994) describes a small part of a study of the long-term temperature dynamics of the beaver Castor canadensis in north-central Wisconsin. Body temperature was measured by telemetry every 10 minutes for four females, but only data from one period of less than a day for each of two animals is used here.
First we load the data from the datasets library. This package contains some interesting datasets and is definitely worth a look if you need generic data to experiment with.
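A minimal sketch of the loading step, assuming the two series are the beaver1 and beaver2 data frames from the datasets package (the object names are not given in the text above):

```r
# Assumption: the two series are the beaver1 and beaver2 data frames
# from the datasets package, which is attached by default in R.
data(beaver1, package = "datasets")
data(beaver2, package = "datasets")

# Inspect the structure: both frames share the columns
# day, time, temp and activ.
str(beaver1)
str(beaver2)

# activ is a 0/1 indicator of whether the animal was active.
table(beaver1$activ)
table(beaver2$activ)
```

Because datasets is attached in a default R session, typing beaver1 at the prompt also works without the explicit data() call.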
Next, we want to use ggplot to see whether the temperature of the beaver differs greatly between active and non-active times. A nice feature of ggplot is how easily it maps factors to different colors. Here I am leaving the series as a continuous variable to illustrate how easy the color parameter is to use.
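A rough sketch of such a plot, assuming the beaver1 series and a hypothetical obs index column that I add for the x axis (the original plotting call is not shown here, so this is an illustration rather than the exact figure):

```r
library(ggplot2)

beaver <- datasets::beaver1

# Hypothetical helper column: time is stored as a clock reading
# (e.g. 840 for 08:40) and wraps past midnight, so an observation
# index gives a simple, monotonic x axis instead.
beaver$obs <- seq_len(nrow(beaver))

# activ stays numeric (0/1), so ggplot2 maps it to a continuous
# colour gradient rather than a discrete legend.
p <- ggplot(beaver, aes(x = obs, y = temp, colour = activ)) +
  geom_point()

print(p)
```

Wrapping the variable as colour = factor(activ) would switch the legend to two discrete colors instead of a gradient.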
The main lesson I learned from using ggplot2 is to create a template to work from in the future. Think of it as a set of containers for shapes, colors, labels, headings and, of course, a statistical visualization of the relationship in the data through the geom_smooth parameters. This template can then be copied and pasted into your other scripts without having to go back to the vignette to find a specific feature.
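One possible shape for such a template is sketched below. The column names obs and state, the colors, the shapes and the labels are all my own illustrative choices, not taken from the original script:

```r
library(ggplot2)

beaver <- datasets::beaver1
beaver$obs <- seq_len(nrow(beaver))  # hypothetical index column
beaver$state <- factor(beaver$activ, levels = c(0, 1),
                       labels = c("inactive", "active"))

p <- ggplot(beaver, aes(x = obs, y = temp)) +
  # Container for point colours and shapes, driven by one factor.
  geom_point(aes(colour = state, shape = state), size = 2) +
  # Container for the statistical summary of the relationship.
  geom_smooth(method = "loess", formula = y ~ x, se = TRUE,
              colour = "black") +
  # Containers for the manual colour and shape choices.
  scale_colour_manual(values = c(inactive = "steelblue",
                                 active = "firebrick")) +
  scale_shape_manual(values = c(inactive = 16, active = 17)) +
  # Container for headings and axis labels.
  labs(title = "Beaver body temperature",
       x = "Observation number",
       y = "Temperature (degrees C)",
       colour = "State", shape = "State")

print(p)
```

Each layer is a slot that can be swapped independently, which is what makes the template reusable: change the data and the aes() mappings, and the rest usually carries over.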
One feature that is easy to implement but can have a high impact is the facets parameter, so it should form part of your base template. It splits your visualization window into multiple plots, which gives a clearer idea of the dynamics of specific factors.
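As an illustration of faceting, the sketch below stacks the two animals into one data frame and gives each its own panel with facet_wrap(). The animal and obs columns are hypothetical additions of mine, and facet_wrap() is one way to facet in the ggplot() interface; the original may have used the facets argument of qplot() instead:

```r
library(ggplot2)

# Tag each series with a hypothetical animal label and index column.
b1 <- datasets::beaver1
b1$animal <- "beaver 1"
b1$obs <- seq_len(nrow(b1))

b2 <- datasets::beaver2
b2$animal <- "beaver 2"
b2$obs <- seq_len(nrow(b2))

# Both frames have identical columns, so rbind() stacks them cleanly.
beavers <- rbind(b1, b2)

p <- ggplot(beavers, aes(x = obs, y = temp, colour = factor(activ))) +
  geom_point() +
  # One panel per animal, stacked vertically.
  facet_wrap(~ animal, ncol = 1) +
  labs(x = "Observation number", y = "Temperature (degrees C)",
       colour = "Active")

print(p)
```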
A feature of the package that does irritate me is the grey background that is set as the default. A much clearer and widely used format is a theme with a white background and black gridlines.
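A short sketch of swapping the theme, assuming one of ggplot2's built-in complete themes is what is meant here. theme_bw() gives a white panel with a dark border and light grey gridlines, while theme_linedraw() draws the lines in black on white, which is closer to the description above:

```r
library(ggplot2)

beaver <- datasets::beaver1
beaver$obs <- seq_len(nrow(beaver))  # hypothetical index column

base_plot <- ggplot(beaver, aes(x = obs, y = temp)) +
  geom_point()

# Apply a theme to a single plot.
p_bw <- base_plot + theme_bw()
p_linedraw <- base_plot + theme_linedraw()

# Or change the default for every later plot in the session;
# theme_set() returns the previous theme so it can be restored.
old_theme <- theme_set(theme_bw())
print(base_plot)
theme_set(old_theme)
```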
This concludes our short discussion of the ggplot2 package. Having learned to plot with the base features in R, I do find the notational difference hard to absorb when I am coding, and it doesn’t come naturally yet. The package is, however, very well built, with great flexibility in its features. I especially enjoy how it integrates with the caret machine learning package.
Despite not using the package on a day-to-day basis, after using it again to write this post I do find myself thinking: why am I not…