
A useful feature of R is its ability to implement a function differently depending on the ‘class’ of the object acted on. This article explores this behaviour with reference to a playful modification of the ‘generic’ function plot() to allow plotting of cartoon bicycles. Although the example is quite simple and fun, the concepts it touches on are complex and serious.

The example demonstrates several of the programming paradigms that R operates under. R is simultaneously object-orientated, functional and polymorphic. The example also touches on inheritance, in the loose sense that arguments are passed on from plot.bike() to plot() via the ... symbol. Much has been written about programming paradigms and R’s adherence to (or flouting of!) them. Two useful references on the subject are a Wikibook page on programming language paradigms and Hadley Wickham’s Advanced R book. For the purposes of the examples presented here, suffice it to say that R uses multiple paradigms and is extremely flexible.

## Context: an advanced R course

The behaviour of ‘generic functions’ such as plot() was taught during a two-day course by Colin Gillespie in Newcastle. In it we learned about some of the nuts and bolts that underlie R’s uniquely flexible and sometimes bizarre syntax. Environments, start-up, functions and classes were some of the topics covered. These and other issues are described in multiple places on-line and in R’s own documentation, and neatly synthesised in Hadley Wickham’s penultimate book, Advanced R. However, nothing beats face-to-face learning, and I learned plenty about R’s innards during the course, despite having previously read around the topics covered.

Colin has made his materials available on-line on GitHub for the benefit of people worldwide. http://rcourses.github.io/ contains links to pages introducing a number of courses which can, to a large extent, be followed from the safety of one’s home. There are also R packages for each of the courses. The package for the Advanced R course, for example, can be installed with the following code:

Once the package has been installed and loaded, with library(nclRadvanced), a number of vignettes and solutions sheets can be accessed, e.g. via:

vignette(package = "nclRadvanced")
vignette(package = "nclRadvanced", "practical2")

## Creating a new S3 class for bikes

The S3 class system is very flexible. Any object can be allocated to a class of any name, without restriction. S3 classes only become meaningful when objects allocated to a particular class are passed to a function that recognises classes. Functions that behave differently depending on the class of the object they act on are known as generic.
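To make the idea of a generic concrete, here is a minimal sketch. The generic describe() and its methods are hypothetical names invented for illustration; they are not part of base R or of the course materials. The generic does nothing except call UseMethod(), which picks a method based on the class of its first argument.

```r
# A hypothetical generic: UseMethod() looks for describe.<class>,
# falling back to describe.default if no class-specific method exists
describe <- function(x, ...) UseMethod("describe")

# Fallback for objects of any other class
describe.default <- function(x, ...) "an object with no special description"

# Method used only for objects of class "bike"
describe.bike <- function(x, ...) paste("a bike with wheel size", x$ws)

# Any list can be given the class "bike"; S3 makes no checks
b <- structure(list(ws = 700), class = "bike")

describe(b)    # dispatched to describe.bike()
describe(1:3)  # dispatched to describe.default()
```

The same call, describe(), therefore runs different code depending on what it is given, which is all that ‘generic’ means here.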

We can find out which object system an object belongs to using the otype() function from the pryr package. A good example of an S3 class is “lm”: objects of this class are plotted in their own way because plot() dispatches them to the method plot.lm().

x <- 1:9
y <- x^2
m <- lm(y ~ x)
class(m)
## [1] "lm"
pryr::otype(m) # requires pryr to be installed
## [1] "S3"

Because the object system is so flexible, any class name can be allocated to any object, for example with class(x) <- "lm". If we enter this, plot(x) will dispatch x to plot.lm() and fail, because x is not a fitted model.
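This failure can be sketched safely by catching the error rather than letting it stop the session. The object name fake is an assumption, used so that the x defined earlier is left untouched.

```r
fake <- 1:9
class(fake) <- "lm"   # allowed: S3 performs no checks on assignment
inherits(fake, "lm")  # the object now claims to be a linear model

# plot() dispatches to plot.lm(), which expects the components of a
# fitted model and therefore signals an error; tryCatch() captures it
result <- tryCatch(plot(fake), error = function(e) e)
inherits(result, "error")
conditionMessage(result)  # the reason plot.lm() gave up
```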

Classes only become useful when they have a series of generic methods associated with them. We will illustrate this by defining a list as a ‘bike’ object and creating a plot.bike(), a class-specific method of the generic plot function for plotting S3 objects of that class. Let’s define the key components of a bike:
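The original listing is not reproduced here, so the following is only a sketch of what such a definition might look like. The element name ws (wheel size) is used later in this article; the names ttl and stl, and all three values, are assumptions made for illustration.

```r
# Hypothetical bike definition: a plain list of dimensions
x <- list(
  ws  = 700,  # wheel size (diameter); name taken from the article
  ttl = 600,  # top tube length; assumed name and value
  stl = 500   # seat tube length; assumed name and value
)

# Allocating the list to the class "bike" is a single assignment
class(x) <- "bike"

class(x)
inherits(x, "bike")
```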

Note that there are no strict rules. We could allocate the class to any object, and we could replace bike with almost any name. The S4 class system, used in spatial data for example, is much stricter.

The bike class becomes useful when it comes to method dispatch, such as plotting.

## Creating a plot method for bikes

Suppose that every bike object has the same components as those contained in the object x created above. We can specify how it should be plotted as follows:
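The original method is not reproduced here. What follows is a minimal sketch of how such a method could be written with base graphics, not the author's implementation. It assumes a bike is a list with numeric elements ws, ttl and stl; only ws is named in this article, and the frame geometry is invented to give a recognisable cartoon.

```r
# Sketch of a plot method for S3 objects of class "bike".
# Assumes x is a list with numeric elements ws (wheel diameter), ttl (top
# tube length) and stl (seat tube length), all in the same units.
plot.bike <- function(x, ...) {
  r <- x$ws / 2  # wheel radius

  # Key points of a cartoon frame as (x, y) coordinates (assumed geometry)
  rear  <- c(0, r)                        # rear hub
  bb    <- c(1.1 * r, r)                  # bottom bracket
  seat  <- c(1.1 * r, r + x$stl)          # top of a vertical seat tube
  steer <- c(1.1 * r + x$ttl, r + x$stl)  # front end of the top tube
  front <- c(1.4 * r + x$ttl, r)          # front hub
  pts <- rbind(rear, bb, seat, steer, front)  # row names are the point names

  # Empty canvas with equal axis scales; ... is forwarded to plot(), so
  # arguments such as main = or axes = can be supplied by the caller
  plot(
    x = c(-r, front[1] + r), y = c(0, max(2 * r, seat[2])),
    type = "n", asp = 1, xlab = "", ylab = "", ...
  )

  # Wheels, with radii measured in plot units rather than inches
  symbols(
    x = c(rear[1], front[1]), y = c(r, r), circles = c(r, r),
    inches = FALSE, add = TRUE
  )

  # Frame tubes: each pair joins two named points
  from <- c("rear", "rear", "bb", "seat", "bb", "steer")
  to   <- c("bb", "seat", "seat", "steer", "steer", "front")
  segments(pts[from, 1], pts[from, 2], pts[to, 1], pts[to, 2])

  invisible(x)
}
```

Because xlab and ylab are fixed inside this sketch, passing them again through ... would raise an error; other graphical arguments, such as main, pass straight through to plot().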

Now that a new method has been added to the generic plot() function, the fun begins. Any object assigned to the class ‘bike’ will now automatically be dispatched to plot.bike() when plot() is called.

And, as the plots below show, a plot of a bicycle is produced.

plot(x)

Try playing with the wheel size - some bikes with quite strange dimensions can be produced!

x$ws <- 1500 # a bike with large wheels
plot(x)
x$ws <- 150 # a bike with small wheels
plot(x)

It would be interesting to see how the dimensions of the last bicycle compare with a Brompton!

## Discussion

The bike class demonstrates that the power of S3 classes lies not in the class’s object but in the generic functions which take on new methods. It is precisely this behaviour which makes the family of Spatial* classes defined by the sp package so powerful. sp adds new methods for plot(), aggregate() and even the subsetting function "[".

This can be seen by calling methods() before and after sp is loaded:

methods(aggregate)
## [1] aggregate.data.frame aggregate.default*   aggregate.formula*
## [4] aggregate.ts
## see '?methods' for accessing help and source code
library(sp) # load the sp library, which creates new methods
methods(aggregate) # the new method is now shown
## [1] aggregate.data.frame aggregate.default*   aggregate.formula*
## [4] aggregate.Spatial*   aggregate.ts
## see '?methods' for accessing help and source code
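The same effect can be reproduced without sp. As a sketch, the hypothetical method print.bike() below is added to the existing generic print(); the bike object and its ws element follow the assumptions used earlier in this article.

```r
# A toy object of class "bike" (ws is the wheel size)
b <- structure(list(ws = 700), class = "bike")

# Adding a method to an existing generic only requires a function
# named <generic>.<class>
print.bike <- function(x, ...) {
  cat("<bike> wheel size:", x$ws, "\n")
  invisible(x)
}

print(b)                  # dispatched to print.bike()
methods(class = "bike")   # lists the methods visible for the class
```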

Note that Spatial classes are different from the bike class because they use the S4 class system. We will be covering the nature and behaviour of Spatial objects in the “Spatial data analysis with R” course in Newcastle, 2nd - 3rd June, which is still open for registration.

The bike class is not ‘production’ ready but there is no reason why someone who understands bicycles inside out could not create a well-defined (perhaps S4) class for a bicycle, with all the essential dimensions defined. This could really be useful, including in efforts at making R more useful for transport planning, such as my package under development to provide tools for transportation research and analysis, stplanr.
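As a rough sketch of what a stricter definition could look like, the S4 class below declares the type of each dimension and rejects invalid values. The class name Bike, the slot names and the validity rule are all assumptions for illustration, not a proposed standard.

```r
library(methods)  # usually attached already; loaded explicitly to be safe

# Hypothetical S4 class: every slot must be numeric
setClass(
  "Bike",
  slots = c(ws = "numeric", ttl = "numeric", stl = "numeric"),
  validity = function(object) {
    dims <- list(object@ws, object@ttl, object@stl)
    # Each dimension must be a single, non-missing, positive number
    ok <- vapply(
      dims,
      function(d) length(d) == 1 && !is.na(d) && d > 0,
      logical(1)
    )
    if (all(ok)) TRUE else "ws, ttl and stl must each be a single positive number"
  }
)

# A valid bike is created without complaint
good <- new("Bike", ws = 700, ttl = 600, stl = 500)
good@ws

# Unlike S3, an impossible bike is refused at creation time
bad <- tryCatch(
  new("Bike", ws = -700, ttl = 600, stl = 500),
  error = function(e) e
)
inherits(bad, "error")
```

The trade-off is more ceremony up front in exchange for objects that are guaranteed to have the expected structure.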

Having learned about classes, I’m wondering whether origin-destination ‘flow’ data, used in stplanr, would benefit from its own class, or if its current definition as SpatialLinesDataFrame is sufficient. Any ideas welcome!

## Conclusion

Classes are an advanced topic in R that usually just ‘work’. However, if you want to modify existing functions to behave differently on new object types, understanding how to create classes and class-specific methods can be very useful. The example of the bike class created above is not intended for production, but provides a glimpse into what is possible. At the very least, this article should help provide readers with new insight into the inner workings of R and its impressive functional flexibility.

## Post script

If you are interested in using R for transport research, please check out my under-development package stplanr and let me know via GitHub of any features you’d like it to have before submission to CRAN and rOpenSci:

https://github.com/Robinlovelace/stplanr

Or tweet me on @robinlovelace