OO in R

September 13, 2012

(This article was first published on Digithead's Lab Notebook, and kindly contributed to R-bloggers)

“Is there a package for obfuscating code in #rstats?”, someone asked. “The S4 object system?!” came the snarky reply. If you’re smiling right now, you know that it wouldn’t be funny if it weren’t at least a little bit true.

Options: S3, S4 or R5?

There can be little doubt that object oriented programming in R is the cause of some confusion. We’ll look at S4 classes more closely in a minute, but be warned that S4 classes are just one of at least three object systems available to the R programmer:

  • S3: simple and lightweight
  • S4: formal classes implemented by the methods package
  • R5: Reference classes

It’s not super clear when to use which, at least not to me. It seems to depend strongly on style and personal preference. The Bioconductor folks, for example, make heavy use of S4 classes. Google, on the other hand, advises R programmers to “avoid S4 objects and methods when possible”.

Here’s the way it looks to me. S3 classes feel a bit like JavaScript classes – easy, loose and informal. S4 classes are rigid, verbose and harder to understand. But they offer a better separation between interface and implementation, along with some advanced features like multiple dispatch, validation and type coercion. Reference classes (aka R5) encapsulate mutable state and look more like familiar Java-style classes. They’re new, and their pass-by-reference semantics can violate the expectations of R users.
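
To make the comparison concrete, here is a minimal sketch of the same idea in the two other systems. The names person_s3, describe and PersonRC are hypothetical and exist only for illustration.

```r
library(methods)

# S3: a class is just an attribute on an ordinary object (here, a list).
# 'person_s3' and 'describe' are illustrative names, not from any package.
person_s3 <- function(name, age) {
  structure(list(name = name, age = age), class = "person_s3")
}

# an S3 generic dispatches on the class attribute via UseMethod
describe <- function(x, ...) UseMethod("describe")
describe.person_s3 <- function(x, ...) {
  paste(x$name, "is", x$age)
}

# Reference class: fields are mutable and methods modify the object in place
PersonRC <- setRefClass("PersonRC",
  fields = list(name = "character", age = "numeric"),
  methods = list(
    birthday = function() {
      age <<- age + 1
      invisible(.self)
    }
  )
)

ada <- person_s3("Ada", 36)
s3_description <- describe(ada)

grace <- PersonRC$new(name = "Grace", age = 40)
grace$birthday()   # no reassignment needed, grace itself changes
rc_age <- grace$age
```

The S3 version needs no class declaration at all, while the reference class changes its own state when birthday() is called, which is exactly the behavior that surprises people used to R's copy semantics.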

An S4 class example

Now, let’s return to S4 classes with a simple example. First, we define a class to represent people.

# define an S4 class for people
setClass("Person",
  representation(name="character", age="numeric"),
  prototype(name=NA_character_, age=NA_real_)
)

A person has a name and an age, which default to NAs of their respective types – character string and numeric. For the sake of demonstrating polymorphism, let’s define a couple of subclasses.

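# define subclasses for different types of people
setClass("Musician", representation(instrument="character"), contains="Person")
setClass("Programmer", representation(language="character"), contains="Person")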
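
As a quick check of the defaults and the inheritance relationship, the sketch below redefines the classes so that it runs on its own. It assumes Musician adds a character slot named instrument, which is inferred from the objects created later in this post.

```r
library(methods)

setClass("Person",
  representation(name = "character", age = "numeric"),
  prototype(name = NA_character_, age = NA_real_))

# assumed slot layout for the subclass, inferred from later usage
setClass("Musician",
  representation(instrument = "character"),
  contains = "Person")

# slots that are not supplied fall back to the prototype values
anon <- new("Person")
defaults_are_na <- is.na(anon@name) && is.na(anon@age)

# a subclass instance has its own slots plus the inherited ones
m <- new("Musician", name = "Miles", instrument = "Trumpet")
musician_is_person <- is(m, "Person")
musician_slots <- slotNames("Musician")
```

Here is() answers the "is a" question for S4 objects, and slotNames() lists the inherited slots alongside the new one.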


There’s no reason not to write normal R functions that take S4 objects as arguments. Polymorphism is called for when an operation needs different implementations for different classes. In that case, we declare a generic function and attach a method to it for each class.

# create a generic method called 'talent' that
# dispatches on the type of object it's applied to
setGeneric("talent",
  function(object) {
    standardGeneric("talent")
  })

The following code implements the talent method for two subtypes of person, each with a talent for something.

setMethod("talent",
  signature("Programmer"),
  function(object) {
    paste("Codes in",
      paste(object@language, collapse=", "))
  })

setMethod("talent",
  signature("Musician"),
  function(object) {
    paste("Plays the",
      paste(object@instrument, collapse=", "))
  })
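
To see the dispatch at work, here is a self-contained sketch. It repeats the definitions under the assumption that Programmer and Musician extend Person with language and instrument slots, and it adds a plain Person (with made-up names throughout) to show what happens when no method applies.

```r
library(methods)

setClass("Person", representation(name = "character", age = "numeric"))
setClass("Programmer", representation(language = "character"), contains = "Person")
setClass("Musician", representation(instrument = "character"), contains = "Person")

setGeneric("talent", function(object) standardGeneric("talent"))

setMethod("talent", signature("Programmer"),
  function(object) paste("Codes in", paste(object@language, collapse = ", ")))
setMethod("talent", signature("Musician"),
  function(object) paste("Plays the", paste(object@instrument, collapse = ", ")))

# the same call picks a different implementation depending on the class
p <- talent(new("Programmer", name = "Ada", language = c("R", "C")))
m <- talent(new("Musician", name = "Nina", instrument = "Piano"))

# no method is defined for plain Person, so dispatch fails with an error
no_talent <- tryCatch(talent(new("Person", name = "Nobody")),
  error = function(e) "no applicable method")

# existsMethod() checks for a method defined directly on a signature
has_person_method <- existsMethod("talent", "Person")
```

Dispatch walks up the class hierarchy from the object's class, so a method defined for Person would serve as a fallback for every subclass that lacks its own.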

Now, let’s make some talented people.

# create some talented people
donald <- new("Programmer",
  name="Donald Knuth",
  language="MMIX")

coltrane <- new("Musician",
  name="John Coltrane",
  instrument=c("Tenor Sax", "Alto Sax"))

miles <- new("Musician",
  name="Miles Dewey Davis",
  instrument="Trumpet")

monk <- new("Musician",
  name="Thelonious Sphere Monk")

talent(miles)
[1] "Plays the Trumpet"

talent(donald)
[1] "Codes in MMIX"

talent(coltrane)
[1] "Plays the Tenor Sax, Alto Sax"


One common stumbling block with S4 classes concerns changes in state. For instance, we might want to give our hard-working employees a raise.

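setClass("Employee",
  representation(boss="Person", salary="numeric"),
  contains = "Person")

setGeneric("raise",
  function(object, percent=0) {
    standardGeneric("raise")
  })

setMethod("raise", signature("Employee"),
  function(object, percent=0) {
    object@salary <- object@salary * (1 + percent/100)
    object
  })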

True to its functional heritage, R deals with immutable values. Changes in state happen by making new objects. The trick is to return the new object from the mutator methods and capture it on the way out.

smithers <- new("Employee",
  name="Waylon Smithers",
  boss=new("Person",name="Mr. Burns"),
  salary=100000)

# doesn't work?!?!
raise(smithers, percent=15)
smithers@salary
[1] 100000

Setting a new salary creates a new value. Notice that we return the modified object from the raise function. Don’t forget to catch it.

# remember to reassign smithers to the new value
smithers <- raise(smithers, percent=15)
smithers@salary
[1] 115000
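
The copy semantics are easy to verify. The sketch below is self-contained. It assumes the raise method computes the new salary as salary * (1 + percent/100), which matches the output shown here, and it simplifies Employee by dropping the boss slot.

```r
library(methods)

setClass("Person", representation(name = "character", age = "numeric"))
# simplified Employee without the boss slot, for illustration only
setClass("Employee", representation(salary = "numeric"), contains = "Person")

setGeneric("raise", function(object, percent = 0) standardGeneric("raise"))
setMethod("raise", signature("Employee"),
  function(object, percent = 0) {
    # this modifies the function's local copy, not the caller's object
    object@salary <- object@salary * (1 + percent / 100)
    object
  })

smithers <- new("Employee", name = "Waylon Smithers", salary = 100000)

# calling raise() without capturing the result leaves smithers unchanged
raise(smithers, percent = 15)
salary_before <- smithers@salary

# capturing the returned object is what makes the change stick
smithers <- raise(smithers, percent = 15)
salary_after <- smithers@salary
```

The assignment inside the method only ever touches a local copy, which is why the return value has to be the whole modified object.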

Multiple Inheritance

Through the magic of multiple inheritance, the lowly Code Monkey is both a programmer and an employee. Just set the contains value to indicate its two parent classes.

setClass("Code Monkey",
  contains=c("Programmer", "Employee"))

setMethod("talent",
  signature("Code Monkey"),
  function(object) {
    paste("Codes in",
      paste(object@language, collapse=", "),
        "for", object@boss@name)
  })

chris <- new("Code Monkey",
  boss=new("Person", name="The Man"),
  language=c("Java", "R", "Python", "Clojure"))

talent(chris)
[1] "Codes in Java, R, Python, Clojure for The Man"
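
A quick way to confirm the inheritance structure is to ask R directly. This sketch is self-contained and assumes the class layout described in this post: Programmer and Employee both extend Person, and Code Monkey extends both. The name given to chris is made up for the example.

```r
library(methods)

setClass("Person", representation(name = "character", age = "numeric"))
setClass("Programmer", representation(language = "character"), contains = "Person")
setClass("Employee", representation(boss = "Person", salary = "numeric"),
  contains = "Person")

# two parent classes, listed in the contains argument
setClass("Code Monkey", contains = c("Programmer", "Employee"))

chris <- new("Code Monkey",
  name = "Chris",   # hypothetical value, for illustration only
  boss = new("Person", name = "The Man"),
  language = c("Java", "R"))

# the object counts as an instance of both parents and of Person
is_programmer <- is(chris, "Programmer")
is_employee <- is(chris, "Employee")
is_person <- is(chris, "Person")

# slots from both parents are merged; the shared Person slots appear once
monkey_slots <- slotNames("Code Monkey")
```

Because both parents inherit from Person, the name and age slots reach Code Monkey by two paths, and S4 merges them into a single pair of slots.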

So, there you have it – encapsulation, polymorphism and inheritance in S4 classes. Complete code for this example is in gist:3670578.

OO in R resources

It’s lucky that there are loads of places to go to learn about S4 classes.
