R OOP – a little privacy please?

August 23, 2014

(This article was first published on Odd Hypothesis, and kindly contributed to R-bloggers)

Lately, I’ve been making heavy use of Reference Classes in R. They are easier for me to wrap my mind around since their usage style is closer to that of “traditional” OOP languages such as Java. Primarily, object methods are part of the class definition and are accessed via the instantiated object.

For instance:
With S3/S4 classes, you define an object. Then you define separate generic functions that operate on the object:

# class and object method definition
myClass = setClass('myClass', ...)
print.myClass = function(x){...}

# so then ...
obj = myClass(...)
print(obj)
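
To make the pattern concrete, here is a minimal runnable sketch of the S4 flavour. The class `Person`, its `name` slot, and the `greet` generic are hypothetical names chosen for illustration; they are not from the original code.

```r
library(methods)

# Define the class; setClass() returns a generator function
Person = setClass('Person', representation(name = 'character'))

# The generic and its method live outside the class definition
setGeneric('greet', function(x, ...) standardGeneric('greet'))
setMethod('greet', 'Person', function(x, ...) paste('Hello,', x@name))

# so then ...
p = Person(name = 'Ada')
greet(p)
```

Note that `greet()` could just as well be defined in a different file from `Person`, which is the separation described above.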

With Reference classes, you define an object and therein the methods the object employs:

# class and object method definition
myClass = setRefClass('myClass',
  fields = list(),
  methods = list(print = function(){...}))

# so then ...
obj = myClass(...)
obj$print()
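
For comparison, a small runnable sketch of the Reference Class flavour. The `Counter` class, its `count` field, and the `increment` method are made-up names for illustration only.

```r
library(methods)

# Fields and methods are declared together in the class definition
Counter = setRefClass('Counter',
  fields = list(count = 'numeric'),
  methods = list(
    increment = function() {
      # <<- modifies the object's own field
      count <<- count + 1
      invisible(.self)
    }
  ))

# so then ...
cnt = Counter$new(count = 0)
cnt$increment()
cnt$increment()
cnt$count
```

The object is modified in place: no reassignment of `cnt` is needed after calling `increment()`.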

In the grand scheme of things, both ways of defining objects and their methods are pretty much equivalent. From a coding perspective, the S3/S4 style allows object methods to be defined separately from the object class (e.g. in separate files if one prefers). The appeal of Reference Classes is that the objects they define know what methods they have.

Privacy issues

The one aspect of OOP in R that I’ve been trying to work out is how to implement private methods and fields - i.e. only visible/usable from within the scope of the object, thus not user callable/mutable. There is (currently) no official way to specify these in base R.

A little research reveals that the best one can do is obfuscate.
Roxygen2 will not document a RefClass method that lacks a docstring, but the method is still available to users who introspect the object interactively (à la tab completion in RStudio).

The best suggestions I’ve come across are:

  1. build a package around your RefClass definition and use non-exported package functions for private methods
  2. define private methods as functions within the public methods they are used in

Option 1 is probably the better of the two, as it lets R’s namespace rules do the dirty work. However, it does require writing functions of the form:

privateFun = function(obj, ...) {
  # do stuff
  # obj is a reference object, so a plain assignment modifies it in place
  obj$field = newValue
}

Option 2 would likely require much code replication or, at the very least, source()-ing the requisite code wherever a private function is required: a far from ideal development/debugging situation.
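
As a rough sketch of what Option 2 looks like in practice, the helper below exists only inside the public method that uses it. The `Temperature` class, the `describe` method, and the `toFahrenheit` helper are hypothetical names for illustration.

```r
library(methods)

Temperature = setRefClass('Temperature',
  fields = list(celsius = 'numeric'),
  methods = list(
    describe = function() {
      # "private" helper: defined here, so it is not reachable from obj$
      toFahrenheit = function(x) x * 9 / 5 + 32
      paste0(celsius, ' C = ', toFahrenheit(celsius), ' F')
    }
  ))

temp = Temperature$new(celsius = 100)
temp$describe()
```

Any other method needing `toFahrenheit` would have to define it again, which is the replication problem mentioned above.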

From a high-level view, Reference Classes are environments with added bells and whistles. What’s interesting is what happens if I define a class like so:

myClass = setRefClass('myClass',
  fields = list(
    pubField = 'character',
    .prvField = 'character'
  ),
  methods = list(
    pubMethod = function(){print('public')},
    .prvMethod = function(){print('private')}
  )
)

Notice that the “.prvField” field and the “.prvMethod” method use a . to prefix their names. In R this is a way of creating a “hidden” variable – akin to hidden files on a *nix OS.

When I try to ls() the resultant object, I get:

> obj = myClass()
> ls(env = obj)
[1] "getClass" "pubField"

So, it is within the realm of possibilities!
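
It is worth stressing that this is hiding, not protection. The sketch below (using a hypothetical `Secretive` class whose methods return strings instead of printing) shows that dot-prefixed members are skipped by a plain `ls()` but show up with `all.names = TRUE` and remain callable.

```r
library(methods)

Secretive = setRefClass('Secretive',
  fields = list(
    pubField = 'character',
    .prvField = 'character'
  ),
  methods = list(
    pubMethod = function() 'public',
    .prvMethod = function() 'private'
  ))

obj = Secretive$new()

visible = ls(obj)                        # default listing skips dot-names
everything = ls(obj, all.names = TRUE)   # includes dot-names

'.prvField' %in% visible
'.prvField' %in% everything

# the "hidden" method can still be called by anyone who knows its name
obj$.prvMethod()
```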

Another alternative that I thought of was to make a field of the object an environment and then place private elements there. Again, making things private via obfuscation (and more typing). However, users could still access said elements by:

obj$private$field
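
A minimal sketch of that idea, assuming a hypothetical `Account` class with a field named `private` that holds an environment created in `initialize`:

```r
library(methods)

Account = setRefClass('Account',
  fields = list(private = 'environment'),
  methods = list(
    initialize = function(...) {
      # each object gets its own environment for "private" elements
      private <<- new.env()
      private$balance <<- 0
      callSuper(...)
    },
    deposit = function(amount) {
      private$balance <<- private$balance + amount
      invisible(.self)
    }
  ))

acct = Account$new()
acct$deposit(10)

# nothing stops a user from reaching in
acct$private$balance
```

The extra level of indirection keeps these elements out of the top-level listing of the object, but they remain fully readable and writable.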

R6 - a new hope

As I was putting the finishing touches on this post I read a great post by Romain Francois entitled “Pro Grammar and Devel Hoper” (kudos on the pun). Towards the end he links to an Rpub posted by Winston Chang entitled “Introduction to R6” which piqued my interest.

R6 is a new OOP system provided by the R6 package - posted to CRAN just 4 days ago. While similar to the existing Reference Class system (objects are specially wrapped environments), it also provides separation of public and private elements - exactly what I was looking for! Performance tests also show that R6 is faster and more memory efficient than Reference Classes.
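
Here is a small sketch of what that separation looks like, assuming the R6 package is installed. The `Vault` class and its members are hypothetical names for illustration.

```r
library(R6)

Vault = R6Class('Vault',
  public = list(
    deposit = function(amount) {
      # private members are reached through the `private` binding
      private$balance = private$balance + amount
      invisible(self)
    },
    report = function() private$balance
  ),
  private = list(
    balance = 0
  )
)

v = Vault$new()
v$deposit(5)$deposit(7)
v$report()

# the private field is not exposed on the object itself
is.null(v$balance)
```

Public methods can read and update `balance`, while `v$balance` returns `NULL` because the field is not part of the object's public environment.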

Suffice it to say, I’ll be checking R6 out.

