While developing some new simulation code with the S4 system, I ran into serious problems with computational efficiency. That led me to dig into the archives of Rhelp and Rdevel looking for clues. I found some interesting threads that address almost exactly the same problems that I have. Read for yourself here and here, including the follow-ups by John Chambers and others.

It is almost two years since I started to use S4 extensively for almost anything I develop in R. The transparency of the code and the ease of maintenance are so much greater in S4 than in S3, not to mention multiple inheritance, validity checks etc.
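As a small illustration of the validity checks mentioned above, here is a minimal sketch. The class `Proportion` and its rule are hypothetical and exist only for this example; they are not taken from my simulation code.

```r
library(methods)

# Hypothetical class: a numeric vector whose values must lie in [0, 1]
setClass("Proportion",
         representation(x = "numeric"),
         validity = function(object) {
           if (any(object@x < 0 | object@x > 1)) {
             "all values of x must lie between 0 and 1"
           } else {
             TRUE
           }
         })

# A valid object passes the check
p <- new("Proportion", x = c(0.2, 0.5))

# An invalid object is rejected at creation time; keep the error message
bad <- tryCatch(new("Proportion", x = 1.5),
                error = function(e) conditionMessage(e))
print(bad)
```

In S3 nothing comparable happens automatically: `structure(list(x = 1.5), class = "Proportion")` would be created without complaint.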

Things seem to have improved since 2003. This example, based on one of the posts mentioned above, gave the following timings back then:

setClass("MyClass", representation(x="numeric"))
system.time( structure(list(x=rep(1, 10^7)), class="MyS3Class") )
# [1] 1.05 0.00 1.05 NA NA
system.time( new("MyClass", x=rep(1, 10^7)) )
# [1] 3.15 0.34 11.19 NA NA


So S4 was at least 3 times slower in this case. Now, on my P4 3.2 GHz with R 2.7.0, the same code gave

setClass("MyClass", representation(x="numeric"))
system.time( structure( list(x=rep(1,10^7)), class="MyS3Class") )
#   user  system elapsed
#   0.74    0.19    0.94
system.time( new("MyClass",x=rep(1,10^7)))
#   user  system elapsed
#   0.80    0.18    1.06


which is comparable.

Nevertheless, I profiled my simulation code, and the output revealed that the majority of the CPU time was spent on method dispatch and related overhead, so the difference might still be substantial in practice. Right now my code works. Perhaps at some point I’ll port some portion of it to S3 and compare the results…
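To get a rough feel for the dispatch cost on its own, one could time many calls to a trivial generic in both systems. The sketch below is not my simulation code: the names `area`, `areaS4`, `SquareS3` and `SquareS4` are made up for illustration, the number of calls is arbitrary, and the timings will depend on the machine and the R version.

```r
library(methods)

# S3: a generic plus one method for a hypothetical class "SquareS3"
area <- function(shape) UseMethod("area")
area.SquareS3 <- function(shape) shape$side^2

# S4: the equivalent class, generic and method (all names are hypothetical)
setClass("SquareS4", representation(side = "numeric"))
setGeneric("areaS4", function(shape) standardGeneric("areaS4"))
setMethod("areaS4", "SquareS4", function(shape) shape@side^2)

s3 <- structure(list(side = 2), class = "SquareS3")
s4 <- new("SquareS4", side = 2)

n <- 1e5  # number of calls; an arbitrary choice

# The method bodies do almost no work, so the loops mostly measure dispatch
t3 <- system.time(for (i in seq_len(n)) area(s3))
t4 <- system.time(for (i in seq_len(n)) areaS4(s4))

# Elapsed seconds for n calls in each system
print(c(S3 = t3[["elapsed"]], S4 = t4[["elapsed"]]))
```

Unlike the object-creation example above, this loop isolates repeated method calls, which is closer to what the profiler pointed at in my case.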