Recursive assignment

June 2, 2014
By

(This article was first published on PirateGrunt » R, and kindly contributed to R-bloggers)

Here’s yet another example where I just need to read the help files. Before I go on, I should add my own notion as to why that’s not always easy to do. On loads of message boards, you’ll see people say- correctly- that the documentation is very clear on XYZ. True. But that’s only relevant if you read the bit of the documentation that actually matters to you and you have all of the context you need to understand the terse (though accurate!) descriptions there. It’s a bit like a bus schedule in Samarkand. Absolutely clear and useful if you’re in central Asia and know where you are and where you need to go and when you need to get there. If you’ve been walking the Silk Road for weeks and can’t tell Samarkand from Tashqent, that bus schedule may not do you as much good. So it is with R documentation. Sometimes you’ll have to dust off your shoes, get patient and ask a stranger for help.

So what I had wanted to do was to understand something fairly basic. How is the following statement processed:

myObject$MyColumn[2] = "New value"

This is a typical method to manipulate individual cells in a data frame and a very natural way to structure custom R objects. So, when creating my own objects, how do I implement it? If there is customization, where does it take place? Do I access the element in the $ or the [] first? What assignment operator is being used?

To investigate, I created a very simple object with easy properties that I could assign.

setClass("Person", representation(FirstName = "character", LastName = "character", 
    Birthday = "Date"))

I then created two easy access and set methods. For reasons that will become clear in a moment, I also added a statement to indicate when the methods had been called.

setMethod("$", signature(x = "Person"), function(x, name) {
    print("Just called $ accessor")
    arguments <- as.list(match.call())
    slot(x, name)
})
setMethod("$<-", signature(x = "Person"), function(x, name, value) {
    print("Just called $ assignment")
    arguments <- as.list(match.call())
    slot(x, name) = value
    x
})

And I created a new object.

objPeople = new("Person", FirstName = c("Ambrose", "Victor", "Jules"), LastName = c("Bierce", 
    "Hugo", "Verne"), Birthday = seq(as.Date("2001/01/01"), as.Date("2003/12/31"), 
    by = "1 year"))

So, I can access the properties and my methods will tell me when they've been accessed. I can also assign to the member and I’ll be told when that happens as well.

objPeople$FirstName
## [1] "Just called $ accessor"
## [1] "Ambrose" "Victor"  "Jules"
objPeople$FirstName = "Joe"
## [1] "Just called $ assignment"

Now here’s the interesting bit. (Interesting if you’ve just gotten to the train station in Samarkand and are trying to find your hotel. Not so interesting if you’ve been in Uzbekistan for a few weeks.)

objPeople$FirstName[2] = "Joe"
## [1] "Just called $ accessor"
## [1] "Just called $ assignment"

The assignment produced a call to the accessor function? Why? The answer may be found in one of two places. One is the very clear, concise and speedy answer that I got to a question I posed on StackOverflow, which may be read here. Two is the R documentation, which may be found here.

This will tell us that the following two sets of statements are equivalent. (For the rest of the post, I’m suppressing output, so the messages about when the ‘$’ operators are called will not appear.)

objPeople$FirstName[2] = "Joe"

`*tmp*` <- objPeople
objPeople <- `$<-`(`*tmp*`, name = "FirstName", value = `[<-`(`*tmp*`$FirstName, 
    2, value = "Joe"))

So what’s happening? When I want to assign to a subset, three things take place. First, I use my accessor to sort out precisely which value I’m extracting from. Next, I use bracket assignment to alter the elements of a subset of that vector. Finally, I assign the whole vector back to the component of my object. This is a bit easier to see, if we take the steps one at a time.

gonzo = objPeople$FirstName
mojo = `[<-`(gonzo, 2, value = "Joe")
objPeople = `$<-`(objPeople, "FirstName", mojo)

This is why the accessor is not called if there is no subset in the assignment. In that case, the equivalent expression is simply the following:

objPeople = `$<-`(objPeople, "FirstName", "Joe")

Welcome to Uzbekistan. Please enjoy our fine network of buses.


To leave a comment for the author, please follow the link and comment on his blog: PirateGrunt » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.