I gave a talk titled “Parallel Computation in R: What We Want, and How We (Might) Get It” at last week’s useR! 2017 conference in Brussels. You can view my slides here, and I believe the conference organizers said the videos would be placed online, though I am not sure of that.
The goal of the talk was to propose general design patterns for parallel computation in R, meaning general approaches that should be useful in many applications. I emphasized that this was just one person’s opinion, and expected the Spark fans to disagree with my view that Spark is not a very useful tool for useRs. Actually, several speakers in other talks were negative about Spark as well. One gentleman did try to defend Spark during the Q&A, but when he talked to me afterward, he turned out not to be a huge Spark fan after all, and was largely just playing the devil’s advocate.
My examples of course involved partools, the package I’ve been developing for parallel computation in R. (Duncan Temple Lang’s PhD student Clark Fitzgerald is now involved in developing the package as well.) However, I noted that the same general principles could be applied with some other packages, such as ddR and multidplyr.
There were of course a number of excellent talks, many more than I could attend. Among the ones I did attend, I would mention a few in particular:
- A talk by Nick Ulle, another student of Duncan’s, about his project to bring the LLVM compiler world to R. This is a tough challenge, but Nick is making impressive progress.
- A talk by Kylie Bemis, a postdoc at Northeastern University, on her matter file system R package, which does distributed file allocation in a clever, general manner.
- I did not get to see Jim Harner’s talk about his R IDE, rc2, but he demonstrated it for me on his laptop, and it looked very interesting.
- Microsoft’s David Smith, one of the pioneers of the S/R world, gave an interesting “then and now” talk. He listed questions that non-useRs would ask a few years ago when he suggested that they switch to R, but which they no longer ask. This demonstrates the huge increase in R usage in recent years, along with its increase in power and usability.
My wife and I had fun exploring Brussels — one wrong decision in a subway station resulted in our ending up in front of the EU headquarters, an interesting error to make. And by an amazing stroke of good luck, the other summer conference at which I’ll be giving a talk, Small Area Estimation 2017, is to be held in Paris the very next week.