The fourth EARL London conference took place last week, and once again it was an enjoyable and informative showcase of practical applications of R. Kudos to the team from Mango for hosting a great event featuring interesting talks and a friendly crowd.
As always, there were more talks on offer than I could attend (most of the event ran in three parallel tracks), but here are a few highlights I was able to catch:
Jenny Bryan from RStudio gave an outstanding keynote: "Workflows: You Should Have One" (view the slides here). The talk was a really useful collection of good practices for data analysis with R. Note I didn't say "best practices" there, and that's something that resonated with me from Jenny's talk: sometimes the best practices are the ones that are "good enough" and don't add unnecessary complexity or fragility. Software Carpentry's "Good Enough Practices for Scientific Computing" is a great read that I hadn't come across before this talk.
Rachel Kirkham from the National Audit Office uses R and Shiny to scrutinize public spending on behalf of taxpayers in the UK. (View the slides here.) One interesting benefit of R for this application is the ability to bring R to the data. (Update Sep 21: I've edited this to remove my error mischaracterizing Rachel's talk. Rachel clarified via email: "We are deploying Shiny apps on client systems to eliminate the need to transfer data and associated risks of data loss. We are fully compliant with the Data Protection Act and associated data handling policies.")
Joy McKenney from Northumbrian Water gave a truly fascinating talk on using R to monitor and predict flows in a sewer system. (View the slides here.) One surprising application from this talk was being able to model whether an overflowing sewer line is due to rainfall or to a blockage in the system: a tool that can be used to detect "fatbergs" clogging up the system.
Pieter Vos from Philips Research showed how they deploy applications to doctors to evaluate things like cancer risk. (View the slides here.) The applications call out to R to make the underlying statistical calculations.
Joe Cheng from RStudio gave a remarkable talk on a major concept for the R language itself: "promises". (View the slides here.) With promises, you can ask for the results of a long-running computation, but get back to the R command line instantaneously. The results will be available to you, well, when they're ready. It sounds like magic, but it's already working (in beta) with the promises package and promises (pun intended) to bring more responsiveness to Shiny applications.
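To make the idea concrete, here is a minimal sketch (not code from Joe's talk) of what this looks like with the promises package working alongside the future package. The `slow_mean()` function is a hypothetical stand-in for any long-running computation, and the one-second sleep is just there to simulate the delay.

```r
# Sketch only: assumes the 'promises', 'future' and 'later' packages
# are installed from CRAN.
library(promises)
library(future)

# Run futures in background R sessions so the main session stays free.
plan(multisession)

# Hypothetical stand-in for a long-running computation.
slow_mean <- function(x) {
  Sys.sleep(1)
  mean(x)
}

result <- NULL

# future() returns right away; then() registers a callback that runs
# once the value is ready.
then(
  future(slow_mean(c(1, 2, 3, 4))),
  onFulfilled = function(value) {
    result <<- value
    message("Result is ready: ", value)
  }
)

message("Control is back while the computation is still running.")

# Only needed in a non-interactive script: pump the 'later' event loop
# until the callback has fired. At the R console or inside Shiny, the
# event loop runs on its own.
while (is.null(result)) later::run_now(timeoutSecs = 0.1)
```

At an interactive prompt you would leave off the final loop: the "ready" message simply appears once the background session finishes.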
Ashley Turner from Transport for London gave a talk that I was very sad I couldn't see (it conflicted with my own session), but fortunately the slides are available. R has been used for more than two years to model how Londoners get about via car, bus and Tube. It was used for an official report on London transportation, which included some fascinating data on the routes certain commuters prefer for getting from (say) King's Cross to Waterloo.
Matthew Upson from the Government Digital Service reported on the ongoing program to implement a reproducible data science workflow for creating official reports. (View the slides here.) He says that this process has reduced production time for some official reports by 75%, and you can learn more about the program here.
I was also informed by the organizers that my own talk on Reproducible Data Science with R (slides here) won the "award" for loudest presentation at the conference. I think the audio desk had the volume up a little too high for this video!
As I said, there are many more excellent talks beyond those listed above. You can explore the other talks by clicking here and then clicking through each speaker portrait — slides, where available, are linked at the bottom of each abstract.
The next EARL conference will take place in Boston, November 1-3, and promises to include even more practical applications of R. I hope to see you there!