Beth Ashlee, Data Scientist
After a successful first-ever EARL San Francisco in June, it was time to head back to the birthplace of EARL: London. With more abstracts submitted than ever before, the conference was made up of 54 fantastic talks and 5 keynotes from an impressive selection of industries. With so many talks to pick from, I thought I would summarise a few of my favourites!
Day 1 highlights:
After brilliant keynotes from Tom Smith (ONS) and RStudio’s Jenny Bryan in session 1, Derek Norton and Neera Talbert from Microsoft took us through the Microsoft process of moving a company from SAS to R in session 2. They explained that, with the aim of shrinking the ‘SAS footprint’, it’s important to think about the drivers behind a company leaving SAS as well as the impact on end users. Their approach focused on converting program logic rather than specific code.
After lunch, Luisa Pires discussed the digital change occurring within the Bank of England. She highlighted the key selling points behind choosing R as a platform and the process behind organising their journey. They first ran divisional R training, before going on to produce a data science training programme to enable the use of R as a project tool.
Finishing up the day, Jessica Peterka-Bonetta gave a fascinating talk on sentiment analysis when considering the use of emojis. She demonstrated that even though adding emojis into your sentiment analysis can add complexity to your process, in the right context they can add real value to tracking the sentiment of a trend. It was an engaging talk which prompted some interesting audience questions, such as “What about combinations of emojis; how would they affect sentiment?”
After a Pimm’s reception it was all aboard the Symphony Cruiser for a tour of the River Thames. On board we enjoyed food, drinks and live music (which resulted in some impromptu dancing, but what happens at a conference stays at the conference!).
Day 2 highlights:
Despite a ‘lively’ evening, EARL kicked off in full swing on Thursday morning. There were three fantastic keynotes, including one from Hilary Parker, whose talk was filled with analogies and movie references to describe her reproducible workflow methods. Definitely something I could relate to!
The morning session included a talk from Mike Smith from Pfizer. Mike showed us his use of Shiny as a tool for determining wait times when submitting large jobs to Pfizer’s high-performance compute grid. Mike used real-time results to visualise whether it was beneficial to submit a large job at the current time or wait until later. He outlined some of the frustrations of changing data sources in such a large company and his reluctance to admit he was ‘data science-ing’.
After lunch, my colleague Adnan Fiaz gave an interesting talk on the data pipeline, comparing it to the process involved in oil pipelines. He spoke of the methods and actions that must be taken before being able to process your data and generate results. The comparison showed surprising similarities between the two processes, and clarified the approach that we must take at Mango to ensure we can analyse data safely, efficiently and appropriately.
The final session of day 2 finished on a high, with a packed room gathering to hear from RStudio’s Joe Cheng. Joe highlighted the exciting new asynchronous programming feature that will be available in the next release of Shiny. This new feature is set to revolutionise the responsiveness of Shiny applications, allowing users to overcome the restrictions of R’s single-threaded nature.
I get so much out of EARL each year and this year was no different; I just wish I had a Time-Turner to get to all the presentations!
On behalf of the rest of the Mango team, a massive thank you to all our speakers and attendees. We hope you enjoyed EARL as much as we did!
All slides we have permission to publish are available under the Speakers section on the EARL conference website.