Why R? Warsaw 2019 Recap

Posted on October 4, 2019 by Marcin Dubel in R bloggers

[This article was first published on r – Appsilon Data Science | End to End Data Science Solutions, and kindly contributed to R-bloggers].

I’ve just come back from an exhausting yet exciting weekend at the WhyR? conference. This was the third edition, held this year in Warsaw, and it is nice to see how it grows each year. The Appsilon Data Science team really appreciates the initiative and the professional organisation, so we decided to sponsor the event and give two talks at the conference!

As the conference is getting bigger, this time the initiator and founder of the WhyR? foundation, Marcin Kosiński (@kosinski_rblog), was helped by Michał Burdukiewicz (@burdukiewicz) and Piotr Wójcik, head of the Data Science Lab at the Faculty of Economic Sciences of the University of Warsaw, which was the venue of this year’s conference. It was a nice nostalgic trip for me personally, as I graduated in econometrics and computer science from this faculty. I’m glad that the place has been renovated since then and that conference participants could enjoy it fully.

WhyR Workshops

The conference started for me with Friday’s workshop session. Participants could choose from a variety of topics, including deep learning, C++ in R, and Explainable AI (XAI). As I’m more focused on Shiny development these days, I wanted to catch up with data science, so I chose “machine learning pipelines with {mlr3}” by Jakob Richter (@jak0br) and Patrick Schratz (@pjs_228). That was super useful! I was not aware of how much the {drake} package can simplify the project workflow.

Thank you @pjs_228 for great workshop on `drake` package during @whyRconf! Super smooth with using `usethis::use_course("mlr-org/mlr3-learndrake")` for sharing the materials. I would love to see it in use on all #rstats conferences, lectures and workshops.

— Marcin Dubel (@DubelMarcin) September 27, 2019

In the afternoon I joined the spatial data analysis workshop led by an expert in the field, Jakub Nowosad (@jakub_nowosad), who showed us the great features of the {tmap} package. If you’re interested in that field, check out the recent book on geocomputation with R, which is available for free online.

Thank you @jakub_nowosad for showing `tmap` package on @whyRconf. It’s basically #ggplot for maps!
I remember doing spatial analysis back in 2013 and it is great to see such progress in maps visualisation tools! #rstats https://t.co/plkXnHG3ah

— Marcin Dubel (@DubelMarcin) September 27, 2019

There was also one special workshop that lasted the whole day, “modern Generalized Additive Models” by Matteo Fasiolo (@fasiolo1985). The course comes highly recommended. I wish I could have bilocated to be there!

WhyR Lectures

Saturday

The keynotes were all great, as were some of the regular talks. Several presentations were prepared by students, and it was great to see the ambitious projects they are delivering and how they are launching their conference careers.

Let me go through the most interesting presentations. The first keynote was a strong kick-off by Marvin Wright, the author of the {ranger} package. In the era of deep neural networks and the accompanying hype around image recognition and deep fake videos, it was super useful to be reminded that random forests are a simple yet powerful tool for down-to-earth data analysis. They deal well with noisy, high-dimensional data and provide some interpretability through variable importance metrics. On the other hand, performance can be a problem in production solutions, because predictions from random forest models are generated quite slowly. I loved the “mythbusters” format of the talk: going through common opinions about random forests and confirming or denying them based on rigorous analysis.

Another great keynote was given by Jakub Nowosad (@jakub_nowosad) on the current challenges in the field of geo analysis, including map distortion. Did you know that our world without all of its water is potato-shaped?

Source: https://www.asu.cas.cz/~bezdek/vyzkum/rotating_3d_globe/index.php

Appsilon Talk

The most important event on Saturday for the Appsilon team was the talk about the {quanteda} package for textual analysis, given by Dr. Ken Benoit (@kenbenoit) of the London School of Economics and Damian Rodziewicz (@D_Rodziewicz). The package itself is a great tool, but I really admire Ken Benoit’s idea of opening up its possibilities to non-R users, or even non-programmers, via a user-friendly Shiny app prepared by the Appsilon team. Stay tuned for news about the upcoming release!

Dr. Kenneth Benoit and Damian Rodziewicz present quanteda at WhyR

Sunday

The day started with an excellent and important talk by Steph Locke (@TheStephLocke) about data scientists’ responsibilities to society. It’s important to remember that at the end of the day, models and predictions may affect actual human lives. One smart take-away: when presenting model performance to business people, use not only metrics but also demos of real use cases, and check whether the decision makers are fine with the models’ decisions.

Steph Locke’s points were expanded on by Appsilon member Olga Mierzwa-Sulima (@olga_mie) in her talk about the traits of world-class scientists. Turning results into useful solutions is a key factor! The talk was really well received and Olga got lots of questions. I guess the whole community is eager to see her give another speech ASAP.

Good takeout from @olga_mie talk at #whyR2019:
“A model without application is useless”

— Colin Fay (@_ColinFay) September 29, 2019

A similar approach was advocated by Wit Jakuczun (@WitJakuczun). Always be deploying! What creates value for the client is a production-ready solution: not just a model, but also the environment, tests, deployment, and continuous integration.

Colin Gillespie’s (@csgillespie) talk about secure R code is definitely worth mentioning too. The whole audience laughed at stories of people who got hacked by really simple tricks. Don’t be one of them! Since this talk, I check three times whether I have spelled “bioconductor” correctly every time I type it. A great warning from the talk is that the biggest threat to any system is ourselves.

I also enjoyed the talk given by Theo Roe (@theoJRivers1), who showed us an amazing app for analyzing water metrics in British rivers, as well as the presentation by Pablo Maldonado, which featured video analysis to improve softball player performance.

Last but not least, the organisation was great, the coffee tasted good, the lunches were nice, and the party was an excellent opportunity for networking. I’m eagerly awaiting next year’s conference!

Thanks for reading! If you have photos or links to your presentations, add them to the comments below and/or ping me on Twitter @dubelmarcin. You can catch Damian Rodziewicz (@D_Rodziewicz) at DataMass Gdansk presenting on “How to efficiently use huge satellite imagery datasets with Machine Learning.” If you are interested in Shiny application development, you can also check out my series Super Solutions for Shiny Architecture. And don’t forget to sign up for our newsletter!
