Not only LIME

June 18, 2018 | smarterpoland

I’ve heard about a number of consulting companies that decided to use a simple linear model instead of a black-box model with higher performance, because “client wants to understand factors that drive the prediction”. And usually the discussion goes as follows: “We have tried LIME for our black-box model, ... [Read more...]

Ceteris Paribus Plots – a new DALEX companion

June 1, 2018 | smarterpoland

If you like magical incantations in Data Science, please welcome the Ceteris Paribus Plots. Otherwise, feel free to call them What-If Plots. Ceteris Paribus (Latin for “all else unchanged”) Plots explain complex Machine Learning models around a single observation. They supplement tools like breakDown, Shapley values, LIME or LIVE. In ... [Read more...]

ML models: What they can’t learn?

May 20, 2018 | smarterpoland

What I love about conferences are the people who come up after your talk and say: It would be cool to add XYZ to your package/method/theorem. After eRum (a great conference, by the way) I was lucky to hear from Tal Galili: It would be cool to use DALEX ... [Read more...]

DALEX @ eRum 2018

May 15, 2018 | smarterpoland

The DALEX invasion has started with the workshop and talk @ eRum 2018. Find the workshop materials at DALEX: Descriptive mAchine Learning EXplanations. Tools for exploration, validation and explanation of complex machine learning models (thanks to Mateusz Staniak for running the second part of the workshop). And my presentation Show me your model 2.0! (thanks go ... [Read more...]

DALEX Stories – Warsaw apartments

April 13, 2018 | smarterpoland

This Monday we had a machine-learning-oriented meeting of Warsaw R Users. Prof Bernd Bischl from LMU gave an excellent overview of the mlr package (machine learning in R), then I introduced DALEX (Descriptive mAchine Learning EXplanations) and Mateusz Staniak introduced the live and breakDown packages. The meeting pushed me to ... [Read more...]

DALEX: how would you explain this prediction?

February 26, 2018 | smarterpoland

Last week I wrote about single variable explainers implemented in the DALEX package. They are useful for plotting the relation between a model output and a single variable. But sometimes we are more focused on a single model prediction. If our model predicts a possible drug response for a patient, we really ... [Read more...]

Top interactive visualizations of movie scripts

January 14, 2018 | smarterpoland

One of the greatest pleasures for an academic teacher is to be surprised by an extraordinary student’s project or homework. Something that greatly exceeds expectations. I’ve reoriented my courses to make such surprises frequent. The second project in my Data Visualisation classes was related to ... [Read more...]

chRistmas tRees

December 22, 2017 | smarterpoland

Every year, in the last classes before Christmas, I ask my students to create a Christmas tree in R. The classes are about Techniques of data visualisation and usually, at this point, we are discussing interactive graphics and tools like rbokeh, ggiraph, vegalite, googleVis, D3, rCharts or plotly. I like ... [Read more...]

archivist: Boost the reproducibility of your research

December 14, 2017 | smarterpoland

A few days ago the Journal of Statistical Software published our article (in collaboration with Marcin Kosiński) archivist: An R Package for Managing, Recording and Restoring Data Analysis Results. Why should you care? Let’s see. Starter: Would you want to retrieve a ggplot2 object with the plot on ... [Read more...]

Explain! Explain! Explain!

December 3, 2017 | smarterpoland

Predictive modeling is fun. With random forest, xgboost, lightgbm and other elastic models… Problems start when someone asks how predictions are calculated. Well, some black boxes are hard to explain. And this is why we need good explainers. In June, Aleksandra Paluszynska defended her master’s thesis Structure mining ... [Read more...]

intsvy: PISA for research and PISA for teaching

November 14, 2017 | smarterpoland

The Programme for International Student Assessment (PISA) is a worldwide study of 15-year-old school pupils’ scholastic performance in mathematics, science, and reading. Every three years more than 500 000 pupils from 60+ countries are surveyed along with their parents and school representatives. The study yields more than 1000 variables concerning performance, attitude and ... [Read more...]

DIY – cheat sheets

March 20, 2017 | smarterpoland

I recently found that, in addition to a great list of cheatsheets designed by RStudio, one can also download a template for new cheatsheets from the RStudio Cheat Sheets webpage. With this template you can design your own cheatsheet and submit it to the collection of Contributed Cheatsheets (Garrett Grolemund will ... [Read more...]

Is it a job offer for a Data Scientist?

January 10, 2017 | smarterpoland

TL;DR Konrad Więcko and Krzysztof Słomczyński (with a little help from me) have created a system that tracks which skills are currently in demand in job offers for data scientists in Poland. Which skills, how frequently, and how the demand is changing over time. The ... [Read more...]

PISA 2015 – how to read/process/plot the data with R

December 7, 2016 | smarterpoland

Yesterday the OECD published results and data from the PISA 2015 study (Programme for International Student Assessment). It’s a very cool study: over 500 000 pupils (15-year-olds) are examined every 3 years. Raw data is publicly available and one can easily access detailed information about pupils’ academic performance and detailed data from ... [Read more...]

Program of the European R users meeting [only 7 days to go]

October 3, 2016 | smarterpoland

The European R users meeting [eRum] is going to start in just 7 days. We expect over 250 participants, 10 invited talks, 47 regular talks, 13 lightning talks and 12 posters. In order to handle that much content we scheduled 18 sessions [+ workshops]. Find the program of the conference here or here. In the ... [Read more...]
