Today is Pi Day. A good day to announce the results of the first edition of the Data Science Masters competition for the best master's thesis. Out of the 72 submitted theses, 3 had to be chosen to receive a prize. The topics of these theses varied widely (the word cloud on the right was generated from their titles and abstracts). Theses were submitted from all over Poland (statistics on the universities are … Continue reading Results of the Data Science Masters competition for the best thesis in DS and ML

## How fractals helped my students to master package development in R

Last semester I taught an R programming course at MIMUW. My lectures are project-oriented, and the second project was about package development. The idea was straightforward: each team of students was to create a package that produces IFS fractals (based on iterated function systems). Each package was to have two generic functions, create() and plot(), documentation and … Continue reading How fractals helped my students to master package development in R
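For readers unfamiliar with iterated function systems, here is a minimal sketch of the idea, written in Python rather than R. It is not taken from any of the student packages: the names `create_ifs` and `iterate_ifs` are hypothetical, and the three affine maps are the standard ones for the Sierpinski triangle. Points are generated with the "chaos game": start anywhere, then repeatedly apply a randomly chosen map.

```python
import random

# Hypothetical helper names, chosen for illustration only.
def create_ifs(maps, weights=None):
    """Bundle affine maps (a, b, c, d, e, f), each meaning
    (x, y) -> (a*x + b*y + e, c*x + d*y + f), with selection weights."""
    if weights is None:
        weights = [1.0] * len(maps)
    return {"maps": list(maps), "weights": list(weights)}

def iterate_ifs(ifs, n_points=10000, seed=0, burn_in=20):
    """Run the chaos game and return a list of (x, y) points."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    points = []
    for i in range(n_points + burn_in):
        # Pick one affine map at random and apply it to the current point.
        a, b, c, d, e, f = rng.choices(ifs["maps"], weights=ifs["weights"])[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        # Skip the first few iterations so the point settles onto the fractal.
        if i >= burn_in:
            points.append((x, y))
    return points

# Three contractions, each halving distances toward one triangle vertex.
sierpinski = create_ifs([
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),
])
points = iterate_ifs(sierpinski, n_points=5000)
```

Passing `points` to any scatter-plot routine should reveal the fractal; the plotting step is left out here to keep the sketch free of dependencies.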

## DALEX: how would you explain this prediction?

Last week I wrote about single variable explainers implemented in the DALEX package. They are useful for plotting the relation between a model output and a single variable. But sometimes we are more interested in a single model prediction. If our model predicts a possible drug response for a patient, we really need to know which factors … Continue reading DALEX: how would you explain this prediction?

## DALEX: understand a black box model – conditional responses for a single variable

Black-box models, such as random forests or gradient boosting models, are commonly used in predictive modelling because of their flexibility and high accuracy. The problem is that it is hard to understand how a single variable affects model predictions. As a remedy one can use excellent tools like the pdp package (Brandon Greenwell, pdp: An R … Continue reading DALEX: understand a black box model – conditional responses for a single variable
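The general idea behind a partial dependence profile can be sketched in a few lines. This is a plain Python illustration of the technique, not the API of pdp or DALEX: the function `partial_dependence`, the toy `black_box` model and the column names `age` and `dose` are all made up for the example. For each value on a grid, the chosen variable is forced to that value in every row and the model predictions are averaged.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    profile = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original data is untouched
            modified[feature] = value  # overwrite the variable of interest
            total += predict(modified)
        profile.append(total / len(rows))
    return profile

# Toy stand-in for a black-box model (an assumption for illustration).
def black_box(row):
    return 2.0 * row["age"] + row["dose"] ** 2

rows = [
    {"age": 1.0, "dose": 0.0},
    {"age": 2.0, "dose": 1.0},
    {"age": 3.0, "dose": 2.0},
]
profile = partial_dependence(black_box, rows, "dose", [0.0, 1.0, 2.0])
```

Plotting `profile` against the grid gives the curve that shows how the average prediction responds to `dose`.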

## Top interactive visualizations of movie scripts

One of the greatest pleasures for an academic teacher is to be surprised by a student’s extraordinary project or homework. Something that greatly exceeds expectations. I’ve reoriented my courses to make such surprises more frequent. The second project in my Data Visualisation classes was related to interactive graphics. The task was to create … Continue reading Top interactive visualizations of movie scripts

## chRistmas tRees

Every year, in the last class before Christmas, I ask my students to create a Christmas tree in R. The classes are about Techniques of data visualisation and usually, at this point, we are discussing interactive graphics and tools like rbokeh, ggiraph, vegalite, googleVis, D3, rCharts or plotly. I like this exercise because with most … Continue reading chRistmas tRees

## archivist: Boost the reproducibility of your research

A few days ago the Journal of Statistical Software published our article (written in collaboration with Marcin Kosiński) archivist: An R Package for Managing, Recording and Restoring Data Analysis Results. Why should you care? Let’s see. Starter: Would you like to retrieve a ggplot2 object with the plot on the right? Just call the following line … Continue reading archivist: Boost the reproducibility of your research

## Explain! Explain! Explain!

Predictive modeling is fun. With random forest, xgboost, lightgbm and other flexible models… Problems start when someone asks how predictions are calculated. Well, some black boxes are hard to explain. And this is why we need good explainers. In June Aleksandra Paluszynska defended her master's thesis Structure mining and knowledge extraction from random … Continue reading Explain! Explain! Explain!