Predictive modeling is fun. With random forest, xgboost, lightgbm and other elastic models…
Problems start when someone is asking how predictions are calculated.
Well, some black boxes are hard to explain.
And this is why we need good explainers.
In the June Aleksandra Paluszynska defended her master thesis Structure mining and knowledge extraction from random forest. Find the corresponding package randomForestExplainer and its vignette here.
In the September David Foster published a very interesting package xgboostExplainer. Try it to extract useful information from a xgboost model and create waterfall plots that explain variable contributions in predictions. Read more about this package here.
In the October Albert Cheng published lightgbmExplainer. Package with waterfall plots implemented for lightGBM models. Its usage is very similar to the xgboostExplainer package.
Waterfall plots that explain single predictions are great. They are useful also for linear models. So if you are working with lm() or glm() try the brand new breakDown package (hmm, maybe it should be named glmExplainer). It creates graphical explanations for predictions and has such a nice cheatsheet:
Install the package from https://pbiecek.github.io/breakDown/.
Thanks to RStudio for the cheatsheet’s template.