Articles by Christoph Molnar

Interpretable Machine Learning with iml and mlr

April 29, 2018 | Christoph Molnar

Machine learning models repeatedly outperform interpretable, parametric models like the linear regression model. The gains in performance have a price: The models operate as black boxes which are not interpretable. Fortunately, there are many methods that can make machine learning models interpretable. The R package iml provides tools for analysing ...

[Read more...]

RuleFit: When disassembled trees meet Lasso

October 3, 2015 | Christoph Molnar

The RuleFit algorithm from Friedman and Popescu is an interesting regression and classification approach that uses decision rules in a linear model. RuleFit is not a completely new idea, but it combines a bunch of algorithms in a clever way. RuleFit consists of two components: The first component produces "rules" ... [Read more...]

Random Forest Almighty

February 6, 2014 | Christoph Molnar

Random Forests are awesome. They do not overfit, they are easy to tune, they tell you about important variables, they can be used for classification and regression, they are implemented in many programming languages and they are faster than their competitors (neural nets, boosting, support vector machines, ...). Let us take ... [Read more...]

From OpenOffice noob to control freak: A love story with R, LaTeX and knitr

March 8, 2013 | Christoph Molnar

Lately I had to write a seminar paper for a class and I decided to overdo it. But let's start at the very beginning. Here is my evolution of how I used to write stuff and how I got from this: to that: School: OpenOffice - I guess everyone has ... [Read more...]

Misusage of the new shiny package: A nerdy drink tracker for your next party

December 30, 2012 | Christoph Molnar

Currently a lot of people are talking about the new shiny package. So I got curious and built my own, more or less useful app: a drink tracker. This app can be used to track how much someone drank and therefore it is very useful for every party, especial...

[Read more...]

Get the party started

December 22, 2012 | Christoph Molnar

Have you already used trees or random forests to model the relationship between a response and some covariates? Then you might like conditional trees, which are implemented in the party package. In contrast to the CART (Classification and Regression ... [Read more...]

Trees with the rpart package

November 13, 2012 | Christoph Molnar

What are trees? Trees (also called decision trees or recursive partitioning) are a simple yet powerful tool in predictive statistics. The idea is to split the covariate space into many partitions and to fit a constant model of the response variable in each partition. In the case of regression, the mean of ... [Read more...]

PCA or Polluting your Clever Analysis

August 31, 2012 | Christoph Molnar

When I learned about principal component analysis (PCA), I thought it would be really useful in big data analysis, but that's not true if you want to do prediction. I tried PCA in my first competition at Kaggle, but it delivered bad results. This post illustrates how PCA can pollute ...

[Read more...]