Don’t give up on single trees yet…. An interactive tree with Microsoft R

December 10, 2016
By

(This article was first published on R – Longhow Lam's Blog, and kindly contributed to R-bloggers)

tree

Introduction

A few days ago Microsoft announced their new Microsoft R Server 9.0 version. Among a lot of new things, it includes some new and improved machine learning algorithms in their MicrosoftML package.

  • Fast linear learner, with support for L1 and L2 regularization. Fast boosted decision tree. Fast random forest. Logistic regression, with support for L1 and L2 regularization.
  • GPU-accelerated Deep Neural Networks (DNNs) with convolutions. Binary classification using a One-Class Support Vector Machine.

And the nice thing is, the MicrosoftML package is now also available in the Microsoft R client version, which you can download and use for free.

Don’t give up on single trees yet….

Despite all the more modern machine learning algorithms, a good old single decision tree can still be useful. Moreover, in a business analytics context they can still keep up in predictive power. In the last few months I have created different predictive response and churn models. I usually just try different learners, logistic regression models, single trees, boosted trees, several neural nets, random forests. In my experience a single decision tree is usually ‘not bad’, often only slightly less predictive power than the more fancy algorithms.

An important thing in analytics is that you can ‘sell‘ your predictive model to the business. A single decision tree is a good way to to do just that, and with an interactive decision tree (created by Microsoft R) this becomes even more easy.

Here is an example: a decision tree to predict the survival of Titanic passengers.

The interactive version of the decision tree can be found on my GitHub.

Cheers, Longhow

To leave a comment for the author, please follow the link and comment on their blog: R – Longhow Lam's Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)