Cheat sheet for prediction and classification models in R

August 9, 2012

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Ricky Ho has created a 6-page PDF reference card on Big Data Machine Learning, with examples implemented in the R language. (A free registration to DZone Refcardz is required to download the PDF.) The examples cover:

  • Predictive modeling overview (how to set up test and training sets in R)
  • Linear regression (using lm)
  • Logistic regression (using glm)
  • Regression with regularization (using the glmnet package)
  • Neural networks (using nnet)
  • Support vector machines (using tune.svm from the e1071 package)
  • Naïve Bayes models (using naiveBayes from the e1071 package)
  • K-nearest-neighbors classification (using the knn function from the class package)
  • Decision trees (using rpart)
  • Ensembles of trees (using the randomForest package)
  • Gradient boosting (using the gbm package)
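
To give a flavor of the first three items, here is a minimal sketch using only base R and the built-in iris data. It is not taken from the reference card: the 70/30 split, the seed, the chosen predictors, and the derived is_virginica target are all assumptions made for illustration.

```r
# Minimal sketch (not from the reference card): hold out a test set,
# then fit lm and glm on the training rows only.
set.seed(42)                                   # arbitrary seed, for reproducibility

# Assumed 70/30 split of the built-in iris data (150 rows)
n         <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))
train     <- iris[train_idx, ]
test      <- iris[-train_idx, ]

# Linear regression: predict petal width from petal length
fit_lm  <- lm(Petal.Width ~ Petal.Length, data = train)
pred_lm <- predict(fit_lm, newdata = test)
rmse    <- sqrt(mean((test$Petal.Width - pred_lm)^2))   # error on held-out rows

# Logistic regression: is the flower virginica?
# (is_virginica is a binary target derived here for illustration)
train$is_virginica <- as.integer(train$Species == "virginica")
test$is_virginica  <- as.integer(test$Species == "virginica")
fit_glm  <- glm(is_virginica ~ Petal.Length, data = train, family = binomial)
prob_glm <- predict(fit_glm, newdata = test, type = "response")
accuracy <- mean((prob_glm > 0.5) == (test$is_virginica == 1))
```

Evaluating rmse and accuracy on the held-out test rows, rather than on the rows used for fitting, is the point of the split: it gives a less optimistic estimate of how the models would do on new data.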

[Figure: Neural network]
The examples use the traditional built-in R data sets (such as the iris data, used to create the neural network above), so there's unfortunately not much of a "big data" aspect to the reference card. But if you're just getting started with prediction and classification models in R, this cheat sheet is a useful guide.
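
As a rough idea of what a neural network fit on the iris data can look like, here is a small sketch using the nnet package (a recommended package included with standard R installations). It is not the code from the reference card: the hidden layer size, weight decay, iteration limit, seed, and 100/50 split are assumptions picked for illustration.

```r
# Minimal sketch (not from the reference card): a single-hidden-layer
# network classifying iris species with nnet.
library(nnet)

set.seed(1)                                    # arbitrary seed, for reproducibility
train_idx <- sample(nrow(iris), 100)           # assumed 100 training / 50 test rows
iris_test <- iris[-train_idx, ]

# size = number of hidden units, decay = weight decay (both assumed values)
fit_nn <- nnet(Species ~ ., data = iris[train_idx, ],
               size = 4, decay = 0.01, maxit = 200, trace = FALSE)

# type = "class" returns the predicted species label for each test row
pred_nn     <- predict(fit_nn, newdata = iris_test, type = "class")
conf_mat    <- table(predicted = pred_nn, actual = iris_test$Species)
accuracy_nn <- mean(pred_nn == iris_test$Species)
```

Because nnet starts from random weights, results vary with the seed; inspecting conf_mat shows which species are confused with one another.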

DZone Refcardz: Big Data Machine Learning Patterns for Predictive Analytics
