Machine Learning for Hackers with Debian and Ubuntu

March 4, 2016

(This article was first published on - r-project, and kindly contributed to R-bloggers)

Data science and machine learning are hot topics at the moment. Many people are considering how to extend their skills into these areas, and many options have appeared: full online degrees, free online courses combined with free software and, for those who prefer hard copy, a staggering choice of books on the topic.

One of those books is O’Reilly’s Machine Learning for Hackers by John Myles White and Drew Conway. The book uses R to demonstrate a series of techniques for analysis and prediction, so it offers a great opportunity to get an introduction to basic machine learning techniques and to the increasingly popular R platform at the same time.

On page 11 the authors list all the major R packages needed to run their examples (available on GitHub).

I looked over this list to see how many could be installed on a Debian system using apt-get and found that about half of them were already present. Five of them, plus one dependency, were not yet available, so I’ve whipped up packages for them and they are now in jessie-backports for all users of the current stable release.
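A check like the one described above can be sketched in a few lines of shell. This is only an illustration, not the procedure actually used for the article: it assumes the usual Debian naming convention for CRAN packages ("r-cran-" followed by the lower-cased package name), and the helper names cran_to_deb and check_available are made up for this example.

```shell
#!/bin/sh
# Map a CRAN package name to its Debian binary package name.
# Assumption: the usual Debian convention, "r-cran-" + lower-cased name.
cran_to_deb() {
    printf 'r-cran-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Report whether apt has an installation candidate for each CRAN package.
# (check_available is a hypothetical helper, not part of any Debian tool.)
check_available() {
    for name in "$@"; do
        deb=$(cran_to_deb "$name")
        # "apt-cache policy" prints a "Candidate:" line for known packages.
        candidate=$(apt-cache policy "$deb" 2>/dev/null | awk '/Candidate:/ {print $2}')
        if [ -n "$candidate" ] && [ "$candidate" != "(none)" ]; then
            echo "$deb: available ($candidate)"
        else
            echo "$deb: not available"
        fi
    done
}
```

Running check_available with a few names from the book's list, for example ggplot2, RCurl and RJSONIO, prints one line per package. The result depends on which repositories are enabled in the local apt sources.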

If you are following the exercises in this book, you can get all the software you need with one convenient command:

$ sudo apt-get install -t jessie-backports \
  r-cran-ggplot2 r-cran-lme4 r-cran-rcurl \
  r-cran-reshape r-cran-xml r-cran-arm \
  r-cran-glmnet r-cran-igraph r-cran-lubridate \
  r-cran-rjsonio r-cran-tm
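After installing, it can be reassuring to confirm that R can actually load each library. The following is a minimal sketch, assuming Rscript is on the PATH; the function name check_r_packages and the RSCRIPT override variable are inventions for this example, and the R package names are the upstream spellings of the Debian packages installed above.

```shell
#!/bin/sh
# RSCRIPT may be overridden (e.g. for testing); defaults to Rscript on PATH.
RSCRIPT=${RSCRIPT:-Rscript}

# Try to load each named R package; succeed only if all of them load.
# (check_r_packages is a hypothetical helper written for this example.)
check_r_packages() {
    missing=0
    for pkg in "$@"; do
        if "$RSCRIPT" -e "library($pkg)" >/dev/null 2>&1; then
            echo "ok      $pkg"
        else
            echo "MISSING $pkg"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example call (commented out so that sourcing this file has no side effects):
# check_r_packages ggplot2 lme4 RCurl reshape XML arm glmnet igraph \
#     lubridate RJSONIO tm
```

The function prints one line per package and returns a non-zero status if any of them failed to load, so it can be used in a larger setup script.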

Thanks to all those who already packaged other parts of R and backported the relevant packages.

Note that the RJSONIO package’s authors have not provided a valid free software license, so it is in non-free. It is there to support people using the book, but I would encourage people to use rjson for any new projects, as it does have a valid license.
