R code to accompany Real-World Machine Learning (Chapter 4)

This article was first published on data prone - R and kindly contributed to R-bloggers.

Abstract

In the latest update to the rwml-R GitHub repo, I provide R code to accompany Chapter 4 of the book “Real-World Machine Learning” by Henrik Brink, Joseph W. Richards, and Mark Fetherolf. Topics covered include optimization of model parameters via grid search with caret, plotting a confusion matrix with ggplot2, and generating ROC curves with ROCR. This blog post provides a summary and some examples of the code contained in the update.

rwml-R project pages posted

For convenience, I’ve created a project page for rwml-R that hosts the HTML files generated by knitr. This blog post, like the earlier ones for Chapter 2 and Chapter 3, is a short summary of the R code provided in the rwml-R project. Also, feel free to fork the rwml-R repo and submit a pull request if you wish to contribute.

Plotting a confusion matrix

The MNIST dataset of handwritten digits makes another appearance. The kknn package is again used, and the confusion matrix is plotted using ggplot2. The color scale for the plot is generated using the RColorBrewer package.

Figure: confusion matrix for the MNIST digit classifier, plotted with ggplot2 (the generating code is in the rwml-R project).
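As a rough illustration of the plotting step, here is a minimal sketch of a confusion-matrix heat map built with ggplot2 and an RColorBrewer palette. It is not the code from the rwml-R repo: the labels are synthetic stand-ins for MNIST, no kknn model is fitted, and all variable names (actual, predicted, conf_df) are my own assumptions.

```r
library(ggplot2)
library(RColorBrewer)

set.seed(42)

# Hypothetical stand-in for MNIST: 500 true digit labels, with simulated
# predictions that disagree with the truth for roughly 15% of the cases.
digits <- 0:9
actual <- sample(digits, 500, replace = TRUE)
reshuffled <- runif(500) > 0.85
predicted <- ifelse(reshuffled, sample(digits, 500, replace = TRUE), actual)

# Cross-tabulate into a 10 x 10 confusion matrix (rows = actual class).
conf_mat <- table(actual = factor(actual, levels = digits),
                  predicted = factor(predicted, levels = digits))

# Convert counts to row-wise fractions, then to long format for ggplot2.
# The resulting columns are: actual, predicted, Freq.
conf_df <- as.data.frame(prop.table(conf_mat, margin = 1))

# One tile per (actual, predicted) pair, shaded by the fraction of cases.
p <- ggplot(conf_df, aes(x = predicted, y = actual, fill = Freq)) +
  geom_tile() +
  scale_fill_gradientn(colours = brewer.pal(9, "Blues"), name = "Fraction") +
  labs(x = "Predicted digit", y = "Actual digit") +
  theme_bw()
print(p)
```

Normalizing each row to fractions keeps the color scale comparable across classes even when some digits are more common than others.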

Plotting a series of ROC curves

The ROCR package is introduced and used to generate ROC curves. The area under the curve (AUC) is also calculated for each curve and displayed alongside it.

Figure: series of ROC curves, each labeled with its AUC value (the generating code is in the rwml-R project).
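The following sketch shows one way to overlay several ROC curves with ROCR and report the AUC for each. It is an illustration under stated assumptions rather than the repo's code: the two "classifiers" are simulated score vectors, and the names score_list, cols, and auc_values are hypothetical.

```r
library(ROCR)

set.seed(1)
n <- 200
labels <- rbinom(n, 1, 0.5)

# Two hypothetical classifiers: each score is the true label plus noise,
# so the lower-noise scores should separate the classes better.
score_list <- list(
  good = labels + rnorm(n, sd = 0.6),
  weak = labels + rnorm(n, sd = 1.5)
)
cols <- c(good = "steelblue", weak = "firebrick")
auc_values <- numeric(0)

for (name in names(score_list)) {
  pred <- prediction(score_list[[name]], labels)
  # ROC curve: true positive rate against false positive rate.
  perf <- performance(pred, measure = "tpr", x.measure = "fpr")
  # AUC is returned in the y.values slot of the performance object.
  auc_values[name] <- performance(pred, measure = "auc")@y.values[[1]]
  # Draw the first curve on a new plot and add the rest on top of it.
  plot(perf, col = cols[[name]], add = (name != names(score_list)[1]))
}

# Diagonal reference line for a random classifier, plus AUC in the legend.
abline(0, 1, lty = 2)
legend("bottomright",
       legend = sprintf("%s (AUC = %.3f)", names(auc_values), auc_values),
       col = cols[names(auc_values)], lty = 1)
```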

Tuning model parameters

The caret package is used to tune parameters via grid search for a Support Vector Machine model with a Radial Basis Function kernel. By setting summaryFunction = twoClassSummary in trainControl, the area under the ROC curve is used to select the optimal model. The doMC package is also introduced for parallel computation.
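A minimal sketch of such a grid search is shown below. It assumes simulated two-class data from caret's twoClassSim() rather than the book's dataset, and the grid values, fold count, and core count are arbitrary choices of mine. The svmRadial method relies on the kernlab package and twoClassSummary on pROC, so both need to be installed; doMC is registered only if it is available, since it does not work on Windows.

```r
library(caret)

# Optional parallel backend, as mentioned in the text (Unix-alikes only).
if (requireNamespace("doMC", quietly = TRUE)) {
  doMC::registerDoMC(cores = 2)
}

set.seed(123)
# Hypothetical data: 200 simulated rows with a two-level factor `Class`.
dat <- twoClassSim(200)

# twoClassSummary reports ROC (AUC), Sens and Spec for each candidate;
# it needs class probabilities, hence classProbs = TRUE.
ctrl <- trainControl(method = "cv", number = 5,
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary)

# Candidate values for the two svmRadial tuning parameters.
grid <- expand.grid(sigma = c(0.01, 0.05, 0.1), C = c(0.5, 1, 2))

fit <- train(Class ~ ., data = dat,
             method = "svmRadial",
             metric = "ROC",          # select the model with the highest AUC
             tuneGrid = grid,
             trControl = ctrl,
             preProcess = c("center", "scale"))

print(fit$bestTune)   # the (sigma, C) pair with the best cross-validated ROC
```

Passing metric = "ROC" explicitly makes the selection criterion visible in the call instead of relying on caret's fallback behavior when the default metric is missing from the summary.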

Feedback welcome

If you have any feedback on the rwml-R project, please leave a comment below or use the Tweet button. Again, feel free to fork the rwml-R repo and submit a pull request if you wish to contribute.
