In the latest update to the rwml-R GitHub repo, I provide R code to accompany Chapter 4 of the book “Real-World Machine Learning” by Henrik Brink, Joseph W. Richards, and Mark Fetherolf. Topics covered include optimization of model parameters via grid search with caret, plotting a confusion matrix with ggplot2, and generating ROC curves with ROCR. This blog post provides a summary and some examples of the code contained in the update.
rwml-R project pages posted
For convenience, I’ve created a project page for rwml-R to post the HTML files generated with knitr. This blog post, like the ones for Chapter 2 and Chapter 3, provides a summary of the R code in the rwml-R project.
Also, feel free to fork the rwml-R repo
and submit a pull request if you wish to contribute.
Plotting a confusion matrix
The MNIST dataset of handwritten digits makes another appearance.
The kknn package is again used, and the confusion matrix is plotted with
ggplot2. The color scale for the plot is generated using
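As a rough illustration of the kind of confusion-matrix plot described in this section, here is a minimal sketch. It makes several assumptions that do not come from the repo: the labels are a small synthetic three-class set rather than MNIST, no kknn model is fitted, and the white-to-blue gradient is an arbitrary choice rather than the color scale used in the actual code.

```r
library(ggplot2)

# Hypothetical data: a synthetic 3-class problem standing in for MNIST
set.seed(42)
classes <- c("0", "1", "2")
actual <- factor(sample(classes, 300, replace = TRUE), levels = classes)

# Simulated predictions that match the true label roughly 80% of the time
guess <- sample(classes, 300, replace = TRUE)
predicted <- factor(
  ifelse(runif(300) < 0.8, as.character(actual), guess),
  levels = classes
)

# Cross-tabulate, then normalise each row (actual class) to proportions
counts <- table(actual = actual, predicted = predicted)
conf_df <- as.data.frame(prop.table(counts, margin = 1))
# conf_df now has the columns: actual, predicted, Freq

# One tile per (actual, predicted) pair, shaded by proportion
conf_plot <- ggplot(conf_df, aes(x = predicted, y = actual, fill = Freq)) +
  geom_tile() +
  geom_text(aes(label = sprintf("%.2f", Freq))) +
  scale_fill_gradient(low = "white", high = "steelblue") +  # assumed scale
  labs(x = "Predicted class", y = "Actual class", fill = "Proportion")
print(conf_plot)
```

Normalising by row means each tile shows the fraction of a given actual class assigned to each predicted class, so the diagonal reads directly as per-class recall.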
Plotting a series of ROC curves
The ROCR package is introduced and used to generate ROC curves.
AUC values are also calculated for each curve and displayed alongside it.
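The workflow above can be sketched with ROCR's `prediction()` and `performance()` functions. This is only an illustration under stated assumptions: the two "classifiers" are hypothetical, their scores are simulated noise around the true label, and the colors and legend placement are arbitrary choices rather than those used in the repo.

```r
library(ROCR)

set.seed(1)
n <- 500
labels <- rbinom(n, 1, 0.5)

# Two hypothetical classifiers: scores are noisy versions of the true label
scores <- list(
  strong = labels + rnorm(n, sd = 0.7),
  weak   = labels + rnorm(n, sd = 2.0)
)
cols <- c(strong = "steelblue", weak = "darkorange")

# Area under the ROC curve for each set of scores
aucs <- sapply(scores, function(s) {
  performance(prediction(s, labels), measure = "auc")@y.values[[1]]
})

# Draw the first curve on a fresh plot, then overlay the rest
first <- TRUE
for (name in names(scores)) {
  perf <- performance(prediction(scores[[name]], labels),
                      measure = "tpr", x.measure = "fpr")
  plot(perf, col = cols[[name]], add = !first)
  first <- FALSE
}
abline(0, 1, lty = 2)  # reference line for a random classifier
legend("bottomright",
       legend = sprintf("%s (AUC = %.3f)", names(aucs), aucs),
       col = cols[names(aucs)], lty = 1)
```

Putting the AUC values in the legend keeps each number next to the curve it describes, which is one simple way to display them together.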
Tuning model parameters
The caret package is used to tune parameters via grid search
for the Support Vector Machines model with a Radial Basis Function Kernel.
With summaryFunction = twoClassSummary, the area under the
ROC curve is used to select the optimal model.
The doMC package is also introduced for parallel computation.
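A minimal sketch of this tuning setup might look like the following. The data set, grid values, fold count, and core count are all assumptions made for illustration: the data are simulated with caret's `twoClassSim()` rather than taken from the book. The example also assumes the kernlab and pROC packages are installed (caret relies on them for `svmRadial` and `twoClassSummary`), and doMC is only available on Unix-like systems.

```r
library(caret)
library(doMC)

# Register a parallel backend; caret uses it for resampling automatically
registerDoMC(cores = 2)  # core count is an arbitrary choice

# Hypothetical two-class data; the outcome column is named Class
set.seed(123)
dat <- twoClassSim(200)

# 5-fold CV; class probabilities are required for ROC-based summaries
ctrl <- trainControl(method = "cv", number = 5,
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary)

# Candidate values for the two svmRadial tuning parameters (assumed grid)
grid <- expand.grid(sigma = c(0.01, 0.05), C = c(0.5, 1, 2))

svm_fit <- train(Class ~ ., data = dat,
                 method = "svmRadial",
                 metric = "ROC",          # select by area under the ROC curve
                 tuneGrid = grid,
                 trControl = ctrl,
                 preProcess = c("center", "scale"))

print(svm_fit$bestTune)
```

Setting `metric = "ROC"` alongside `twoClassSummary` is what tells caret to rank the grid points by cross-validated AUC rather than by accuracy.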