Error metrics for multi-class problems in R: beyond Accuracy and Kappa

[This article was first published on Modern Toolmaking, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The caret package for R provides a variety of error metrics for regression models and 2-class classification models, but only calculates Accuracy and Kappa for multi-class models.  Therefore, I wrote the following function to allow caret:::train to calculate a wide variety of error metrics for multi-class problems:

This function was prompted by a question on cross-validated, asking what the optimal value of k is for a knn model fit to the iris dataset.  I wanted to look at statistics besides accuracy and kappa, so I wrote a wrapper function for caret:::confusionMatrix and auc and logLoss from the Metric packages.  Use the following code to fit a knn model to the iris dataset, aggregate all of the metrics, and save a plot for each metric to a pdf file:

This demonstrates that, depending on what metric you use, you will end up with a different model.  For example, Accuracy seems to peak around 17:
While AUC and logLoss seem to peak around 6:
You can also increase the number of cross-validation repeats, or use a different method of re-sampling, such as bootstrap re-sampling.

To leave a comment for the author, please follow the link and comment on their blog: Modern Toolmaking. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)