Here you will find daily news and tutorials about R, contributed by over 573 bloggers.
There are many ways to follow us - By e-mail:On Facebook: If you are an R blogger yourself you are invited to add your own R content feed to this site (Non-English R bloggers should add themselves- here)

The caret package for R provides a variety of error metrics for regression models and 2-class classification models, but only calculates Accuracy and Kappa for multi-class models. Therefore, I wrote the following function to allow caret:::train to calculate a wide variety of error metrics for multi-class problems:

This function was prompted by a question on cross-validated, asking what the optimal value of k is for a knn model fit to the iris dataset. I wanted to look at statistics besides accuracy and kappa, so I wrote a wrapper function for caret:::confusionMatrix and auc and logLoss from the Metric packages. Use the following code to fit a knn model to the iris dataset, aggregate all of the metrics, and save a plot for each metric to a pdf file:

This demonstrates that, depending on what metric you use, you will end up with a different model. For example, Accuracy seems to peak around 17:

While AUC and logLoss seem to peak around 6:

You can also increase the number of cross-validation repeats, or use a different method of re-sampling, such as bootstrap re-sampling.

Related

To leave a comment for the author, please follow the link and comment on their blog: Modern Toolmaking.