Assess Performance of the Classification Model

[This article was first published on Data Analysis in R, and kindly contributed to R-bloggers].

The post Assess Performance of the Classification Model appeared first on finnstats.

If you are interested in learning more about data science, you can find more articles at finnstats.

We can assess the performance of a classification model using a metric called the Matthews correlation coefficient (MCC).

It is determined by:

MCC = (TP*TN - FP*FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN))

where:

TP: Number of true positives

TN: Number of true negatives

FP: Number of false positives

FN: Number of false negatives
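
The formula translates directly into a few lines of base R. The sketch below is only an illustration: the function name mcc_from_counts is hypothetical and not part of any package, the counts are made up, and returning NA when the denominator is zero is an assumed convention.

```r
# Hypothetical helper (not from any package): MCC from the four counts
mcc_from_counts <- function(TP, TN, FP, FN) {
  # Work in doubles so large counts cannot overflow integer arithmetic
  TP <- as.numeric(TP); TN <- as.numeric(TN)
  FP <- as.numeric(FP); FN <- as.numeric(FN)

  numerator   <- TP * TN - FP * FN
  denominator <- sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

  # Assumed convention: MCC is undefined when any row or column total is zero
  if (denominator == 0) return(NA_real_)
  numerator / denominator
}

# Toy counts chosen only for illustration
mcc_from_counts(TP = 6, TN = 3, FP = 1, FN = 2)
```

Note that the square root covers the whole product of the four sums, not just the first one.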

This statistic is especially helpful when there is an imbalance between the two classes, meaning that one class appears substantially more frequently than the other.

The value of MCC ranges from -1 to 1, where:

A score of -1 indicates total disagreement between the predicted and actual classes.

A score of 0 indicates that the model does no better than random guessing.

A score of 1 indicates total agreement between the predicted and actual classes.
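
These three reference points can be checked on a tiny made-up data set. The sketch below relies on the fact that, for classes coded as 0 and 1, MCC is the same as the Pearson correlation (the phi coefficient), so base R's cor() is enough. The vectors are invented for illustration only.

```r
# Four observations: two positives followed by two negatives
actual <- c(1, 1, 0, 0)

perfect  <- c(1, 1, 0, 0)  # every prediction correct
inverted <- c(0, 0, 1, 1)  # every prediction wrong
coinflip <- c(1, 0, 1, 0)  # right exactly half the time within each class

# For 0/1 vectors, cor() returns the phi coefficient, which equals MCC
mcc_perfect  <- cor(perfect,  actual)
mcc_inverted <- cor(inverted, actual)
mcc_coinflip <- cor(coinflip, actual)

round(c(perfect = mcc_perfect, inverted = mcc_inverted, coinflip = mcc_coinflip), 3)
```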

Suppose a sports analyst uses a logistic regression model to predict whether or not each of 410 different school basketball players will be drafted into the NBA.

The model's predictions are summarized by the following confusion matrix counts: 15 true positives, 375 true negatives, 10 false positives, and 10 false negatives.

We can use the following formula to determine the model’s MCC:

MCC = (TP*TN - FP*FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN))
MCC = (15*375-10*10) / sqrt((15+10)*(15+10)*(375+10)*(375+10))
MCC = 0.574026

The Matthews correlation coefficient is 0.574026.

This score is well above 0, which indicates that the model does a respectable job of predicting whether or not players will be drafted, although it falls short of the perfect agreement indicated by a score of 1.
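
To see why MCC is more informative than plain accuracy when the classes are imbalanced, it helps to compare the two on the same counts. The sketch below uses only base R and the counts from the calculation above. The "predict not drafted for everyone" baseline is an assumption added for illustration.

```r
# Counts from the worked calculation above
TP <- 15; TN <- 375; FP <- 10; FN <- 10
n  <- TP + TN + FP + FN

# Accuracy of the model: share of all predictions that are correct
accuracy <- (TP + TN) / n

# Accuracy of a trivial rule that predicts "not drafted" for every player:
# it gets all actual negatives (TN + FP) right and all actual positives wrong
baseline_accuracy <- (TN + FP) / n

# MCC of the model, straight from the formula
mcc_value <- (TP * TN - FP * FN) /
  sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

round(c(accuracy = accuracy, baseline = baseline_accuracy, mcc = mcc_value), 3)
```

The model's accuracy is only slightly higher than that of the trivial rule, because most players are not drafted. MCC uses all four cells of the confusion matrix, so it gives a more sober picture of how well the positive class is being predicted.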

The following example shows how to calculate MCC for this scenario using the mcc() function from the mltools package in R.

Example: Calculating the Matthews Correlation Coefficient in R

To calculate the Matthews correlation coefficient in R, first load the mltools package, which provides the mcc() function:

library(mltools)

To calculate the Matthews correlation coefficient from a confusion matrix, pass the matrix to mcc() through the confusionM argument.

First, create the confusion matrix:

confmatrix <- matrix(c(15, 10, 10, 375), nrow=2)

Let’s view the confusion matrix

confmatrix
     [,1] [,2]
[1,]   15   10
[2,]   10  375

Now we can calculate the Matthews correlation coefficient for the confusion matrix.

mcc(confusionM = confmatrix)
[1] 0.574026

The Matthews correlation coefficient is 0.574026, which matches the value we calculated by hand.
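
MCC can also be computed from a vector of predicted classes and a vector of actual classes rather than from a confusion matrix. The sketch below rebuilds such vectors from the four counts, purely for illustration; the names preds and actual are arbitrary. It cross-checks the result with base R's cor(), which works because MCC for 0/1 vectors equals the Pearson correlation. The call to mltools assumes that mcc() accepts arguments named preds and actuals, and it is skipped if the package is not installed.

```r
# Rebuild one row per player from the counts:
# 15 TP, 10 FN, 10 FP, 375 TN (1 = drafted, 0 = not drafted)
actual <- rep(c(1, 1, 0, 0), times = c(15, 10, 10, 375))
preds  <- rep(c(1, 0, 1, 0), times = c(15, 10, 10, 375))

# Cross-tabulate to confirm the vectors reproduce the four counts
table(Predicted = preds, Actual = actual)

# Base R cross-check: Pearson correlation of two 0/1 vectors equals MCC
mcc_check <- cor(preds, actual)
round(mcc_check, 6)

# Same calculation with mltools, if it is installed
# (assumes the argument names preds and actuals)
if (requireNamespace("mltools", quietly = TRUE)) {
  mltools::mcc(preds = preds, actuals = actual)
}
```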
