[This article was first published on Alexej's blog, and kindly contributed to R-bloggers].
Unfortunately this was not taught in any of my statistics or data analysis classes at university (wtf it so needs to be :scream_cat:).
So it took me some time until I learned that the AUC has a nice probabilistic meaning.
What’s AUC anyway?
AUC is the area under the ROC curve. The ROC curve is the receiver operating characteristic curve. AUC is simply the area between that curve and the x-axis. So, to understand AUC we need to look at the concept of an ROC curve.
A dataset $S$: $(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_n, y_n)$, where
$\mathbf{x}_i$ is a vector of features collected for the $i$th subject,
$y_i \in \{0, 1\}$ is the $i$th subject’s label (binary outcome variable of interest, like a disease status, class membership, or whatever binary label).
A classification algorithm (such as logistic regression, SVM, deep neural net, or whatever you like), trained on $S$, that assigns a score (or probability) $\hat{p}(\mathbf{x}_\ast)$ to any new observation $\mathbf{x}_\ast$ signifying how likely its label is $y_\ast = 1$.
A decision threshold (or operating point) can be chosen to assign a class label ($0$ or $1$) to $\mathbf{x}_\ast$ based on the value of $\hat{p}(\mathbf{x}_\ast)$.
The chosen threshold determines the balance between how many false positives and false negatives will result from this classification.
Plotting the true positive rate (TPR) against the false positive rate (FPR) as the threshold moves from its minimum to its maximum value yields the ROC curve.
The area under the ROC curve, or AUC, is used as a measure of classifier performance.
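For reference, the two rates being traded off can be written in terms of confusion-matrix counts. The notation is an assumption made for this note, not necessarily the author's: $t$ is the decision threshold, an observation is labelled $1$ when its score is at least $t$, and $\mathrm{TP}$, $\mathrm{FP}$, $\mathrm{TN}$, $\mathrm{FN}$ are the counts of true/false positives/negatives at that threshold.

```latex
\mathrm{TPR}(t) = \frac{\mathrm{TP}(t)}{\mathrm{TP}(t) + \mathrm{FN}(t)},
\qquad
\mathrm{FPR}(t) = \frac{\mathrm{FP}(t)}{\mathrm{FP}(t) + \mathrm{TN}(t)}
```

Raising $t$ can only lower both rates (or leave them unchanged), so sweeping $t$ from below the smallest score to above the largest one moves the point $(\mathrm{FPR}(t), \mathrm{TPR}(t))$ from $(1, 1)$ to $(0, 0)$.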
Here is some R code for clarification (not even using tidyverse :stuck_out_tongue:):
# load some data, fit a logistic regression classifier
versicolor_virginica <- iris[iris$Species != "setosa", ]
logistic_reg_fit <- glm(Species ~ Sepal.Width + Sepal.Length,
                        data = versicolor_virginica,
                        family = "binomial")
# true labels: versicolor = 0, virginica = 1
y <- ifelse(versicolor_virginica$Species == "versicolor", 0, 1)
# fitted probabilities of class 1 (virginica)
y_pred <- logistic_reg_fit$fitted.values
# get TPR and FPR at different values of the decision threshold
threshold <- seq(0, 1, length.out = 100)
FPR <- sapply(threshold,
  function(thresh) {
    sum(y_pred >= thresh & y != 1) / sum(y != 1)
  })
TPR <- sapply(threshold,
  function(thresh) {
    sum(y_pred >= thresh & y == 1) / sum(y == 1)
  })
# plot an ROC curve
plot(FPR, TPR)
lines(FPR, TPR)
A rather ugly ROC curve emerges:
The area under the ROC curve, or AUC, seems like a nice heuristic to evaluate and compare the overall performance of classification models independent of the exact decision threshold chosen. $\mathrm{AUC} = 1$ signifies perfect classification accuracy, and $\mathrm{AUC} = 0.5$ is the accuracy of making classification decisions via coin toss (or rather a continuous coin that outputs values in $[0, 1]$…).
Most classification algorithms will result in an AUC between $0.5$ and $1$.
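To make the "area" concrete, here is a minimal sketch that computes the AUC with the trapezoidal rule. The helper names `roc_points` and `auc_trapezoid` are made up for this illustration (they are not from any package). Unlike the grid of 100 thresholds used above, the sketch takes the observed scores themselves as thresholds, so the resulting ROC curve is exact for the data at hand.

```r
# Hypothetical helpers for illustration only (not from any package).

# TPR and FPR at every distinct observed score, plus -Inf and Inf so that
# the curve includes both corners (1, 1) and (0, 0).
roc_points <- function(scores, labels) {
  thresholds <- c(-Inf, sort(unique(scores)), Inf)
  fpr <- sapply(thresholds, function(thr) {
    sum(scores >= thr & labels == 0) / sum(labels == 0)
  })
  tpr <- sapply(thresholds, function(thr) {
    sum(scores >= thr & labels == 1) / sum(labels == 1)
  })
  data.frame(fpr = fpr, tpr = tpr)
}

# Area under the piecewise-linear curve through the (FPR, TPR) points.
auc_trapezoid <- function(fpr, tpr) {
  # walk the curve from left to right (ties in FPR are ordered by TPR)
  ord <- order(fpr, tpr)
  x <- fpr[ord]
  y <- tpr[ord]
  n <- length(x)
  # each trapezoid: width * average of the two neighbouring heights
  sum(diff(x) * (y[-1] + y[-n]) / 2)
}

# Toy example that can be checked by hand
toy_labels <- c(0, 0, 1, 1)
toy_scores <- c(0.1, 0.4, 0.35, 0.8)
toy_roc <- roc_points(toy_scores, toy_labels)
toy_auc <- auc_trapezoid(toy_roc$fpr, toy_roc$tpr)
print(toy_auc)  # 0.75

# Scores that separate the two classes perfectly
perfect_roc <- roc_points(c(0.1, 0.2, 0.8, 0.9), c(0, 0, 1, 1))
perfect_auc <- auc_trapezoid(perfect_roc$fpr, perfect_roc$tpr)
print(perfect_auc)  # 1

# The iris logistic regression from above, refit here so the block runs on its own
vv <- iris[iris$Species != "setosa", ]
fit <- glm(Species ~ Sepal.Width + Sepal.Length, data = vv, family = "binomial")
y_true <- ifelse(vv$Species == "versicolor", 0, 1)
iris_roc <- roc_points(fit$fitted.values, y_true)
iris_auc <- auc_trapezoid(iris_roc$fpr, iris_roc$tpr)
print(iris_auc)
```

Using the observed scores as thresholds is a design choice: a fixed grid can step over several scores at once and cut corners off the curve, whereas this version visits every point where TPR or FPR actually changes.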
But there’s more to it.
As above, assume that we are looking at a dataset where we want to distinguish data points of type 0 from those of type 1. Consider a classification algorithm that assigns to a random observation $\mathbf{x}$ a score (or probability) $\hat{p}(\mathbf{x}) \in [0, 1]$ signifying membership in class 1. If the final classification between class 1 and class 0 is determined by a decision threshold $t \in [0, 1]$, then the true positive rate (a.k.a. sensitivity or recall) can be written as a conditional probability