Weather forecast with regression models – part 3

June 11, 2017
By

(This article was first published on R Programming – DataScience+, and kindly contributed to R-bloggers)

In the previous part of this tutorial, we build several models based on logistic regression. One aspect to be further considered is the decision threshold tuning that may help in reaching a better accuracy. We are going to show a procedure able to determine an optimal value for such purpose. ROC plots will be introduced as well.

Decision Threshold Tuning

As default, caret uses 0.5 as threshold decision value to be used as probability cutoff value. We are going to show how to tune such decision threshold to achieve a better training set accuracy. We then compare how the testing set accuracy changes accordingly.

Tuning Analysis

We load previously saved training, testing datasets together with the models we have already determined.

```suppressPackageStartupMessages(library(caret))
suppressPackageStartupMessages(library(ROCR))
set.seed(1023)
```

The following tuning procedure explores a pool of cutoff values giving back a table reporting {cutoff, accuracy, specificity} triplets. Its inspection allows to identify the cutoff value providing with the highest accuracy. In case of a tie in accuracy value, we choose the lowest cutoff value of such.

```glm.tune <- function(glm_model, dataset) {
results <- data.frame()
for (q in seq(0.2, 0.8, by = 0.02)) {
fitted_values <- glm_model\$finalModel\$fitted.values
prediction = q, "Yes", "No")
cm <- confusionMatrix(prediction, dataset\$RainTomorrow)
accuracy <- cm\$overall["Accuracy"]
specificity <- cm\$byClass["Specificity"]
results <- rbind(results, data.frame(cutoff=q, accuracy=accuracy, specificity = specificity))
}
rownames(results) <- NULL
results
}
```

In the utility function above, we included also the specificity metrics. Specificity is defined as TN/(TN+FP) and in the next part of this series we will discuss some aspects of. Let us show an example of confusion matrix with some metrics computation to clarify the terminology.

```    Reference
Prediction  No Yes
No  203  42
Yes   0   3

Accuracy : 0.8306   = (TP+TN)/(TP+TN+FP+FN) = (203+3)/(203+3+42+0)
Sensitivity : 1.00000  = TP/(TP+FN) = 203/(203+0)
Specificity : 0.06667  = TN/(TN+FP) = 3/(3+42)
Pos Pred Value : 0.82857  = TP/(TP+FP) = 203/(203+42)
Neg Pred Value : 1.00000  = TN/(TN+FN) = 3/(3+0)
```

Please note that the “No” column is associated to a positive outcome, whilst “Yes” to a negative one. Specificity measure how good we are in predict that {RainTomorrow = “No”} and actually it is and this is something that we are going to investigate in next post of this series.

The logistic regression models to be tuned are:

```mod9am_c1_fit: RainTomorrow ~ Cloud9am + Humidity9am + Pressure9am + Temp9am
mod3pm_c1_fit: RainTomorrow ~ Cloud3pm + Humidity3pm + Pressure3pm + Temp3pm
mod_ev_c2_fit: RainTomorrow ~ Cloud3pm + Humidity3pm + Pressure3pm + Temp3pm + WindGustDir
mod_ev_c3_fit: RainTomorrow ~ Pressure3pm + Sunshine
```

In the following analysis, for all models we determine the cutoff value providing with the highest accuracy.

```glm.tune(mod9am_c1_fit, training)
cutoff  accuracy specificity
1    0.20 0.7822581  0.73333333
2    0.22 0.7862903  0.71111111
3    0.24 0.7943548  0.68888889
4    0.26 0.8064516  0.64444444
5    0.28 0.8104839  0.64444444
6    0.30 0.8145161  0.60000000
7    0.32 0.8185484  0.57777778
8    0.34 0.8185484  0.51111111
9    0.36 0.8104839  0.46666667
10   0.38 0.8266129  0.46666667
11   0.40 0.8266129  0.40000000
12   0.42 0.8306452  0.40000000
13   0.44 0.8266129  0.37777778
14   0.46 0.8346774  0.35555556
15   0.48 0.8467742  0.33333333
16   0.50 0.8508065  0.31111111
17   0.52 0.8427419  0.26666667
18   0.54 0.8467742  0.26666667
19   0.56 0.8346774  0.20000000
20   0.58 0.8387097  0.20000000
21   0.60 0.8387097  0.20000000
22   0.62 0.8346774  0.17777778
23   0.64 0.8387097  0.17777778
24   0.66 0.8346774  0.15555556
25   0.68 0.8387097  0.15555556
26   0.70 0.8387097  0.13333333
27   0.72 0.8387097  0.13333333
28   0.74 0.8346774  0.11111111
29   0.76 0.8266129  0.06666667
30   0.78 0.8266129  0.06666667
31   0.80 0.8306452  0.06666667

```

The optimal cutoff value is equal to 0.5. In this case, we find the same cutoff value as the default the caret package takes advantage of for logistic regression models.

```opt_cutoff < 0.5;-
pred_test <- predict(mod9am_c1_fit, testing, type = "prob")
prediction = ifelse(pred_test\$Yes >= opt_cutoff, "Yes", "No")
confusionMatrix(prediction, testing\$RainTomorrow)
Confusion Matrix and Statistics

Reference
Prediction No Yes
No  80  12
Yes  6   7

Accuracy : 0.8286
95% CI : (0.7427, 0.8951)
No Information Rate : 0.819
P-Value [Acc > NIR] : 0.4602

Kappa : 0.3405
Mcnemar's Test P-Value : 0.2386

Sensitivity : 0.9302
Specificity : 0.3684
Pos Pred Value : 0.8696
Neg Pred Value : 0.5385
Prevalence : 0.8190
Detection Rate : 0.7619
Detection Prevalence : 0.8762
Balanced Accuracy : 0.6493

'Positive' Class : No

```

Then, tuning the 3PM model.

```glm.tune(mod3pm_c1_fit, training)
cutoff  accuracy specificity
1    0.20 0.8064516   0.7777778
2    0.22 0.8185484   0.7555556
3    0.24 0.8225806   0.7333333
4    0.26 0.8306452   0.6888889
5    0.28 0.8467742   0.6666667
6    0.30 0.8467742   0.6444444
7    0.32 0.8427419   0.6222222
8    0.34 0.8669355   0.6222222
9    0.36 0.8709677   0.6222222
10   0.38 0.8629032   0.5777778
11   0.40 0.8669355   0.5777778
12   0.42 0.8669355   0.5555556
13   0.44 0.8548387   0.4666667
14   0.46 0.8548387   0.4444444
15   0.48 0.8588710   0.4444444
16   0.50 0.8669355   0.4444444
17   0.52 0.8629032   0.4222222
18   0.54 0.8669355   0.4222222
19   0.56 0.8669355   0.3777778
20   0.58 0.8669355   0.3777778
21   0.60 0.8588710   0.3333333
22   0.62 0.8548387   0.3111111
23   0.64 0.8508065   0.2888889
24   0.66 0.8467742   0.2666667
25   0.68 0.8387097   0.2222222
26   0.70 0.8387097   0.2222222
27   0.72 0.8346774   0.2000000
28   0.74 0.8387097   0.1777778
29   0.76 0.8346774   0.1555556
30   0.78 0.8346774   0.1555556
31   0.80 0.8306452   0.1111111

```

The optimal cutoff value is equal to 0.36.

```opt_cutoff <- 0.36
pred_test <- predict(mod3pm_c1_fit, testing, type="prob")
prediction = ifelse(pred_test\$Yes >= opt_cutoff, "Yes", "No")
confusionMatrix(prediction, testing\$RainTomorrow)
Confusion Matrix and Statistics

Reference
Prediction No Yes
No  81   5
Yes  5  14

Accuracy : 0.9048
95% CI : (0.8318, 0.9534)
No Information Rate : 0.819
P-Value [Acc > NIR] : 0.0112

Kappa : 0.6787
Mcnemar's Test P-Value : 1.0000

Sensitivity : 0.9419
Specificity : 0.7368
Pos Pred Value : 0.9419
Neg Pred Value : 0.7368
Prevalence : 0.8190
Detection Rate : 0.7714
Detection Prevalence : 0.8190
Balanced Accuracy : 0.8394

'Positive' Class : No

```

We improved the test accuracy, previously resulting as equal to 0.8857 and now equal to 0.9048 (please take a look at part #2 of this tutorial to know about the test accuracy with 0.5 cutoff value). Then, the first evening hours model.

```glm.tune(mod_ev_c2_fit, training)
cutoff  accuracy specificity
1    0.20 0.8387097   0.8444444
2    0.22 0.8467742   0.8222222
3    0.24 0.8548387   0.8222222
4    0.26 0.8629032   0.8000000
5    0.28 0.8588710   0.7333333
6    0.30 0.8709677   0.7333333
7    0.32 0.8790323   0.7111111
8    0.34 0.8830645   0.6888889
9    0.36 0.8870968   0.6666667
10   0.38 0.8870968   0.6666667
11   0.40 0.8951613   0.6666667
12   0.42 0.8991935   0.6666667
13   0.44 0.8991935   0.6666667
14   0.46 0.8951613   0.6444444
15   0.48 0.8951613   0.6222222
16   0.50 0.8951613   0.6222222
17   0.52 0.8951613   0.6000000
18   0.54 0.8911290   0.5777778
19   0.56 0.8911290   0.5777778
20   0.58 0.8951613   0.5777778
21   0.60 0.8911290   0.5555556
22   0.62 0.8870968   0.5111111
23   0.64 0.8830645   0.4888889
24   0.66 0.8830645   0.4666667
25   0.68 0.8790323   0.4222222
26   0.70 0.8709677   0.3555556
27   0.72 0.8548387   0.2666667
28   0.74 0.8508065   0.2444444
29   0.76 0.8427419   0.1777778
30   0.78 0.8387097   0.1555556
31   0.80 0.8387097   0.1555556

```

The optimal cutoff value is equal to 0.42.

```opt_cutoff <- 0.42
pred_test <- predict(mod_ev_c2_fit, testing, type="prob")
prediction = ifelse(pred_test\$Yes >= opt_cutoff, "Yes", "No")
confusionMatrix(prediction, testing\$RainTomorrow)
Confusion Matrix and Statistics

Reference
Prediction No Yes
No  82   8
Yes  4  11

Accuracy : 0.8857
95% CI : (0.8089, 0.9395)
No Information Rate : 0.819
P-Value [Acc > NIR] : 0.04408

Kappa : 0.58
Mcnemar's Test P-Value : 0.38648

Sensitivity : 0.9535
Specificity : 0.5789
Pos Pred Value : 0.9111
Neg Pred Value : 0.7333
Prevalence : 0.8190
Detection Rate : 0.7810
Detection Prevalence : 0.8571
Balanced Accuracy : 0.7662

'Positive' Class : No

```

We did not improve the accuracy as the default cutoff value provides with the same. Other metrics changed and in particular we notice that the sensitivity decreased a little bit while the positive predicted value improved. Then the second evening hours model.

```glm.tune(mod_ev_c3_fit, training)
cutoff  accuracy specificity
1    0.20 0.8225806   0.8000000
2    0.22 0.8145161   0.7333333
3    0.24 0.8064516   0.6666667
4    0.26 0.8145161   0.6666667
5    0.28 0.8145161   0.6222222
6    0.30 0.8185484   0.6222222
7    0.32 0.8266129   0.6000000
8    0.34 0.8185484   0.5555556
9    0.36 0.8306452   0.5333333
10   0.38 0.8467742   0.5333333
11   0.40 0.8467742   0.5111111
12   0.42 0.8427419   0.4666667
13   0.44 0.8508065   0.4666667
14   0.46 0.8467742   0.4222222
15   0.48 0.8467742   0.4000000
16   0.50 0.8508065   0.4000000
17   0.52 0.8548387   0.4000000
18   0.54 0.8508065   0.3777778
19   0.56 0.8669355   0.3777778
20   0.58 0.8669355   0.3555556
21   0.60 0.8629032   0.3333333
22   0.62 0.8588710   0.3111111
23   0.64 0.8548387   0.2888889
24   0.66 0.8508065   0.2666667
25   0.68 0.8467742   0.2444444
26   0.70 0.8387097   0.2000000
27   0.72 0.8387097   0.1777778
28   0.74 0.8346774   0.1555556
29   0.76 0.8306452   0.1333333
30   0.78 0.8306452   0.1333333
31   0.80 0.8306452   0.1333333

```

The optimal cutoff value is equal to 0.56.

```opt_cutoff <- 0.56
pred_test <- predict(mod_ev_c3_fit, testing, type="prob")
prediction = ifelse(pred_test\$Yes >= opt_cutoff, "Yes", "No")
confusionMatrix(prediction, testing\$RainTomorrow)
Confusion Matrix and Statistics

Reference
Prediction No Yes
No  82   8
Yes  4  11

Accuracy : 0.8857
95% CI : (0.8089, 0.9395)
No Information Rate : 0.819
P-Value [Acc > NIR] : 0.04408

Kappa : 0.58
Mcnemar's Test P-Value : 0.38648

Sensitivity : 0.9535
Specificity : 0.5789
Pos Pred Value : 0.9111
Neg Pred Value : 0.7333
Prevalence : 0.8190
Detection Rate : 0.7810
Detection Prevalence : 0.8571
Balanced Accuracy : 0.7662

'Positive' Class : No

```

In this case, we did not improved the accuracy of this model. Other metrics changed and in particular we notice that the sensitivity is slightly higher than the default cutoff driven value (0.9419), and the positive predicted value decreased a little bit (for default cutoff was 0.9205).

ROC analysis

To have a visual understanding of the classification performances and compute further metrics, we are going to take advantage of the ROCr package facilities. We plot ROC curves and determine the corresponding AUC value. That is going to be done for each model. The function used to accomplish with this task is the following.

```glm.perf.plot <- function (prediction, cutoff) {
perf <- performance(prediction, measure = "tpr", x.measure = "fpr")
par(mfrow=(c(1,2)))
plot(perf, col="red")
grid()
perf <- performance(prediction, measure = "acc", x.measure = "cutoff")
plot(perf, col="red")
abline(v = cutoff, col="green")
grid()
auc_res <- performance(prediction, "auc")
[email protected][]
}
```

In the following, each model will be considered, AUC value printed out and ROC plots given.

```mod9am_pred_prob <- predict(mod9am_c1_fit, testing, type="prob")
mod9am_pred_resp <- prediction(mod9am_pred_prob\$Yes,  testing\$RainTomorrow)
glm.perf.plot(mod9am_pred_resp, 0.5)
 0.8004896

``` ```mod3pm_pred_prob <- predict(mod3pm_c1_fit, testing, type="prob")
mod3pm_pred_resp <- prediction(mod3pm_pred_prob\$Yes,  testing\$RainTomorrow)
glm.perf.plot(mod3pm_pred_resp, 0.36)
 0.9155447

```

The 3PM model shows the highest AUC value among all the models. ```mod_ev_c2_prob <- predict(mod_ev_c2_fit, testing, type="prob")
mod_ev_c2_pred_resp <- prediction(mod_ev_c2_prob\$Yes
,  testing\$RainTomorrow)
glm.perf.plot(mod_ev_c2_pred_resp, 0.42)
 0.8390453

``` ```mod_ev_c3_prob <- predict(mod_ev_c3_fit, testing, type="prob")
mod_ev_c3_pred_resp <- prediction(mod_ev_c3_prob\$Yes,  testing\$RainTomorrow)
glm.perf.plot(mod_ev_c3_pred_resp, 0.56)
 0.8886169

``` Conclusions

By tuning the decision threshold, we were able to improve training and testing set accuracy. It turned out the 3PM model achieved a satisfactory accuracy. ROC plots gave us an understanding of how true/false positive rates vary and what is the accuracy obtained by a specific decision threshold value (i.e. cutoff value). Ultimately, AUC values were reported.

If you have any questions, please feel free to comment below.

Related Post

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...