Machine Learning Results in R: one plot to rule them all!

July 18, 2018
By

(This article was first published on R Programming – DataScience+, and kindly contributed to R-bloggers)

To automate the process of modeling selection and evaluate the results with visualization, I have created some functions into my personal library and today I’m sharing the codes with you. I run them to evaluate and compare Machine Learning models as fast and easily as possible. Currently, they are designed to evaluate binary classification models results. Before we start, let me show you the final outcome so you know what we are trying to achieve here with just a simple R function:

So, let’s start!

The results object

First of all, we need to have a single list with all the results to facilitate the next steps. I am assuming on this step that you already designed a model and can calculate the predictions out of your test set. So, on my list I have the following objects:

  • Project name (i.e. Fraud Score)
  • Model (the object with our model)
  • Test Scores:
    • Index (row id, it can be a user_id, email, lead_id…)
    • Tag (known label)
    • Score (calculated with the model we are studying)
  • Datasets:
    • Train set
    • Test set
  • Parameters:
    • nfolds, ntrees, max_depth, seed, sample_rate….
  • Variable importance
  • Metrics:
    • log_loss
    • auc
  • Notes (anything you’d like to write to give you a reference later on)
  • Once we automate our results object, we can start with our beautiful plots!

    Density Plot

    I have always given importance to the density plot because it gives us visual information on skewness, distribution and our model’s facility to distinguish each class. Here we can see how the model has distributed both our categories, our whole test set and the cumulative of each category (the more separate, the better).

    mplot_density <- function(tag, score, model_name = NA, subtitle = NA, 
                              save = FALSE, file_name = "viz_distribution.png") {
      require(ggplot2)
      require(gridExtra)
    
      if (length(tag) != length(score)) {
        message("The tag and score vectors should be the same length.")
        stop(message(paste("Currently, tag has",length(tag),"rows and score has",length(score))))
      }
    
      if (length(unique(tag)) != 2) {
        stop("This function is for binary models. You should only have 2 unique values for the tag value!")
      }
    
      out <- data.frame(tag = as.character(tag),
                        score = as.numeric(score),
                        norm_score = lares::normalize(as.numeric(score)))
      
      p1 <- ggplot(out) + theme_minimal() +
        geom_density(aes(x = 100 * score, group = tag, fill = as.character(tag)), 
                     alpha = 0.6, adjust = 0.25) + 
        guides(fill = guide_legend(title="Tag")) + 
        xlim(0, 100) + 
        labs(title = "Score distribution for binary model",
             y = "Density by tag", x = "Score")
      
      p2 <- ggplot(out) + theme_minimal() + 
        geom_density(aes(x = 100 * score), 
                     alpha = 0.9, adjust = 0.25, fill = "deepskyblue") + 
        labs(x = "", y = "Density")
      
      p3 <- ggplot(out) + theme_minimal() + 
        geom_line(aes(x = score * 100, y = 100 * (1 - ..y..), color = as.character(tag)), 
                  stat = 'ecdf', size = 1) +
        geom_line(aes(x = score * 100, y = 100 * (1 - ..y..)), 
                  stat = 'ecdf', size = 0.5, colour = "black", linetype="dotted") +
        ylab('Cumulative') + xlab('') + guides(color=FALSE)
      
      if(!is.na(subtitle)) {
        p1 <- p1 + labs(subtitle = subtitle)
      }
      
      if(!is.na(model_name)) {
        p1 <- p1 + labs(caption = model_name)
      }
      
      if(save == TRUE) {
        png(file_name, height = 1800, width = 2100, res = 300)
        grid.arrange(
          p1, p2, p3, 
          ncol = 2, nrow = 2, heights = 2:1,
          layout_matrix = rbind(c(1,1), c(2,3)))
        dev.off()
      }
      
      return(
        grid.arrange(
          p1, p2, p3, 
          ncol = 2, nrow = 2, heights = 2:1,
          layout_matrix = rbind(c(1,1), c(2,3))))
      
    }
    

    Gives this plot:

    ROC Curve

    The ROC curve will give us an idea of how our model is performing with our test set. You should know by now that if the AUC is close to 50% then the model is as good as a random selector; on the other hand, if the AUC is near 100% then you have a “perfect model” (wanting or not, you must have been giving the model the answer this whole time!). So it is always good to check this plot and check that we are getting a reasonable Area Under the Curve with a nice and closed 95% confidence range.

    # ROC Curve
    mplot_roc <- function(tag, score, model_name = NA, subtitle = NA, interval = 0.2, plotly = FALSE,
    save = FALSE, file_name = "viz_roc.png") {
      require(pROC)
      require(ggplot2)
    
      if (length(tag) != length(score)) {
        message("The tag and score vectors should be the same length.")
        stop(message(paste("Currently, tag has",length(tag),"rows and score has",length(score))))
      }
    
      roc <- pROC::roc(tag, score, ci=T)
      coords <- data.frame(
        x = rev(roc$specificities),
        y = rev(roc$sensitivities))
      ci <- data.frame(roc$ci, row.names = c("min","AUC","max"))
    
      p <- ggplot(coords, aes(x = x, y = y)) +
        geom_line(colour = "deepskyblue", size = 1) +
        geom_point(colour = "blue3", size = 0.9, alpha = 0.4) +
        geom_segment(aes(x = 0, y = 1, xend = 1, yend = 0), alpha = 0.2, linetype = "dotted") + 
        scale_x_reverse(name = "% Specificity [False Positive Rate]", limits = c(1,0), 
                        breaks = seq(0, 1, interval), expand = c(0.001,0.001)) + 
        scale_y_continuous(name = "% Sensitivity [True Positive Rate]", limits = c(0,1), 
                           breaks = seq(0, 1, interval), expand = c(0.001, 0.001)) +
        theme_minimal() + 
        theme(axis.ticks = element_line(color = "grey80")) +
        coord_equal() + 
        ggtitle("ROC Curve: AUC") +
        annotate("text", x = 0.25, y = 0.10, vjust = 0, size = 4.2, 
                 label = paste("AUC =", round(100*ci[c("AUC"),],2))) +
        annotate("text", x = 0.25, y = 0.05, vjust = 0, size = 2.8, 
                 label = paste0("95% CI: ", 
                                round(100*ci[c("min"),],2),"-", 
                                round(100*ci[c("max"),],2)))
      if(!is.na(subtitle)) {
        p <- p + labs(subtitle = subtitle)
      }  
    
      if(!is.na(model_name)) {
        p <- p + labs(caption = model_name)
      }
    
      if (plotly == TRUE) {
        require(plotly)
        p <- ggplotly(p)
      }
    
      if (save == TRUE) {
        p <- p + ggsave(file_name, width = 6, height = 6)
      }
      return(p)
    }
    

    Gives this plot:

    Cuts by quantile

    If we’d have to cut the score in n equal-sized buckets, what would the score cuts be? Is the result a ladder (as it should), or a huge wall, or a valley? Is our score distribution lineal and easy to split?

    mplot_cuts <- function(score, splits = 10, subtitle = NA, model_name = NA, 
                           save = FALSE, file_name = "viz_ncuts.png") {
      
      require(ggplot2)
      
      if (splits > 25) {
        stop("You should try with less splits!")
      }
      
      deciles <- quantile(score, 
                          probs = seq((1/splits), 1, length = splits), 
                          names = TRUE)
      deciles <- data.frame(cbind(Deciles=row.names(as.data.frame(deciles)),
                                  Threshold=as.data.frame(deciles)))
      
      p <- ggplot(deciles, 
                  aes(x = reorder(Deciles, deciles), y = deciles * 100, 
                      label = round(100 * deciles, 1))) + 
        geom_col(fill="deepskyblue") + 
        xlab('') + theme_minimal() + ylab('Score') + 
        geom_text(vjust = 1.5, size = 3, inherit.aes = TRUE, colour = "white", check_overlap = TRUE) +
        labs(title = paste("Cuts by score: using", splits, "equal-sized buckets"))
      
      if(!is.na(subtitle)) {
        p <- p + labs(subtitle = subtitle)
      } 
      if(!is.na(model_name)) {
        p <- p + labs(caption = model_name)
      }
      if (save == TRUE) {
        p <- p + ggsave(file_name, width = 6, height = 6)
      }
      return(p)
    }
    

    Gives this plot:

    Split and compare quantiles

    This parameter is the easiest to sell to the C-level guys. “Did you know that with this model, if we chop the worst 20% of leads we would have avoided 60% of the frauds and only lose 8% of our sales?” That’s what this plot will give you:

    mplot_splits <- function(tag, score, splits = 5, subtitle = NA, model_name = NA, facet = NA, 
                             save = FALSE, file_name = "viz_splits.png") {
      
      require(ggplot2)
      require(dplyr)
      require(RColorBrewer)
      
      if (length(tag) != length(score)) {
        message("The tag and score vectors should be the same length.")
        stop(message(paste("Currently, tag has",length(tag),"rows and score has",length(score))))
      }
      
      if (splits > 10) {
        stop("You should try with less splits!")
      }
    
      df <- data.frame(tag, score, facet)
      npersplit <- round(nrow(df)/splits)
      names % 
        mutate(quantile = ntile(score, splits)) %>% group_by(quantile) %>%
        summarise(n = n(), 
                  max_score = round(100 * max(score), 1), 
                  min_score = round(100 * min(score), 1)) %>%
        mutate(quantile_tag = paste0(quantile," (",min_score,"-",max_score,")"))
      
      p % 
        mutate(quantile = ntile(score, splits)) %>% 
        group_by(quantile, facet, tag) %>% tally() %>%
        ungroup() %>% group_by(facet, tag) %>% 
        arrange(desc(quantile)) %>%
        mutate(p = round(100*n/sum(n),2),
               cum = cumsum(100*n/sum(n))) %>%
        left_join(names, by = c("quantile")) %>%
        ggplot(aes(x = as.character(tag), y = p, label = as.character(p),
                   fill = as.character(quantile_tag))) + theme_minimal() +
        geom_col(position = "stack") +
        geom_text(size = 3, position = position_stack(vjust = 0.5), check_overlap = TRUE) +
        xlab("Tag") + ylab("Total Percentage by Tag") +
        guides(fill = guide_legend(title=paste0("~",npersplit," p/split"))) +
        labs(title = "Tag vs Score Splits Comparison") +
        scale_fill_brewer(palette = "Spectral")
      if(!is.na(subtitle)) {
        p <- p + labs(subtitle = subtitle)
      }  
      if(!is.na(model_name)) {
        p <- p + labs(caption = model_name)
      }
      if(!is.na(facet)) {
        p <- p + facet_grid(. ~ facet, scales = "free")
      }  
      if (save == TRUE) {
        p <- p + ggsave(file_name, width = 6, height = 6)
      }
      return(p)
    }
    

    Gives this plot:

    Finally, let’s plot our results

    Once we have defined these functions above, we can create a new one that will bring everything together into one single plot. If you pay attention to the variables needed to create this dashboard you would notice it actually only needs two: the label or tag, and the score. You can customize the splits for the upper right plot, set a subtitle, define the model’s name, save it in a new folder, change the image’s name.

    mplot_full <- function(tag, score, splits = 8, subtitle = NA, model_name = NA, 
                           save = FALSE, file_name = "viz_full.png", subdir = NA) {
      require(ggplot2)
      require(gridExtra)
      options(warn=-1)
    
      if (length(tag) != length(score)) {
        message("The tag and score vectors should be the same length.")
        stop(message(paste("Currently, tag has",length(tag),"rows and score has",length(score))))
      }
    
      p1 <- mplot_density(tag = tag, score = score, subtitle = subtitle, model_name = model_name)
      p2 <- mplot_splits(tag = tag, score = score, splits = splits)
      p3 <- mplot_roc(tag = tag, score = score)
      p4 <- mplot_cuts(score = score)
    
      if(save == TRUE) {
        if (!is.na(subdir)) {
          dir.create(file.path(getwd(), subdir))
          file_name <- paste(subdir, file_name, sep="/")
        }
        png(file_name, height = 2000, width = 3200, res = 300)
        grid.arrange(
          p1, p2, p3, p4,
          widths = c(1.3,1),
          layout_matrix = rbind(c(1,2), c(1,2), c(1,3), c(4,3)))
        dev.off()
      }
      return(
        grid.arrange(
          p1, p2, p3, p4,
          widths = c(1.3,1),
          layout_matrix = rbind(c(1,2), c(1,2), c(1,3), c(4,3)))
      ) 
    }
    

    That’s it. This dashboard will give us almost everything we need to visually evaluate our model’s performance into the test set.

    One bonus tip for these plots: you can set the subtitle and subdirectory before you plot everything so you don’t have to change it whenever you are trying a new model.

    subtitle <- paste(results$project, "- AUC:", round(100 * results$auc_test, 2))
    subdir <- paste0("Models/", round(100*results$auc_test, 2), "-", results$model_name)
    

    Bonus: Variables Importance

    If you are working with a ML algorithm that let’s you see the importance of each variable, you can use the following function to see the results:

    mplot_importance <- function(var, imp, colours = NA, limit = 15, model_name = NA, subtitle = NA,
                                 save = FALSE, file_name = "viz_importance.png", subdir = NA) {
      
      require(ggplot2)
      require(gridExtra)
      options(warn=-1)
      
      if (length(var) != length(imp)) {
        message("The variables and importance values vectors should be the same length.")
        stop(message(paste("Currently, there are",length(var),"variables and",length(imp),"importance values!")))
      }
      if (is.na(colours)) {
        colours <- "deepskyblue" 
      }
      out <- data.frame(var = var, imp = imp, Type = colours)
      if (length(var) < limit) {
        limit <- length(var)
      }
      
      output <- out[1:limit,]
      
      p <- ggplot(output, 
                  aes(x = reorder(var, imp), y = imp * 100, 
                      label = round(100 * imp, 1))) + 
        geom_col(aes(fill = Type), width = 0.1) +
        geom_point(aes(colour = Type), size = 6) + 
        coord_flip() + xlab('') + theme_minimal() +
        ylab('Importance') + 
        geom_text(hjust = 0.5, size = 2, inherit.aes = TRUE, colour = "white") +
        labs(title = paste0("Variables Importances. (", limit, " / ", length(var), " plotted)"))
      
      if (length(unique(output$Type)) == 1) {
        p <- p + geom_col(fill = colours, width = 0.2) +
          geom_point(colour = colours, size = 6) + 
          guides(fill = FALSE, colour = FALSE) + 
          geom_text(hjust = 0.5, size = 2, inherit.aes = TRUE, colour = "white")
      }
      if(!is.na(model_name)) {
        p <- p + labs(caption = model_name)
      }
      if(!is.na(subtitle)) {
        p <- p + labs(subtitle = subtitle)
      }  
      if(save == TRUE) {
        if (!is.na(subdir)) {
          dir.create(file.path(getwd(), subdir))
          file_name <- paste(subdir, file_name, sep="/")
        }
        p <- p + ggsave(file_name, width=7, height=6)
      }
      
      return(p)
      
    }
    

    Gives this plot:

    Hope you guys enjoyed this post and any further comments or suggestions are more than welcome. Not a programmer here but I surely enjoy sharing my code and ideas! Feel free to connect with me in LinkedIn and/or write below in the comments.

      Related Post

      1. Seaborn Categorical Plots in Python
      2. Matplotlib Library Tutorial with Examples – Python
      3. Visualize the World Cup with R! Part 1: Recreating Goals with ggsoccer and ggplot2
      4. Creating Slopegraphs with R
      5. How to use paletteR to automagically build palettes from pictures

      To leave a comment for the author, please follow the link and comment on their blog: R Programming – DataScience+.

      R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



      If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

      Comments are closed.

      Search R-bloggers


      Sponsors

      Never miss an update!
      Subscribe to R-bloggers to receive
      e-mails with the latest R posts.
      (You will not see this message again.)

      Click here to close (This popup will not appear again)