Build and Evaluate A Logistic Regression Classifier

[This article was first published on business-science.io, and kindly contributed to R-bloggers].

This article is part of R-Tips Weekly, a weekly video tutorial that shows you step-by-step how to do common R coding tasks.


Logistic regression is a simple, yet powerful classification model. In this tutorial, learn how to build a classifier that predicts the age of a vehicle. Then use ggplot to tell the story!

Here are the links to get set up.

The Story

In this analysis we learn that newer vehicles are MORE EFFICIENT, and we’ll make a data visualization that tells the story.


How did we make this plot?

  1. Our logistic regression classifier modeled the data
  2. We used VIP to find the most important features
  3. We visualized with ggplot

Making a Logistic Regression Classifier

Logistic regression is a must-know tool in your data science arsenal.

  • Logistic Regression is easy to explain
  • The classifier, in its basic (unpenalized) form, has no tuning parameters (no knobs that need adjusting)

Simply split the dataset, train on the training set, and evaluate on the testing set.
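
To make the split, train, evaluate recipe concrete, here is a minimal sketch in base R. It is not the code from the video: it assumes the mpg dataset that ships with ggplot2, treats the model year (1999 or 2008) as the class to predict, picks four numeric predictors purely for illustration, and uses an 80/20 split. The names cars, train, test, fit and prob_2008 are made up for this example.

```r
library(ggplot2)  # used here only for the mpg dataset

set.seed(123)  # make the random split reproducible

# Assumption: the target is the model year as a two-level factor (1999, 2008)
cars <- data.frame(
  year  = factor(mpg$year),
  hwy   = mpg$hwy,
  cty   = mpg$cty,
  displ = mpg$displ,
  cyl   = mpg$cyl
)

# 1. Split: 80% of the rows for training, the rest for testing
n         <- nrow(cars)
train_idx <- sample(n, size = floor(0.8 * n))
train     <- cars[train_idx, ]
test      <- cars[-train_idx, ]

# 2. Train: logistic regression is a binomial GLM
fit <- glm(year ~ ., data = train, family = binomial)

# 3. Evaluate: predicted probability of the second factor level ("2008")
prob_2008 <- predict(fit, newdata = test, type = "response")
head(prob_2008)
```

With a factor response, glm() models the probability of the second level, so prob_2008 is the predicted chance that a test vehicle is a 2008 model.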

Folks, it’s that simple.

Full code in the video GitHub Repository

Evaluating Our Classification Model

Question: How do we know if our model is good?
Answer: Area Under the Curve (AUC)!

About AUC:

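  • Simple measure: one number that summarizes how well the model ranks one class above the other.
  • We want greater than 0.5, which is what random guessing scores.
  • Closer to 1.0, the better our model is.
  • Bonus: ROC Plot, a way to visualize the AUC.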
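
To see what the number means, here is a small base R sketch that computes AUC by hand on toy data. It relies on the fact that AUC equals the probability that a randomly chosen positive case gets a higher score than a randomly chosen negative case (ties count as half). The function name auc_rank and the toy vectors are invented for this illustration; in the video a package does this work for you.

```r
# AUC via the rank-sum (Mann-Whitney) formula.
# truth: logical vector, TRUE marks the positive class
# score: numeric vector, higher means "more likely positive"
auc_rank <- function(truth, score) {
  stopifnot(is.logical(truth), length(truth) == length(score))
  r     <- rank(score)          # tied scores receive their average rank
  n_pos <- sum(truth)
  n_neg <- sum(!truth)
  (sum(r[truth]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy example: 2 positives, 2 negatives
truth <- c(TRUE, TRUE, FALSE, FALSE)
score <- c(0.9, 0.4, 0.6, 0.1)

auc_rank(truth, score)
# 0.75: the positives outrank the negatives in 3 of the 4 possible pairs
```

A model that ranks every positive above every negative scores 1.0, and a model that gives everyone the same score lands at 0.5.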



Telling the Story

What can we do with a Logistic Regression Classifier? Let’s develop a story to communicate our insight!


1. First, find the most important features (predictors) using vip().
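
As a rough picture of what an importance score can look like for this kind of model, the sketch below ranks predictors by the absolute value of their z-statistic in a glm fit. My understanding is that vip's model-based importance for glm models is built on the same statistic, but treat that as an assumption and check the vip documentation. The dataset (mpg), the formula, and fitting on all rows instead of only the training set are illustrative choices, not the video's code.

```r
library(ggplot2)  # used here only for the mpg dataset

cars      <- as.data.frame(mpg)
cars$year <- factor(cars$year)   # two classes: 1999 and 2008

# Assumption: a handful of predictors chosen for illustration
fit <- glm(year ~ hwy + cty + displ + cyl + class,
           data = cars, family = binomial)

# z-statistic of each coefficient, intercept dropped
z <- summary(fit)$coefficients[, "z value"]
z <- z[names(z) != "(Intercept)"]

# Larger |z| means the data pin that coefficient further from zero
importance <- sort(abs(z), decreasing = TRUE)
head(importance)
```

Note that a categorical predictor such as class shows up as one entry per level (classcompact, classsuv, and so on), each compared against the baseline level.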




2. Next, use ggplot() to make a visualization that focuses on the top features:

  • HWY: The highway fuel economy (miles per gallon)
  • CLASS: The Vehicle Class (e.g. pickup, subcompact, SUV)
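
Here is one possible way to put those two features on a single chart. It is a hedged sketch and not the plot from the video: it assumes the mpg dataset from ggplot2 and uses side-by-side boxplots of hwy for each class, filled by model year. The titles and labels are placeholders.

```r
library(ggplot2)

# Boxplots of highway mpg per vehicle class, one box per model year
p <- ggplot(mpg, aes(x = class, y = hwy, fill = factor(year))) +
  geom_boxplot() +
  coord_flip() +   # flip so class names read horizontally
  labs(
    title = "Highway fuel economy by vehicle class",
    x     = "Vehicle class",
    y     = "Highway miles per gallon",
    fill  = "Model year"
  ) +
  theme_minimal()

print(p)
```

Mapping year through factor() matters here: left as a number, ggplot would treat it as continuous and would not split each class into one box per year.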



What did we learn using Logistic Regression?

It’s clear now:

  • Vehicles have become more efficient over time.
  • Highway fuel economy has gone up for every single class of vehicle.
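
If you want to sanity-check claims like these against the raw numbers, a quick table helps. The sketch below assumes the mpg dataset from ggplot2 and compares the mean highway mpg per class between the two model years. The names avg_hwy and change are made up for this example, and the raw averages may not line up exactly with what a model-based plot shows.

```r
library(ggplot2)  # used here only for the mpg dataset

# Mean highway mpg for every class (rows) and model year (columns)
avg_hwy <- tapply(mpg$hwy, list(mpg$class, mpg$year), mean)

# Positive values mean the class average went up from 1999 to 2008
change <- avg_hwy[, "2008"] - avg_hwy[, "1999"]

round(cbind(avg_hwy, change), 2)
```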


Your story-telling skills are amazing. Santa approves.


But if you really want to improve your productivity…

Here’s how to master R programming and become powered by R.

What happens after you learn R for Business.

Your Job Performance Review after you’ve launched your first Shiny App.

This is career acceleration.


SETUP R-TIPS WEEKLY PROJECT

  1. Sign Up to Get the R-Tips Weekly (You’ll get email notifications of NEW R-Tips as they are released): https://mailchi.mp/business-science/r-tips-newsletter

  2. Set Up the GitHub Repo: https://github.com/business-science/free_r_tips

  3. Check out the setup video (https://youtu.be/F7aYV0RPyD0). Or hit Pull in the Git menu to get the R-Tips code.

Once you take these actions, you’ll be set up to receive R-Tips with Code every week. =)



