# Mastering Data Visualization in R: Plotting Predicted Values with the mtcars Dataset

**Steve's Data Tips and Tricks**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# Introduction

Data visualization is a powerful tool in a data scientist’s toolkit. It not only helps us understand our data but also presents it in a way that is easy to comprehend. In this blog post, we will explore how to plot predicted values in R using the mtcars dataset. We will train a simple regression model to predict the miles per gallon (mpg) of cars based on their attributes and then visualize the predictions. By the end of this tutorial, you’ll have a clear understanding of how to plot predicted values and can apply this knowledge to your own data analysis projects.

**Step 1: Load the Required Libraries**

Before we dive into the code, let’s make sure we have the necessary libraries installed. We’ll be using `ggplot2`

for plotting and `caret`

for model training and evaluation. You can install them if you haven’t already using:

install.packages("ggplot2") install.packages("caret")

Now, let’s load the libraries:

library(ggplot2) library(caret)

**Step 2: Load and Explore the Data**

We’ll use the classic `mtcars`

dataset, which contains various attributes of different car models. Our goal is to predict the fuel efficiency (mpg) of these cars. Let’s load and explore the dataset:

head(mtcars)

mpg cyl disp hp drat wt qsec vs am gear carb Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1

This will display the first few rows of the dataset, giving you an idea of what it looks like.

**Step 3: Split the Data into Training and Testing Sets**

Before we proceed with modeling and prediction, we need to split our data into training and testing sets. We’ll use 80% of the data for training and the remaining 20% for testing:

set.seed(123) # for reproducibility splitIndex <- createDataPartition(mtcars$mpg, p = 0.8, list = FALSE) training_data <- mtcars[splitIndex, ] testing_data <- mtcars[-splitIndex, ]

**Step 4: Build a Simple Linear Regression Model**

Now, let’s build a simple linear regression model to predict `mpg`

based on other attributes. We’ll use the `lm()`

function:

model <- lm(mpg ~ ., data = training_data)

This line of code fits the linear regression model using the training data.

**Step 5: Make Predictions**

With our model trained, we can now make predictions on the testing data:

predictions <- predict(model, newdata = testing_data)

**Step 6: Create a Scatter Plot of Predicted vs. Actual Values**

The most exciting part is visualizing the predicted values. We can do this using a scatter plot. Let’s create one:

# Combine actual and predicted values plot_data <- data.frame(Actual = testing_data$mpg, Predicted = predictions) # Create a scatter plot ggplot(plot_data, aes(x = Actual, y = Predicted)) + geom_point() + geom_smooth(method = "lm", formula = y ~ x, color = "red") + labs( x = "Actual MPG", y = "Predicted MPG", title = "Actual vs. Predicted MPG" ) + theme_minimal()

This code generates a scatter plot with the actual MPG values on the x-axis and predicted MPG values on the y-axis. The red line represents a linear regression line that helps us see how well our predictions align with the actual data.

Here is how we also plot the data in base R.

# Combine actual and predicted values plot_data <- data.frame(Actual = testing_data$mpg, Predicted = predictions) # Create a scatter plot plot(plot_data$Actual, plot_data$Predicted, xlab = "Actual MPG", ylab = "Predicted MPG", main = "Actual vs. Predicted MPG", pch = 19, col = "blue") # Add a regression line abline(lm(Predicted ~ Actual, data = plot_data), col = "red")

# Conclusion

Congratulations! You’ve successfully learned how to plot predicted values in R using the mtcars dataset. Visualization is a vital part of data analysis, and it can provide valuable insights into the performance of your predictive models.

I encourage you to try this on your own datasets and explore more advanced visualization techniques. Experiment with different models and datasets to gain a deeper understanding of data visualization in R. Happy coding!

**leave a comment**for the author, please follow the link and comment on their blog:

**Steve's Data Tips and Tricks**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.