# ggside: Plot Linear Regression using Marginal Distributions (ggplot2 extension)

[This article was first published on **business-science.io**, and kindly contributed to R-bloggers.]

This article is part of R-Tips Weekly, a weekly video tutorial that shows you step-by-step how to do common R coding tasks.

# What are Marginal Distributions?

And how can I use them to uncover complex relationships?

**Marginal Distribution (Density) plots** extend a plot of numeric data with side panels that show the density of each variable (histograms or box plots work too).

Marginal distribution plots were made popular by the side panels of seaborn's `jointplot()` in Python.

# How do we make them in ggplot2?

**Marginal distributions can now be made in R using ggside, a new ggplot2 extension**. You can make linear regression with marginal distributions using histograms, densities, box plots, and more. Bonus – The side panels are super customizable for uncovering complex relationships.
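As a minimal sketch of the idea (not the tutorial's exact code), side histograms can be added to an ordinary scatter plot with `geom_xsidehistogram()` and `geom_ysidehistogram()` from ggside. The choice of `hwy` and `cty` from the `mpg` data and the bin width of 1 are assumptions made for illustration.

```r
library(ggplot2)
library(ggside)

# Scatter plot with a histogram of x in the top panel and a histogram of y
# in the right panel. Variables and binwidth are illustrative assumptions.
p_hist <- ggplot(mpg, aes(x = hwy, y = cty)) +
  geom_point(alpha = 0.3) +
  geom_xsidehistogram(binwidth = 1) +
  geom_ysidehistogram(binwidth = 1)

p_hist
```

The side layers are added with `+` like any other geom, so the main plot does not need to change.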

Here are **two examples** of what you can (and will) do in this tutorial!

### Example 1:

Linear Regression with Marginal Distribution (Density) Side-Plots (Top and Right)

### Example 2:

Facet-Plot with Marginal Box Plots (Top)

# Before we get started, get the Cheat Sheet

`ggside` is great for making marginal distribution side plots. But, you’ll still need to learn how to visualize data with ggplot2. For those topics, I’ll use the Ultimate R Cheat Sheet to refer to `ggplot2` code in my workflow.

### Quick Example:

Download the Ultimate R Cheat Sheet. **Then click the “CS” next to “ggplot2”**, which opens the Data Visualization with ggplot2 Cheat Sheet.

Now you’re ready to quickly reference `ggplot2` functions.

# Load Libraries & Data

The libraries we’ll need today are patchwork, ggridges, ggrepel, maps, tidyverse, and lubridate. All packages are available on CRAN and can be installed with `install.packages()`. Note – I’m using the development version of `ggside`, which is what I recommend in the YouTube Video.

The dataset is the mpg data that comes with ggplot2.
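A minimal setup sketch, assuming the two plots in this tutorial only need ggplot2 (which tidyverse also attaches) and ggside. The GitHub repository named in the comment is an assumption about where the development version lives; the CRAN release installs with `install.packages("ggside")`.

```r
# Assumed install steps (run once, uncomment as needed):
# install.packages(c("ggplot2", "ggside"))
# remotes::install_github("jtlandis/ggside")  # development version (assumed repo)

library(ggplot2)
library(ggside)

# mpg ships with ggplot2; these are the columns used in the plots
mpg_cols <- c("hwy", "cty", "class", "cyl")
head(mpg[, mpg_cols])
```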

# Plot 1. Linear Regression with Marginal Distribution Plot

Replicating Seaborn’s jointplot()

We’ll start by replicating what you can do with **Python’s Seaborn `jointplot()`**. We’ll accomplish this with `ggside::geom_xsidedensity()`.

### We set up the plot just like a normal ggplot.

Refer to the Ultimate R Cheat Sheet for:

- `ggplot()`
- `geom_point()`
- `geom_smooth()`
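The base plot built from these three functions might look like the sketch below. The axis mapping (`hwy` on x, `cty` on y), the coloring by `class`, and the point styling are assumptions for illustration.

```r
library(ggplot2)

# Scatter plot of highway vs. city fuel economy, colored by vehicle class
p_base <- ggplot(mpg, aes(x = hwy, y = cty, color = class)) +
  geom_point(size = 2, alpha = 0.3) +
  # color = NULL drops the class grouping, so one smoother is fit to all points
  # (ggplot2 picks loess by default for a dataset of this size)
  geom_smooth(aes(color = NULL), se = TRUE)

p_base
```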

### Next we add from ggside:

- `geom_xsidedensity()` – Adds a side density panel (top panel).
- `geom_ysidedensity()` – Adds a side density panel (right panel).

The trick is using `after_stat(density)`, which maps the computed density to the side panel's free axis and makes an awesome-looking marginal density side panel. I increased the size of the marginal density panels with `theme(ggside.panel.scale.x)`.
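Putting the pieces together, a sketch of the full plot could look like this. The stacked position, alpha, labels, and the 0.4 panel scale are assumptions rather than the tutorial's confirmed settings, and `ggside.panel.scale.y` is added here on the assumption that the right panel should be enlarged the same way as the top one.

```r
library(ggplot2)
library(ggside)

p1 <- ggplot(mpg, aes(x = hwy, y = cty, color = class)) +
  geom_point(size = 2, alpha = 0.3) +
  geom_smooth(aes(color = NULL), se = TRUE) +
  # Top panel: density of hwy; the computed density goes on the panel's y axis
  geom_xsidedensity(
    aes(y = after_stat(density), fill = class),
    alpha = 0.5, position = "stack"
  ) +
  # Right panel: density of cty; the computed density goes on the panel's x axis
  geom_ysidedensity(
    aes(x = after_stat(density), fill = class),
    alpha = 0.5, position = "stack"
  ) +
  labs(
    title = "Fuel Economy by Vehicle Type",
    x = "Highway MPG", y = "City MPG"
  ) +
  # Side panels take 40% of the main panel size (assumed value)
  theme(
    ggside.panel.scale.x = 0.4,
    ggside.panel.scale.y = 0.4
  )

p1
```

Mapping `fill = class` inside the side layers is what lets the stacked densities show which vehicle classes sit at each end of the MPG range.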

### Loess Regression w/ Marginal Density

We generate the regression plot with marginal distributions (density) to highlight key differences between the automobile classes. We can see:

- Pickup, SUV – Have the **lowest** Highway Fuel Economy (MPG)
- 2seater, Compact, Midsize, Subcompact – Have the **highest** Highway Fuel Economy

# Need help learning ggplot2?

In the R for Business Analysis (DS4B 101-R) Course, I teach 5 hours just on ggplot2. Learn:

- Geometries
- Scales
- Themes
- And advanced customizations: Labeled Heat Maps and Lollipop Charts

# Plot 2. Faceted Side-Panels

Next, let’s try out some advanced functionality. I want to see how ggside handles faceted plots, which are subplots that vary based on a categorical feature. We’ll use the “cyl” column to facet, which is for engine size (number of cylinders).

### Faceted Side Panels? No problem.

Awesome! I have included facets by “cyl”, which creates four plots based on the engine size. ggside picked up on the facets and has made 4 side-panel plots.
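A faceted plot with top box-plot panels can be sketched as follows. This is one possible version under stated assumptions: `facet_grid()` with free x scales, horizontal box plots of the x variable per class (`aes(y = class)` with `orientation = "y"`), and `scale_xsidey_discrete()` to give the side panel its own discrete axis are illustrative choices, not the tutorial's confirmed code.

```r
library(ggplot2)
library(ggside)

p2 <- ggplot(mpg, aes(x = cty, y = hwy, color = class)) +
  geom_point() +
  geom_smooth(aes(color = NULL)) +
  # Top panel: one horizontal box plot of cty per vehicle class
  geom_xsideboxplot(aes(y = class), orientation = "y", alpha = 0.5) +
  # The side panel's y axis is discrete (class), unlike the main y axis (hwy)
  scale_xsidey_discrete() +
  # One column of panels per number of cylinders
  facet_grid(cols = vars(cyl), scales = "free_x") +
  labs(title = "Fuel Economy by Engine Size (Cylinders)") +
  theme(ggside.panel.scale.x = 0.4)

p2
```

Each facet gets its own side panel, computed only from the rows in that facet.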

# Amazing. ggside just works.

**Congrats. You just quickly made two report-quality plots with ggplot2 and ggside. Excellent work.**

# But it gets better

You’ve just scratched the surface.

What is the best way to become proficient in data science?

You’re probably thinking:

- There’s so much to learn.
- My time is precious.

I have good news that will put those doubts behind you.

You can learn data science with my state-of-the-art Full 5 Course R-Track System.

# Become the data science expert in your organization.

Get Five of our Premium R Courses that Build Expert-Level Machine Learning Skills, Web Application Skills, & Time Series Skills.

Full 5 Course R-Track System

Taking these courses is equivalent to:

- 9-Months of Methodical Code-Based Learning.
- 250+ tool-based MOOC courses.
- Education Comparable to 9-months of University Courses.
- 5 end-to-end projects.
- 5 frameworks.

## Unlock the Full 5 Course R-Track System
