R Boxplot

[This article was first published on R feed, and kindly contributed to R-bloggers.]

A boxplot is a graph that gives us a good indication of how the values in the data are spread out.

Boxplots also give some indication of the data's symmetry and skewness.
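As a rough sketch of what a boxplot summarizes, the five numbers behind the box and whiskers can be computed with base R's fivenum(). The vector x below is made-up illustration data, not part of this tutorial's dataset, and the 1.5 multiplier assumes boxplot()'s default range = 1.5.

```r
# Made-up sample values, used only for illustration
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 12, 30)

# Tukey's five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(x)
names(five) <- c("min", "lower_hinge", "median", "upper_hinge", "max")
print(five)

# Height of the box: distance between the two hinges
box_height <- five[["upper_hinge"]] - five[["lower_hinge"]]

# With the default range = 1.5, points farther than 1.5 box heights
# from the box are drawn individually as outliers
lower_fence <- five[["lower_hinge"]] - 1.5 * box_height
upper_fence <- five[["upper_hinge"]] + 1.5 * box_height
outliers <- x[x < lower_fence | x > upper_fence]
print(outliers)
```

A median that sits closer to one hinge than the other, or one whisker much longer than the other, is the visual hint of skewness mentioned above.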


Dataset to Create Boxplot

In R, we first need to load the dataset for which we want to create the boxplot.

In this tutorial, we will be using the built-in dataset named mtcars to create a boxplot.

Let's look at the first six rows of the dataset we will be using:

# use head() to display the first six rows of the mtcars dataset
head(mtcars)

Output

We will be creating a boxplot from this dataset.


Create boxplot in R

In R, we use the boxplot() function to create a boxplot. For example,

# boxplot for the mpg reading of the mtcars dataset
boxplot(mtcars$mpg)

Output

In the above example, we have used the boxplot() function and the $ operator to create a boxplot of the mpg reading of the mtcars dataset.

We can pass additional parameters to control the way our plot looks.


Add Title, Label, New Color to a Boxplot in R

We can add titles, provide labels for the axes, and change the color of the boxplot in R. For example,

# add title, label, new color to boxplot
boxplot(mtcars$mpg,
  main="Mileage Data Boxplot",
  ylab="Miles Per Gallon(mpg)",
  xlab="No. of Cylinders",
  col="orange")

Output

In the above figure, we can see that we have added a title, labels for the x-axis and y-axis, and changed the color of the boxplot.

Here,

  • main – adds the title "Mileage Data Boxplot"
  • xlab – adds the label "No. of Cylinders" for the x-axis
  • ylab – adds the label "Miles Per Gallon(mpg)" for the y-axis
  • col = "orange" – changes the color of the boxplot to orange
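These arguments can be combined with others that boxplot() accepts. As a sketch, the call below draws the box horizontally with horizontal = TRUE and recolors its outline with border. The color choices and the axis label wording are illustrative assumptions, not part of the example above. Because a single box has no grouping variable, only the value axis is labeled here.

```r
# Horizontal boxplot of mpg; with horizontal = TRUE the values run along the x-axis
bp <- boxplot(mtcars$mpg,
  main = "Mileage Data Boxplot",
  xlab = "Miles Per Gallon (mpg)",
  col = "orange",
  border = "brown",      # color used for the outline of the box
  horizontal = TRUE)

# boxplot() invisibly returns the statistics it drew
print(bp$stats)
```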

Boxplot Formula in R

In R, the boxplot() function can also take a formula of the form y ~ x, where y is a numeric vector that is split into groups according to the values of x.

For example, in our dataset mtcars, the mileage per gallon mpg can be grouped according to the number of cylinders cyl in each car.

Let's take a look at an example,

boxplot(mpg ~ cyl, data = mtcars,
  main = "Mileage Data Boxplot",
  ylab = "Miles Per Gallon(mpg)",
  xlab = "No. of Cylinders",
  col = "orange")

Output

In the above example, we have created a boxplot for the relation between mpg and cyl. Notice the code

boxplot(mpg ~ cyl, data = mtcars,
  ...
)

Here,

  • mpg ~ cyl – mileage per gallon mpg is grouped according to the number of cylinders cyl in cars
  • data = mtcars – data is taken from mtcars dataset

The above figure suggests that cars with fewer cylinders tend to get more miles per gallon.
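To check that reading of the figure numerically, one option is to compute the median mpg within each cyl group, since the median is the thick line drawn inside each box. This sketch uses base R's tapply(); the variable name median_mpg is only for illustration.

```r
# Median mpg for each number of cylinders (the line inside each box)
median_mpg <- tapply(mtcars$mpg, mtcars$cyl, median)
print(median_mpg)

# Number of cars in each group, for context
print(table(mtcars$cyl))
```

The groups are small and of unequal size, so the medians describe this dataset rather than cars in general.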


Add Notch to Boxplot in R

In R, we can add notches to a boxplot to compare the medians of different groups. For example,

boxplot(mpg ~ cyl, data = mtcars,
  main = "Mileage Data Boxplot",
  ylab = "Miles Per Gallon(mpg)",
  xlab = "No. of Cylinders",
  col = "orange",
  notch = TRUE)

Output

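In the above example, we have added notches to the boxplot to compare the medians of the different groups.

Note: If the notches of two boxes do not overlap, this is strong evidence that their medians differ. Overlapping notches only mean that the plot shows no clear evidence of a difference, not that the medians are equal.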
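The notch limits can also be inspected without drawing anything. As a sketch, boxplot() called with plot = FALSE returns a list whose conf component holds the lower and upper notch limits for each group (roughly median ± 1.58 × IQR / sqrt(n), according to the boxplot.stats() documentation). The helper notches_overlap() is a hypothetical name introduced here for illustration.

```r
# Compute the boxplot statistics without plotting
bp <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)

# conf has one column per group: row 1 = lower notch limit, row 2 = upper notch limit
notch_limits <- bp$conf
colnames(notch_limits) <- bp$names
rownames(notch_limits) <- c("lower", "upper")
print(notch_limits)

# Hypothetical helper: do the notches of groups a and b overlap?
notches_overlap <- function(limits, a, b) {
  limits["lower", a] <= limits["upper", b] &&
    limits["lower", b] <= limits["upper", a]
}

print(notches_overlap(notch_limits, "4", "6"))
print(notches_overlap(notch_limits, "6", "8"))
```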
