One of the Best and Most Underutilized Graphs in ggplot2

March 15, 2016

(This article was first published on Hallway Mathlete, and kindly contributed to R-bloggers)


Showing how the distribution of a variable changes over time can make a great visualization. These highly intuitive graphics display a lot of information and are simple to render in R using ggplot2. However, in my experience, they are among the most underutilized graphs in R.

A good example of this style of graph comes from my research, which studies how data analysis can be used to improve product design and the manufacturing process. The style of graph discussed in this post is extremely useful for showing how design specifications change over time. Below you can see an example of how the specifications of secondary cameras on cellphones have changed over time. It is easy to see that before 2011 there were almost no secondary cameras, and that by 2015 almost all cellphones released had some form of secondary camera.


To create these plots, first let's load ggplot2 and the diamonds data set.

library(ggplot2)
data(diamonds)
head(diamonds)


When creating these plots, I like to make sure I understand how the data is distributed along the x-axis. This is helpful because if a section of the x-axis has far fewer data points, the distribution on the y-axis can change rapidly there simply because the sample is small.
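One way to run this check is sketched below. It is only an illustration, not part of the original analysis: the histogram, the choice of 50 bins, the four price bands, and the names p_hist and band_counts are all my own assumptions.

```r
library(ggplot2)
data(diamonds)

# Histogram of price: tall bars mean many diamonds, short bars mean few.
# bins = 50 is an arbitrary choice for illustration.
p_hist <- ggplot(data = diamonds, aes(x = price)) +
  geom_histogram(bins = 50)
print(p_hist)

# The same check as numbers: count diamonds in four equal-width price bands.
# Note: this is the base R function cut(), not the diamonds column named cut.
band_counts <- table(cut(diamonds$price, breaks = 4))
print(band_counts)
```

If one band holds only a small fraction of the rows, treat sharp changes in that part of the later plots with some caution.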

The plot below shows the distribution of diamonds grouped by cut as the price changes.

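# Stacked density of price, one band per cut.
# position is an argument of the geom, not an aesthetic, so it is set in geom_density().
ggplot(data=diamonds, aes(x=price, group=cut, fill=cut)) + 
  geom_density(adjust=1.5, position="stack")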
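
In the next plots, the y-axis no longer shows the density itself. With position="fill", the groups are stacked and rescaled so that they sum to 1 at every price, which shows the share taken by each group (cut in the first example, clarity in the second) as the price changes.
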
# Bands for each cut, rescaled to fill the y-axis at every price
ggplot(data=diamonds, aes(x=price, group=cut, fill=cut)) + 
  geom_density(adjust=1.5, position="fill")

# The same plot grouped by clarity
ggplot(data=diamonds, aes(x=price, group=clarity, fill=clarity)) +
  geom_density(adjust=1.5, position="fill")
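
One caveat on the filled plots: by default geom_density() scales each group's curve to integrate to 1, so a small group and a large group carry equal weight before the rescaling. If the bands should reflect how many diamonds are actually in each group at a given price, a possible variation is to map y to the count-scaled density. This is a sketch rather than the code used for this post; it assumes ggplot2 3.3.0 or later for after_stat(), and the name p_share and the axis label are my own.

```r
library(ggplot2)
data(diamonds)

# Map y to the count-scaled density so larger groups contribute more;
# position = "fill" then rescales the stack to sum to 1 at each price.
# after_stat() assumes ggplot2 3.3.0 or later.
p_share <- ggplot(data = diamonds,
                  aes(x = price, y = after_stat(count), fill = cut)) +
  geom_density(adjust = 1.5, position = "fill") +
  labs(y = "share of diamonds")
print(p_share)
```

Swapping fill = cut for fill = clarity gives the matching version of the clarity plot.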

