The yarrr package (0.0.8) is (finally!) on CRAN
Great news, R pirates! The yarrr package, which contains the pirateplot, has been updated to version 0.0.8 and is now up on CRAN (after hiding in plain sight on GitHub). Let’s install the latest version and go over some of the updates:
install.packages("yarrr") # Install package from CRAN
library("yarrr")          # Load the package!
The most important function in the yarrr package is pirateplot(). What the heck is a pirateplot? A pirateplot is a modern way of visualising the relationship between a categorical independent variable and a continuous dependent variable. Unlike traditional plotting methods, such as barplots and boxplots, a pirateplot is an RDI plotting trifecta: it presents Raw data (all data as points), Descriptive statistics (a horizontal line at the mean, or at the value of any other function you wish), and Inferential statistics (95% Bayesian Highest Density Intervals and smoothed densities).
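To make the three RDI ingredients concrete, here is a minimal base-R sketch that computes each of them by hand for one grouping variable. It is not how pirateplot() works internally. The choice of the built-in ChickWeight data (final weights by Diet) is an assumption made for illustration, and the interval is a classical t-based 95% confidence interval standing in for the Bayesian HDI that the pirateplot draws.

```r
# Base-R sketch of the three RDI ingredients (illustration only).
# Assumption: built-in ChickWeight data, weights at the last time point by Diet.
final <- subset(ChickWeight, Time == 21)

# Raw data: every observation, split by group
raw <- split(final$weight, final$Diet)

# Descriptive statistic: one mean per group
group_means <- sapply(raw, mean)

# Inferential statistic (stand-in): classical 95% t-interval per group,
# NOT the Bayesian Highest Density Interval used by pirateplot()
group_ci <- sapply(raw, function(x) t.test(x)$conf.int)
rownames(group_ci) <- c("lower", "upper")

group_means
group_ci
```

A pirateplot shows all three layers in a single figure, so you do not have to line up the points, the means, and the intervals yourself.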
For a full guide to the package, check out the package guide on CRAN. For now, here are some examples of pirateplots showing off some of the package updates.
Up to 3 IVs
You can now include up to three independent variables in your pirateplot. The first IV is presented as adjacent beans, the second is presented in different groups of beans in the same plot, and the third IV is shown in separate plots.
Here is a pirateplot of the heights of pirates based on three separate IVs: headband (whether the pirate wears a headband or not), sex, and eyepatch (whether the pirate wears an eye patch or not):
pirateplot(formula = height ~ sex + headband + eyepatch,
           point.o = .1,
           data = pirates)
Here, we can see that male pirates tend to be the tallest, but there doesn’t seem to be a difference between those who wear headbands and those who don’t, or between those who have eye patches and those who don’t.
New color palettes
The updated package has a few fun new color palettes contained in the piratepal() function. The first, called ‘xmen’, is inspired by my 90s Saturday morning cartoon nostalgia.
# Display the xmen palette
piratepal(palette = "xmen",
          trans = .1, # Slightly transparent colors
          plot.result = TRUE)
Here, I’ll use the xmen palette to plot the distribution of the weights of chickens over time (if someone has a more suitable dataset for the xmen palette let me know!):
pirateplot(formula = weight ~ Time,
           data = ChickWeight,
           main = "Weights of chickens by Time",
           pal = "xmen",
           gl.col = "gray")

mtext(text = "Using the xmen palette!",
      side = 3, font = 3)

mtext(text = "*The mean and variance of chicken\nweights tend to increase over time.",
      side = 1, adj = 1, line = 3.5, font = 3, cex = .7)
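If you want to check the caption's claim without squinting at the beans, a quick base-R sketch like the following tabulates the mean and variance of weight at each measurement time. It uses only the built-in ChickWeight data; the object names are made up for this example.

```r
# Mean and variance of chick weight at each measurement time (base R only)
time_means <- tapply(ChickWeight$weight, ChickWeight$Time, mean)
time_vars  <- tapply(ChickWeight$weight, ChickWeight$Time, var)

# One row per time point, rounded for readability
round(cbind(mean = time_means, var = time_vars), 1)
```

Both columns should be much larger at the last time point than at the first, which is the pattern the plot shows.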
The second palette, called “pony”, is inspired by the Bronies in our IT department.
# Display the pony palette
piratepal(palette = "pony",
          trans = .1, # Slightly transparent colors
          plot.result = TRUE)
Here, I’ll plot the distribution of the lengths of movies as a function of their MPAA ratings (where G is suitable for children, and R is suitable for adults):
pirateplot(formula = time ~ rating,
           data = subset(movies, time > 0 & rating %in% c("G", "PG", "PG-13", "R")),
           pal = "pony",
           point.o = .05,
           bean.o = 1,
           main = "Movie times by rating",
           bean.lwd = 2,
           gl.col = "gray")

mtext(text = "Using the pony palette!",
      side = 3, font = 3)

mtext(text = "*Movies rated for children\n(G and PG) tend to be longer \nthan those rated for adults",
      side = 1, adj = 1, font = 3, line = 3.5, cex = .7)
To see all of the palettes (including those inspired by movies and a transit map of Basel), just run the function with “all” as the palette argument:
piratepal(palette = "all")
Of course, if you find that these color palettes give you a headache, you can always set the plot to grayscale (or any other color) by specifying a single color in the pal argument. Here, I’ll create a grayscale pirateplot showing the distribution of movie budgets by their creative type:
pirateplot(formula = budget ~ creative.type,
           data = subset(movies, budget > 0 &
                           !(creative.type %in% c("Multiple Creative Types", "Factual"))),
           point.o = .02,
           xlab = "Movie Creative Type",
           main = "Movie budgets (in millions) by creative type",
           gl.col = "gray",
           pal = "black")

mtext("Using a grayscale pirateplot",
      side = 3, font = 3)

mtext("*Superhero movies tend to have the highest budgets\n...by far!",
      side = 1, adj = 1, line = 3, cex = .8, font = 3)
Looks like super hero movies have the highest budgets…by far!
Acknowledgements and Comments
– The pirateplot is largely inspired by the great beanplot package by Peter Kampstra.
– Bayesian 95% HDIs are calculated using the truly amazing BayesFactor package by Richard Morey.
– The latest developer version of yarrr is always available at https://github.com/ndphillips/yarrr. Please post any bugs, issues, or feature requests at https://github.com/ndphillips/yarrr/issues