ggplot2: Waterfall Charts

May 10, 2010

[This article was first published on Learning R, and kindly contributed to R-bloggers.]

Waterfall charts are often used in business analysis to show the cumulative effect of sequentially introduced positive and negative values. They are sometimes also called cascade charts.

In the next few paragraphs I will show how to plot a waterfall chart using ggplot2.


Data

The example uses a very small fictional dataset depicting the changes to a company's cash position, taken from a blog post showing how to prepare a waterfall chart in Tableau.

> balance <- data.frame(desc = c("Starting Cash",
+     "Sales", "Refunds", "Payouts", "Court Losses",
+     "Court Wins", "Contracts", "End Cash"), amount = c(2000,
+     3400, -1100, -100, -6600, 3800, 1400, 2800))
> balance
           desc amount
1 Starting Cash   2000
2         Sales   3400
3       Refunds  -1100
4       Payouts   -100
5  Court Losses  -6600
6    Court Wins   3800
7     Contracts   1400
8      End Cash   2800
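
As a quick sanity check on these figures, the opening balance plus all the intermediate movements should equal the closing balance. A minimal base R sketch (the amounts are copied from the data frame above; `movements` and `closing` are names introduced here for illustration):

```r
# Amounts copied from the balance data frame above
amount <- c(2000, 3400, -1100, -100, -6600, 3800, 1400, 2800)

# Everything except the final "End Cash" row
movements <- head(amount, -1)

# The final row is the closing balance
closing <- tail(amount, 1)

# Opening balance plus all movements should equal the closing balance
stopifnot(sum(movements) == closing)
sum(movements)
```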

To preserve the row order of the data frame when plotting, I convert the desc variable to a factor whose levels follow the order of appearance; id and type variables are also added:

> balance$desc <- factor(balance$desc, levels = balance$desc)
> balance$id <- seq_along(balance$amount)
> balance$type <- ifelse(balance$amount > 0, "in",
+     "out")
> balance[balance$desc %in% c("Starting Cash", "End Cash"),
+     "type"] <- "net"
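
The levels are set explicitly because factor() sorts its levels alphabetically by default, which would reorder the bars along the x-axis. A small base R illustration, using a shortened hypothetical vector `d` rather than the full dataset:

```r
# Hypothetical shortened label vector, in the intended plotting order
d <- c("Starting Cash", "Sales", "Refunds", "End Cash")

# Default behaviour: levels are sorted, so the original order is lost
levels(factor(d))

# Explicit levels: the order of appearance is kept
levels(factor(d, levels = d))
```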

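Next, the data are reworked slightly to give the coordinates of each bar: the first bar starts at zero, every later bar starts where the previous one ended, and the final End Cash bar runs from the closing balance back to zero.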

> balance$end <- cumsum(balance$amount)
> balance$end <- c(head(balance$end, -1), 0)
> balance$start <- c(0, head(balance$end, -1))
> balance <- balance[, c(3, 1, 4, 6, 5, 2)]
> balance
  id          desc type start   end amount
1  1 Starting Cash  net     0  2000   2000
2  2         Sales   in  2000  5400   3400
3  3       Refunds  out  5400  4300  -1100
4  4       Payouts  out  4300  4200   -100
5  5  Court Losses  out  4200 -2400  -6600
6  6    Court Wins   in -2400  1400   3800
7  7     Contracts   in  1400  2800   1400
8  8      End Cash  net  2800     0   2800
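
The coordinate logic can also be wrapped in a small function. The sketch below is an illustration only: `waterfall_coords` is a hypothetical helper name, and it assumes, as in this dataset, that the last element is a closing total rather than a movement.

```r
# Hypothetical helper: compute bar start/end coordinates for a waterfall.
# Assumes the last element of `amount` is a closing total, so its bar
# runs from the running balance back down to zero.
waterfall_coords <- function(amount) {
  running <- cumsum(amount)
  # Each bar ends at the running balance; the closing bar ends at zero
  end <- c(head(running, -1), 0)
  # The first bar starts at zero; the rest start where the previous bar ended
  start <- c(0, head(end, -1))
  data.frame(start = start, end = end)
}

coords <- waterfall_coords(c(2000, 3400, -1100, -100, -6600, 3800, 1400, 2800))
coords
```

The result should match the start and end columns printed above.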

Plotting

Now everything is set to plot the first waterfall chart. geom_rect is used to draw the rectangles using the coordinates calculated in the previous step.

> library(ggplot2)
> ggplot(balance, aes(desc, fill = type)) + geom_rect(aes(x = desc,
+     xmin = id - 0.45, xmax = id + 0.45, ymin = end,
+     ymax = start))
[Figure: waterfall-007.png]

The fill mapping could use some tweaking: my preference is to have outflows in red, inflows in green, and the net position in blue. With the default colour scale, this can be achieved by changing the order of the underlying factor levels.

> balance$type <- factor(balance$type, levels = c("out",
+     "in", "net"))
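
Reordering the levels relies on the default palette assigning colours in level order. An alternative, sketched here as an assumption rather than the approach used in this post, is to state the mapping explicitly with a named vector. The name `type_colours` and the specific colour names are illustrative choices:

```r
# Hypothetical explicit mapping from `type` values to colours;
# the colour names are illustrative choices
type_colours <- c("out" = "red", "in" = "darkgreen", "net" = "blue")

# Only build the scale if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  fill_scale <- ggplot2::scale_fill_manual(values = type_colours)
}
```

Adding `fill_scale` to a plot would then fix the colours regardless of how the factor levels are ordered.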

Almost ready. One more tweak concerns the x-axis labels: the helper function below replaces spaces with newlines, so that long labels wrap onto several lines and become more readable.

> strwr <- function(str) gsub(" ", "\n", str)
> (p1 <- ggplot(balance, aes(fill = type)) + geom_rect(aes(x = desc,
+     xmin = id - 0.45, xmax = id + 0.45, ymin = end,
+     ymax = start)) + scale_y_continuous("", labels = scales::comma) +
+     scale_x_discrete("", breaks = levels(balance$desc),
+         labels = strwr(levels(balance$desc))) +
+     theme(legend.position = "none"))
[Figure: waterfall-011.png]
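
To see what the label helper does, it can be applied to a few of the labels directly. Single-word labels pass through unchanged, while each space in a multi-word label becomes a line break:

```r
# Same helper as in the article: replace each space with a newline
strwr <- function(str) gsub(" ", "\n", str)

wrapped <- strwr(c("Starting Cash", "Sales", "Court Losses"))
wrapped

# cat() renders the embedded newline, much as the axis label would
cat(wrapped[3], "\n")
```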

Finally, the bar labels are added. Positioning them conditionally takes quite a lot of code, as you can see.

> library(scales)
> p1 + geom_text(data = subset(balance, type == "in"),
+     aes(id, end, label = comma(amount)), vjust = 1,
+     size = 3) + geom_text(data = subset(balance,
+     type == "out"), aes(id, end, label = comma(amount)),
+     vjust = -0.3, size = 3) + geom_text(data = subset(balance,
+     type == "net" & id == min(id)), aes(id, end,
+     colour = type, label = comma(end), vjust = ifelse(end <
+         start, 1, -0.3)), size = 3.5) + geom_text(data = subset(balance,
+     type == "net" & id == max(id)), aes(id, start,
+     colour = type, label = comma(start), vjust = ifelse(end <
+         start, -0.3, 1)), size = 3.5)
[Figure: waterfall-013.png]
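
One way to shorten the labelling step is to precompute the label text, position, and vertical justification as columns, so that a single text layer could draw everything. This is a sketch of an alternative, not the approach used above. The data frame `bal` repeats the coordinates computed earlier, the column names `label_y`, `label`, and `vjust` are introduced here for illustration, and base `format()` stands in for `comma()`.

```r
# Coordinates as computed earlier in the article
bal <- data.frame(
  id     = 1:8,
  type   = c("net", "in", "out", "out", "out", "in", "in", "net"),
  start  = c(0, 2000, 5400, 4300, 4200, -2400, 1400, 2800),
  end    = c(2000, 5400, 4300, 4200, -2400, 1400, 2800, 0),
  amount = c(2000, 3400, -1100, -100, -6600, 3800, 1400, 2800)
)

# The closing bar is the last "net" row
is_last <- bal$type == "net" & bal$id == max(bal$id)

# The closing bar is labelled at its start (its end is zero); all others at their end
bal$label_y <- ifelse(is_last, bal$start, bal$end)

# Net bars show the balance, in/out bars show the change
bal$label <- format(ifelse(bal$type == "net", bal$label_y, bal$amount),
                    big.mark = ",", trim = TRUE)

# Inflows and negative balances: text hangs below the label point;
# outflows and non-negative balances: text sits above it
hang_below <- bal$type == "in" | (bal$type == "net" & bal$label_y < 0)
bal$vjust <- ifelse(hang_below, 1, -0.3)

bal[, c("id", "type", "label_y", "label", "vjust")]
```

With these columns in place, a single layer along the lines of geom_text(aes(id, label_y, label = label, vjust = vjust), size = 3) could replace the four layers, at the cost of losing the separate size and colour used for the net labels. This variant has not been checked against the figure above.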
