August 9, 2017
By

(This article was first published on Random R Ramblings, and kindly contributed to R-bloggers)

## Introduction

I was recently introduced to the Twitter Waterflow Problem and I decided it was interesting enough to try and complete the challenge in R. Consider the following picture: This plot shows a series of walls and empty valleys. We can represent this picture by an array of integers, where the value at each index is the height of the wall. So in this case, the array of integers can be defined as:

``wall <- c(2, 5, 1, 2, 3, 4, 7, 7, 6)``

Now imagine it rains. How much water is going to be accumulated in puddles between walls? No puddles are formed at edges of the wall and water is considered to simply run off the edge. We count volume in square blocks of 1×1. Thus, we are left with a puddle between column 2 and column 7 and the volume is 10.

## The Approach

The approach I took was one of many ways you could solve this problem. I chose to treat each column (wall) as an index and then for each index I implement a loop:

• Find the height of the current index
• Find the maximum height of the walls to the left of the current index
• Find the maximum height of the walls to the right of the index

I then work out what the smallest maximum height is between the maximum heights to the left and right of the current index. If this smallest height minus the current index height is greater than zero, then I know how many blocks will fill with water for the current index. Of course, if the smallest maximum height to the left or right is less than the current height, then I get the run off. Converting this to code looks like this:

``````len <- length(wall)
# pre-allocate memory to make the loop more efficient
water <- rep(0, len)
for (i in seq_along(wall)) {
currentHeight <- wall[i]
maxLeftHeight <- if (i > 1) {
max(wall[1:(i - 1)])
} else {
0
}
maxRightHeight <- if (i == len) {
0
} else {
max(wall[(i + 1):len])
}
smallestMaxHeight <- min(maxLeftHeight, maxRightHeight)
water[i] <- if (smallestMaxHeight - currentHeight > 0) {
smallestMaxHeight - currentHeight
} else {
0
}
}
water
#  0 0 4 3 2 1 0 0 0``````

This returns a vector of the number of blocks of water at each index.

## The R6 Class

For this problem I chose to use the `R6` class system because it is very self contained. The `R6` class system is different from the functional `S3` and `S4` class systems found in base R in that it is an encapsulated class system. Some key differences between the two are:

Functional:

• Objects contain data
• Class methods are separate from objects
• Objects are immutable – they cannot be changed after they have been created

Encapsulated:

• Objects contain methods and data
• Objects are mutable

Winston Chang details these differences very well in his userR!2017 talk.

I wanted the user to be able to acquire two key pieces of information: the total blocks of water and the filled plot. I therefore created three `public` methods inside an `R6` class and placed the class in a simple package which can be found here. `public` methods are a list of functions (and/or non-functions) which are essentially methods (or data) of the class that are intended to be used by the user of the class.

When writing a new `R6` class, we often want to perform some initial functionality when we instantiate an object of this class. This is true in our case as when we instantiate a new object of class `waterflow`, we want it to calculate the heights of the water straight away. We therefore call our earlier function `initialize` and place it inside the `public` list.

``````initialize = function(wall = NULL) {
if (is.null(wall)) {
}
if (!is.numeric(wall)) {
}
len <- length(wall)
water <- rep(0, len)
for (i in seq_along(wall)) {
currentHeight <- wall[i]
maxLeftHeight <- if (i > 1) {
max(wall[1:(i - 1)])
} else {
0
}
maxRightHeight <- if (i == len) {
0
} else {
max(wall[(i + 1):len])
}
smallestMaxHeight <- min(maxLeftHeight, maxRightHeight)
water[i] <- if (smallestMaxHeight - currentHeight > 0) {
smallestMaxHeight - currentHeight
} else {
0
}
}
private\$heights <- private\$tidyWater(water, wall)
}``````

Note, we have one key difference here, we are assigning the results to members of the `private` list within our object. Once we have calculated our `water` heights, we use them along with our `wall` heights in the function `private\$tidyWater()` and assign the resulting `data.frame` to `private\$heights`. The `private` argument is a list which contains methods (and/or data) which are internal to the class and are not intended to be used by the user. We do similar things when writing packages – we explicitly export the functionality we want other people to use and don’t export functionality that is only used within the package itself.

So to use the class, first we instantiate the class with some data in a call to `\$new()` which in turn runs the above `initialize` function.

``````library(waterflow)
wall <- c(2, 5, 1, 2, 3, 4, 7, 7, 6)
p <- waterflow\$new(wall)``````

So now we have an object called `p` which is of class `waterflow` (and `R6`). `p` contains the data, as well as the (`public`) methods we can perform on that data.

Typically when we return `p`, `R6` objects have a default `print` method that lists all members of the object but here there is a custom `\$print()` function that returns `heights`.

``````p
#    pos  type val
# 1    1 water   0
# 2    2 water   0
# 3    3 water   4
# 4    4 water   3
# 5    5 water   2
# 6    6 water   1
# 7    7 water   0
# 8    8 water   0
# 9    9 water   0
# 10   1  wall   2
# 11   2  wall   5
# 12   3  wall   1
# 13   4  wall   2
# 14   5  wall   3
# 15   6  wall   4
# 16   7  wall   7
# 17   8  wall   7
# 18   9  wall   6``````

We can still return the members of the object by looking at the structure of `p`.

``````str(p)
# Classes 'waterflow', 'R6'
#   Public:
#     clone: function (deep = FALSE)
#     initialize: function (wall = NULL)
#     plot: function ()
#     print: function ()
#     total: function ()
#   Private:
#     heights: data.frame
#     tidyWater: function (water, wall)``````

To calculate the total water that fills the valleys, we sum over the `heights` object for the values of the `water`.

``````total = function() {
sum(private\$heights[private\$heights\$type %in% "water", "val"])
}``````

Calling the `public` method `\$total()` gives the expected result.

``````p\$total()
#  10``````

To call the plot, we access the `public` method `\$plot()`.

``p\$plot()`` The completed class looks like this

``````waterflow <- R6Class(
"waterflow",
public = list(
initialize = function(wall = NULL) {
if (is.null(wall)) {
}
if (!is.numeric(wall)) {
}
len <- length(wall)
water <- rep(0, len)
for (i in seq_along(wall)) {
currentHeight <- wall[i]
maxLeftHeight <- if (i > 1) {
max(wall[1:(i - 1)])
} else {
0
}
maxRightHeight <- if (i == len) {
0
} else {
max(wall[(i + 1):len])
}
smallestMaxHeight <- min(maxLeftHeight, maxRightHeight)
water[i] <- if (smallestMaxHeight - currentHeight > 0) {
smallestMaxHeight - currentHeight
} else {
0
}
}
private\$heights <- private\$tidyWater(water, wall)
},
plot = function() {
ggplot(private\$heights) +
geom_col(
aes(x = pos + 1 / 2, y = val, fill = type),
width = 1, show.legend = FALSE
) +
scale_fill_manual(values = c("dodgerblue2", "grey50")) +
scale_x_continuous(breaks = seq(0, max(private\$heights\$pos), 1)) +
theme(
panel.background = element_blank(),
panel.ontop = TRUE,
panel.grid.minor.x = element_blank(),
panel.grid.minor.y = element_line(colour = "white", size = 0.5),
panel.grid.major.x = element_line(colour = "white", size = 0.5),
panel.grid.major.y = element_line(colour = "white", size = 0.5),
axis.ticks = element_blank(),
text = element_blank()
)
},
print = function() print(private\$heights),
total = function() {
sum(private\$heights[private\$heights\$type %in% "water", "val"])
}
),
private = list(
heights = NULL,
tidyWater = function(water, wall) {
data.frame(
pos = seq_along(wall),
type = factor(
rep(c("water", "wall"), each = length(wall)),
levels = c("water", "wall")
),
val = c(water, wall)
)
}
)
)``````

Overall `R6` is a great package with one or two downsides but these are mostly not too worrying. For one thing, `R6` is a separate package that users have to load or use as an import but given that it is such a lightweight package and doesn’t have any imports of its own, it’s nothing to be concerned about. Also, `R6` doesn’t have any strict type checking but this is remedied by including your own type checking in your methods. The main bug bear I have with `R6` is the lack of support from `Roxygen2` out of the box and this has been an open issue for a while.

Also posted on the Mango Solutions blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...