Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I’m part of a local Functional Programming Meetup group which hosts talks, but also coordinates social meetings where we discuss all sorts of FP-related topics including Haskell and other languages. We’ve started running challenges where we all solve a given problem in a language of our choosing then discuss over drinks how they compare.

This month we went with an “easy” problem with a wrinkle – we would solve the ‘strain’ exercise from Exercism (Haskell, Python – your access to these is likely conditional on you being enrolled in that language track) with an extension:

The problem is trivial; the challenge is to solve it in 4 different ways using your language of choice.

The problem itself is given as

Implement the `keep` and `discard` operation on collections. Given a collection and a predicate on the collection’s elements, `keep` returns a new collection containing those elements where the predicate is true, while `discard` returns a new collection containing those elements where the predicate is false.

For example, given the collection of numbers:

`1, 2, 3, 4, 5`

And the predicate:

“is the number even?”

Then your keep operation should produce:

`2, 4`

`1, 3, 5`

but with a restriction:

Keep your hands off that filter/reject/whatchamacallit functionality provided by your standard library! Solve this one yourself using other basic tools instead.

I figured it’s a good opportunity to write as I solve it, so here’s my R solutions.

I’ll define a test case so I can try out things as I go

```test_vec <- c(1, 2, 3, 4, 5)
test_vec
## [1] 1 2 3 4 5```

and the predicates related to ‘even’ and ‘odd’ as functions which return `TRUE` or `FALSE`

```is_even <- function(x) {
x %% 2 == 0
}

is_even(7)
## [1] FALSE
is_even(8)
## [1] TRUE
is_odd <- function(x) {
!is_even(x)
}

is_odd(7)
## [1] TRUE
is_odd(8)
## [1] FALSE```

Firstly, the restriction doesn’t seem to worry me because when I think of “filter” in R I immediately think of `dplyr::filter()` which works on `data.frame` (or `tibble`) objects, and (given the examples) we’re aiming to work with vectors (the problem is stated the same in several languages, so “collection” is a generalisation).

What about `base::Filter()`? The help states

`Filter` extracts the elements of a vector for which a predicate (logical) function gives true.

```Filter(is_even, test_vec)
## [1] 2 4
Filter(is_odd, test_vec)
## [1] 1 3 5```

Yep, that works exactly as I hoped, but is also a built-in “filter” so I can’t use it.

When I think of `keep` and `discard` I do think of the `purrr` functions, and while these, too do exactly what I want

```purrr::keep(test_vec, is_even)
## [1] 2 4
## [1] 1 3 5```

They’re in a library, so I’m going to say they don’t count.

One of the things I like about the way R does subsetting (via the square-bracket `[` which is by itself a function, but requires a matching `]` to satisfy the parser) is that you can use a logical vector to subset another vector,

```c(3, 5, 8, 12)[c(TRUE, FALSE, FALSE, TRUE)]
## [1]  3 12```

which means that if I can produce such a logical vector, say, by applying a predicate function, I can do subsetting that way

```test_vec[is_even(test_vec)]
## [1] 2 4
test_vec[is_odd(test_vec)]
## [1] 1 3 5```

Instead of using `is_odd()` I can just negate the logical vector to get the same effect

```test_vec[!is_even(test_vec)]
## [1] 1 3 5```

I can make those into functions that take a predicate and a vector

```keep_1 <- function(f, x) {
x[f(x)]
}

x[!f(x)]
}```

Testing these

```keep_1(is_even, test_vec)
## [1] 2 4
## [1] 1 3 5```

One down!

One thing to note with this approach is that R is vectorised - I’ve discussed this a few times on this blog (most recently) - which means that these predicate functions will gladly take a vector, not just a single value. This works for the `is_even()` function because inside that, the modulo operator `%%` is itself vectorised, so

```is_even(c(2, 4, 6, 9, 11, 13))
## [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE```

As I wrote in my previous post, thinking like this just becomes so natural in R that I have to force myself to remember that not every language does that.

It’s also worth mentioning that I’m passing a reference to the function `is_even` to our `keep` and `discard` functions - that is serving as our predicate because I need a way to state “is the number even?” which references the number, so I need a function. That doesn’t have to be a named function, though - it could be an “anonymous” function (a “lambda”) if I wanted

```keep_1(function(z) z %% 2 == 0, test_vec)
## [1] 2 4
discard_1(function(z) z %% 2 == 0, test_vec)
## [1] 1 3 5```

I can subset a vector with a logical vector of the same length, specifying whether or not to include that element, but I can also subset by position (keeping in mind that R is a 1-based language which means the first element is indexed by a `1` - why would any language do anything different? 😜)

```c(3, 5, 8, 12)[c(1, 4)]
## [1]  3 12```

The function `which()` takes a logical vector and returns which indices are `TRUE`

```which(c(TRUE, FALSE, FALSE, TRUE))
## [1] 1 4```

so I can use this with our predicate to keep elements

```keep_2 <- function(f, x) {
x[which(f(x))]
}

keep_2(is_even, test_vec)
## [1] 2 4```

However, discarding elements by index doesn’t use a logical negation, it uses a negative sign (`-`)

```discard_2 <- function(f, x) {
x[-which(f(x))]
}

## [1] 1 3 5```

If you look at the source of `Filter()`, you’ll see that I wasn’t far off of exactly that

```Filter
## function (f, x)
## {
##     ind <- as.logical(unlist(lapply(x, f)))
##     x[which(ind)]
## }
## <environment: namespace:base>```

but it still counts.

Another option would be to unpack the elements themselves and do some stepwise comparisons in a loop. For each element `el` in the vector `x`, test if `f(el)` is `TRUE`, and if it is, concatenate `el` to the end of the accumulating `result` vector

```keep_3 <- function(f, x) {
result <- c()
for (el in x) {
if (f(el)) {
result <- c(result, el)
}
}
result
}

keep_3(is_even, test_vec)
## [1] 2 4
result <- c()
for (el in x) {
if (!f(el)) {
result <- c(result, el)
}
}
result
}

## [1] 1 3 5```

Of course, this approach is a Bad Idea™ in general but I’m not optimising anything here. This approach does have the advantage that it isn’t relying on R’s vectorised capabilities, since each element is passed to the predicate function individually, so if I did have a non-vectorized predicate function, this would still work.

I really want a “weird” way to do this. R has plenty of weird to go around, but since I’ve been learning some Haskell, and the challenge originally referenced the Haskell solution, what if I code a Haskell-esque solution?

Haskell makes good use of recursive functions. Any loop can be written as a recursion (and vice-versa) so the previous solution is a good starting point. First, I define a base case; if I run out of numbers to process, return `NULL`. A convenient feature of R vectors is that `NULL`s are dropped

```c(1, 2, NULL, 3, 4, NULL, 5)
## [1] 1 2 3 4 5```

Otherwise, I can take the first value in the vector and test it with the predicate. If it returns `TRUE` I can append it to what I’ve calculated so far, and recursively call the function again with the rest of the vector. That could look like

```keep_4 <- function(f, x) {
if (!length(x)) return(NULL)
if (f(x[1])) {
return(c(x[1], Recall(f, x[-1])))
} else {
return(Recall(f, x[-1]))
}
}

keep_4(is_even, test_vec)
## [1] 2 4```

Some interesting points about this: the `Recall()` function is nice for defining a recursive function. I could have used `keep_4` there, but the advantage of this implementation is that I can rename the function and it still works as expected

```keep_4_also <- keep_4
rm("keep_4")

keep_4_also(is_even, test_vec)
## [1] 2 4```

If I had explicitly referenced `keep_4` inside itself, that recursion would fail with this renaming.

The negative subsetting works as described above; `x[-1]` means “not including the first element”. Lastly, testing `if (!length(x))` works because `0` can be coerced to `FALSE` and any other value to `TRUE`, so if the `length` of `x` is not `0`, this condition is met.

The discarding variant is similar, just with the two `returns()` around the other way, or

```discard_4 <- function(f, x) {
if (!length(x)) return(NULL)
if (!f(x[1])) {
return(c(x[1], Recall(f, x[-1])))
} else {
return(Recall(f, x[-1]))
}
}

## [1] 1 3 5```

There we go; 4 hand-coded implementations of `keep` and `discard` in R.

Can you think of another that doesn’t use `Filter()` or an external library? Let me know in the comments below or on Mastodon. I’m looking forward to seeing how people solved this in other languages.

devtools::session_info()
```## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.2 (2021-11-01)
##  os       Pop!_OS 22.04 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_AU.UTF-8
##  ctype    en_AU.UTF-8
##  date     2023-08-30
##  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.17    2023-05-16 [1] CRAN (R 4.1.2)
##  bookdown      0.29    2022-09-12 [1] CRAN (R 4.1.2)
##  bslib         0.5.0   2023-06-09 [3] CRAN (R 4.3.1)
##  cachem        1.0.8   2023-05-01 [3] CRAN (R 4.3.0)
##  callr         3.7.3   2022-11-02 [3] CRAN (R 4.2.2)
##  cli           3.6.1   2023-03-23 [3] CRAN (R 4.2.3)
##  crayon        1.5.2   2022-09-29 [3] CRAN (R 4.2.1)
##  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.1.2)
##  digest        0.6.33  2023-07-07 [3] CRAN (R 4.3.1)
##  ellipsis      0.3.2   2021-04-29 [3] CRAN (R 4.1.1)
##  evaluate      0.21    2023-05-05 [3] CRAN (R 4.3.0)
##  fastmap       1.1.1   2023-02-24 [3] CRAN (R 4.2.2)
##  fs            1.6.3   2023-07-20 [3] CRAN (R 4.3.1)
##  glue          1.6.2   2022-02-24 [3] CRAN (R 4.2.0)
##  htmltools     0.5.6   2023-08-10 [3] CRAN (R 4.3.1)
##  htmlwidgets   1.5.4   2021-09-08 [1] CRAN (R 4.1.2)
##  httpuv        1.6.6   2022-09-08 [1] CRAN (R 4.1.2)
##  jquerylib     0.1.4   2021-04-26 [3] CRAN (R 4.1.2)
##  jsonlite      1.8.7   2023-06-29 [3] CRAN (R 4.3.1)
##  knitr         1.43    2023-05-25 [3] CRAN (R 4.3.0)
##  later         1.3.0   2021-08-18 [1] CRAN (R 4.1.2)
##  lifecycle     1.0.3   2022-10-07 [3] CRAN (R 4.2.1)
##  magrittr      2.0.3   2022-03-30 [3] CRAN (R 4.2.0)
##  memoise       2.0.1   2021-11-26 [3] CRAN (R 4.2.0)
##  mime          0.12    2021-09-28 [3] CRAN (R 4.2.0)
##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.1.2)
##  pkgbuild      1.4.0   2022-11-27 [1] CRAN (R 4.1.2)
##  pkgload       1.3.0   2022-06-27 [1] CRAN (R 4.1.2)
##  prettyunits   1.1.1   2020-01-24 [3] CRAN (R 4.0.1)
##  processx      3.8.2   2023-06-30 [3] CRAN (R 4.3.1)
##  profvis       0.3.7   2020-11-02 [1] CRAN (R 4.1.2)
##  promises      1.2.0.1 2021-02-11 [1] CRAN (R 4.1.2)
##  ps            1.7.5   2023-04-18 [3] CRAN (R 4.3.0)
##  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.1.2)
##  R6            2.5.1   2021-08-19 [3] CRAN (R 4.2.0)
##  Rcpp          1.0.9   2022-07-08 [1] CRAN (R 4.1.2)
##  remotes       2.4.2   2021-11-30 [1] CRAN (R 4.1.2)
##  rlang         1.1.1   2023-04-28 [1] CRAN (R 4.1.2)
##  rmarkdown     2.23    2023-07-01 [3] CRAN (R 4.3.1)
##  rstudioapi    0.15.0  2023-07-07 [3] CRAN (R 4.3.1)
##  sass          0.4.7   2023-07-15 [3] CRAN (R 4.3.1)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.2)
##  shiny         1.7.2   2022-07-19 [1] CRAN (R 4.1.2)
##  stringi       1.7.12  2023-01-11 [3] CRAN (R 4.2.2)
##  stringr       1.5.0   2022-12-02 [1] CRAN (R 4.1.2)
##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.1.2)
##  usethis       2.1.6   2022-05-25 [1] CRAN (R 4.1.2)
##  vctrs         0.6.3   2023-06-14 [1] CRAN (R 4.1.2)
##  xfun          0.40    2023-08-09 [3] CRAN (R 4.3.1)
##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.1.2)
##  yaml          2.3.7   2023-01-23 [3] CRAN (R 4.2.2)
##
##  [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.1
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/lib/R/site-library
##  [4] /usr/lib/R/library
##
## ──────────────────────────────────────────────────────────────────────────────```