Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
A monad is a very useful pattern. If you don’t know what it is, then read my post from a few years ago, watch this excellent video, and you can also read J. Carroll’s insightful post. Once you think you get the concept, this blog post will be much more enjoyable. That’s because I’d like to focus on why monads don’t seem to want to spread in the R language.
After all, R is a functional programming language, and it would be natural for monads to be part of the basic toolkit, like in Haskell or Rust, yet this isn’t the case. There have been and are attempts: for example the {chronicler} or the {maybe} package, and perhaps more. But I’ve never seen anyone use them in production, at least not in my field, bioinformatics. In this post, I’d like to think through why this is the case.
What does a monad need?
- A data type that stores “plain” types (duh…)
- A function that wraps data into the monad (sometimes called `return` or `pure`)
- A composition operator (`bind`, `>>=`) that allows us to chain smaller steps into a process even in the monad realm
- Optionally, an unwrapping function that extracts values from the container
- It’s also useful to have functions that use the monad data type. 🙂
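To make the checklist concrete, here is a minimal sketch of a Maybe monad in base R. All names (`just`, `nothing`, `unit`, `bind`, `%>>=%`, `unwrap`, `safe_sqrt`, `safe_log`) are made up for this illustration; this is not the API of {maybe} or {chronicler}.

```r
# Hypothetical helper names, for illustration only.

# Data type: a tagged list that either holds a value or holds nothing.
just <- function(x) structure(list(value = x), class = c("just", "maybe"))
nothing <- function() structure(list(), class = c("nothing", "maybe"))

# Wrapping function (`return` / `pure`): lift a plain value into the monad.
unit <- just

# Composition operator (`bind`, `>>=`): pass the wrapped value to a function
# that itself returns a maybe, and short-circuit on nothing.
bind <- function(m, f) {
  if (inherits(m, "nothing")) m else f(m$value)
}
`%>>=%` <- bind

# Optional unwrapping function, with a fallback for the empty case.
unwrap <- function(m, default = NA) {
  if (inherits(m, "nothing")) default else m$value
}

# Functions that use the monad type: plain input, maybe output.
safe_sqrt <- function(x) if (x < 0) nothing() else just(sqrt(x))
safe_log <- function(x) if (x <= 0) nothing() else just(log(x))

ok <- unit(16) %>>=% safe_sqrt %>>=% safe_log
bad <- unit(-1) %>>=% safe_sqrt %>>=% safe_log

unwrap(ok)   # same as log(4)
unwrap(bad)  # NA, the default
```

The second chain never calls `safe_log()`: once `safe_sqrt()` returns `nothing()`, `bind` passes it along untouched.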
So there’s the base R realm, where there are plain functions, and we chain
computations (i.e. compose functions) with the pipe operator (`|>`, `%>%`), or
with parentheses. And there are the various monad realms: Maybe, List,
Logger/Writer, Either, Promise, where we use the bind operator for the same
purpose. Once we enter such a monad realm, the pipe operator and usual
composition won’t work until we return to the base world. This is because
monadic functions take plain types as input and produce monad types, so you
can’t simply pipe their output into the next function.
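A small sketch of where the pipe breaks down, using a toy logger realm. The helpers `with_log()`, `bind()`, `log_sqrt()` and `log_double()` are hypothetical names invented for this example, and the native pipe assumes R 4.1 or later.

```r
# A toy "logger" realm: each value travels with a log of what happened to it.
with_log <- function(value, log = character()) list(value = value, log = log)

# Monadic functions: plain number in, wrapped value out.
log_sqrt <- function(x) with_log(sqrt(x), "took sqrt")
log_double <- function(x) with_log(x * 2, "doubled")

# Bind unwraps the value, applies the function, and merges the logs.
bind <- function(m, f) {
  out <- f(m$value)
  with_log(out$value, c(m$log, out$log))
}

# The pipe hands the whole wrapper to log_double(), which expects a plain
# number, so the second step fails.
piped <- tryCatch(
  16 |> log_sqrt() |> log_double(),
  error = function(e) "error"
)

# Bind keeps the chain going inside the realm.
bound <- with_log(16) |> bind(log_sqrt) |> bind(log_double)
bound$value  # 8
bound$log    # "took sqrt" "doubled"
```

The pipe still has a role here, but only to chain calls to `bind()`; the actual composition of the monadic functions happens inside `bind()`.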
In most other programming languages there are built-in, basic data types: Int,
Char, etc., and the various monads are derived from these. R is not a
conventional language in this respect, as it doesn’t have scalar types, only
vectors! The expression `a <- 3` actually assigns a vector of length 1 to
variable `a`, not a scalar double. This means that in R, the base realm is
already a monad. Here, it’s called a vector, the equivalent of a list in other
languages.
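This is easy to check in the console. The sketch below also shows what a bind for vectors could look like; `vec_bind()` and `neighbours()` are hypothetical names for this illustration, not base R functions.

```r
a <- 3
length(a)     # 1
is.vector(a)  # TRUE
typeof(a)     # "double"

# Wrapping is implicit: every literal is already a vector.
# A possible bind for vectors: apply a vector-returning function to each
# element and flatten the results into one vector.
vec_bind <- function(x, f) unlist(lapply(x, f))

neighbours <- function(n) c(n - 1, n + 1)

vec_bind(c(10, 20), neighbours)  # 9 11 19 21
vec_bind(a, neighbours)          # 2 4
```

In practice vectorised functions make an explicit bind like this mostly unnecessary, which is part of why the monad stays hidden.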
Not only that, we’re actually dealing with a monad combination, since nearly
every atomic vector type can have missing values (raw vectors are the
exception)! I think everyone is familiar with `NA_character_`, `NA_integer_`
et al. So “Hello world!” would be a String type in other languages, but in R
it’s basically `[ Maybe String ]` by default: a vector of optional strings.
For numeric data types, `NaN` is added to this, which can be simply
interpreted as `NA` (`is.na(NaN)` is `TRUE`), but differs from it, so its type
could be written as `[ NaN | NA | Numeric ]`.
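A short base R sketch of this built-in Maybe behaviour; the variable names are arbitrary.

```r
# A character vector with a missing element: the NA here is NA_character_.
x <- c("Hello world!", NA)
typeof(x)   # "character"
is.na(x)    # FALSE TRUE

# Missingness propagates through vectorised functions without an explicit bind.
toupper(x)        # "HELLO WORLD!" NA
c(1L, NA) + 1L    # 2 NA

# For doubles, NaN sits next to NA: is.na() catches both, is.nan() only NaN.
y <- c(1, NA, 0 / 0)
is.na(y)    # FALSE TRUE TRUE
is.nan(y)   # FALSE FALSE TRUE
```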
In light of this, maybe there isn’t much point in implementing List and Maybe monads, since these are present by default, just in a slightly unusual way: not as monads, but in the form of basic data types.
Now that I’m enlightened, I am wondering if one could make use of the fact that we are already in a monad realm. Maybe we could simplify our code, maybe not. That should be a topic of another blog post.