Monads everywhR

R on Biofunctor

14 hours ago

[This article was first published on R on Biofunctor, and kindly contributed to R-bloggers].


A monad is a very useful pattern. If you don’t know what it is, then read my post from a few years ago, watch this excellent video, and you can also read J. Carroll’s insightful post. Once you think you get the concept, this blog post will be much more enjoyable. That’s because I’d like to focus on why monads don’t seem to want to spread in the R language.

After all, R is a functional programming language, and it would be natural for monads to be part of the basic toolkit, as they are in Haskell or, in the form of Option and Result, in Rust. Yet this isn’t the case. There have been attempts, and some are ongoing: the {chronicler} and {maybe} packages, for example, and perhaps others. But I’ve never seen anyone use them in production, at least not in my field, bioinformatics. In this post, I’d like to think through why that is.

What does a monad need?

A data type that stores “plain” types (duh…)
A function that wraps data into the monad (sometimes called return or pure)
A composition operator (bind, >>=) that allows us to chain smaller steps into a process even in the monad realm
Optionally, an unwrapping function that extracts values from the container
It’s also useful to have functions that use the monad data type. 🙂
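The ingredients listed above can be sketched in a few lines of base R. This is only an illustration, not how {maybe} or {chronicler} implement things: the names `just`, `nothing`, `pure`, `bind`, `%>>=%`, `from_maybe` and `safe_sqrt` are all made up for this example.

```r
# A hypothetical, minimal Maybe monad in base R (all names are illustrative).

# 1. Data type: an S3-tagged list holding either a plain value or nothing.
just    <- function(x) structure(list(value = x, is_nothing = FALSE), class = "maybe")
nothing <- function()  structure(list(value = NULL, is_nothing = TRUE), class = "maybe")

# 2. Wrapping function (return / pure).
pure <- just

# 3. Bind: hand the wrapped value to a monadic function f (plain -> maybe),
#    short-circuiting as soon as there is nothing to hand over.
bind <- function(m, f) if (m$is_nothing) m else f(m$value)
`%>>=%` <- bind

# 4. Optional unwrapping, with a default for the nothing case.
from_maybe <- function(m, default) if (m$is_nothing) default else m$value

# 5. A function that uses the type: a square root that refuses negative input.
safe_sqrt <- function(x) if (x < 0) nothing() else just(sqrt(x))

result <- pure(16) %>>=% safe_sqrt %>>=% safe_sqrt
from_maybe(result, NA_real_)   # 2

failed <- pure(-1) %>>=% safe_sqrt %>>=% safe_sqrt
from_maybe(failed, NA_real_)   # NA
```

The second chain never calls `sqrt()` at all: the first `safe_sqrt` returns `nothing()`, and `bind` passes it through untouched.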

So there’s the base R realm, where we have plain functions and chain computations (i.e. compose functions) with the pipe operator (|>, %>%) or with nested parentheses. And there are the various monad realms: Maybe, List, Logger/Writer, Either, Promise, where we use the bind operator for the same purpose. Once we enter such a monad realm, the pipe operator and the usual composition won’t work until we return to the base world. This is because a monadic function takes a plain value as input but returns a monadic value, so its output can’t simply be piped into the next function, which again expects a plain value.
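Here is a small sketch of that mismatch, assuming R 4.1 or later for the native pipe. The Maybe helpers and the functions `safe_log` and `safe_sqrt` are hypothetical and defined only for this example.

```r
# Hypothetical Maybe helpers (illustrative names, not taken from any package).
just    <- function(x) structure(list(value = x, is_nothing = FALSE), class = "maybe")
nothing <- function()  structure(list(value = NULL, is_nothing = TRUE), class = "maybe")
bind    <- function(m, f) if (m$is_nothing) m else f(m$value)

# Monadic functions: plain number in, maybe out.
safe_log <- function(x) {
  stopifnot(is.numeric(x))
  if (x <= 0) nothing() else just(log(x))
}
safe_sqrt <- function(x) {
  stopifnot(is.numeric(x))
  if (x < 0) nothing() else just(sqrt(x))
}

# Base realm: plain functions compose with the pipe.
plain <- 100 |> log() |> sqrt()

# Monad realm: safe_log() returns a maybe, and safe_sqrt() expects a number,
# so piping one into the other fails.
piped <- try(100 |> safe_log() |> safe_sqrt(), silent = TRUE)
inherits(piped, "try-error")   # TRUE

# bind unwraps the value before handing it to the next step.
bound <- just(100) |> bind(safe_log) |> bind(safe_sqrt)
all.equal(bound$value, plain)  # TRUE
```

Note that the pipe still works on the outside, because `bind()` itself is an ordinary function; what changes is that every step has to go through it.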

In most other programming languages there are built-in, basic data types: Int, Char, etc., and the various monads are derived from these. R is not a conventional language in this respect, as it doesn’t have scalar types, only vectors! The expression a <- 3 actually assigns a double vector of length 1 to the variable a, not a scalar double. This means that in R, the base realm is already a monad. R calls it a vector, and it plays the role that List plays in other languages.
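This is easy to check at the console. The snippet below also sketches what a List-style bind could look like on atomic vectors; `bind_vec` is a made-up helper for illustration, not a base R function.

```r
a <- 3

is.vector(a)        # TRUE
length(a)           # 1
typeof(a)           # "double"
identical(a, c(3))  # TRUE: c() plays the role of the wrapping function
a[1]                # indexing works, because a is a vector

# A hypothetical List-monad bind for atomic vectors:
# apply f (one value -> vector) to each element and flatten the results.
bind_vec <- function(v, f) unlist(lapply(v, f))

expanded <- bind_vec(c(1, 2, 3), function(x) c(x, x * 10))
expanded            # 1 10 2 20 3 30
```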

Not only that, we’re actually dealing with a combination of monads, since every vector type can have missing values! I think everyone is familiar with NA_character_, NA_integer_ et al. So “Hello world!” would be a String type in other languages, but in R it’s basically [ Maybe String ] by default: a vector of optional strings. Numeric types additionally have NaN. It is treated as missing by is.na(), yet it is a distinct value from NA, so the type could be written as [ NaN | NA | Numeric ].
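A short base R session makes the typed missing values, and the difference between NA and NaN, visible:

```r
# Every atomic vector type has its own missing value.
typeof(NA)             # "logical"
typeof(NA_integer_)    # "integer"
typeof(NA_real_)       # "double"
typeof(NA_character_)  # "character"

# A character vector is a vector of optional strings.
greeting <- c("Hello world!", NA)
typeof(greeting)       # "character"
is.na(greeting)        # FALSE TRUE

# For doubles, NaN counts as missing for is.na(), but NA is not NaN.
x <- c(1, NA, NaN)
is.na(x)               # FALSE TRUE TRUE
is.nan(x)              # FALSE FALSE TRUE
```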

In light of this, maybe there isn’t much point in implementing List and Maybe monads, since both are present by default, just in a slightly unusual way: not as monads, but in the form of basic data types.
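As a rough illustration of that point, vectorized base R functions already behave a bit like a built-in bind: they map over the vector and pass missing values through. The analogy is loose, and the variable names below are arbitrary.

```r
x <- c(4, NA, 9)

# Vectorized functions map over the "list" and propagate the "nothing".
roots   <- sqrt(x)   # 2 NA 3
shifted <- x + 1     # 5 NA 10

# Leaving the Maybe layer means choosing what to do with the missing values.
filled <- ifelse(is.na(x), 0, x)      # 4 0 9
avg    <- mean(x, na.rm = TRUE)       # 6.5
mean(x)                               # NA: summaries propagate NA by default
```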

Now that I’m enlightened, I wonder whether one could make use of the fact that we are already in a monad realm. Maybe we could simplify our code, maybe not. That is a topic for another blog post.
