# Taking from Infinite Sequences

**rstats on Irregularly Scheduled Programming**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

One thing that has really caught my attention as I learn more programming
languages is the idea of *generators* or infinite sequences of values. Yes, *infinite*.
Coming from R, that seems unlikely, but in at least several other languages,
it’s entirely possible thanks to iterators and lazy evaluation.

I saw this video which solves a codewars challenge using an infinite list, which references this one on the same topic.

First, a diversion into recursion

In Haskell, one of the first exercises after “Hello, World!” people discover is the Fibonacci sequence, where the \(n^{\rm th}\) value is given by the sum of the two previous values. As a function, this can be written as

λ> fib 0 = 0 λ> fib 1 = 1 λ> fib n = fib (n-1) + fib (n-2)

Essentially, `fib(0)`

returns `0`

. `fib(1)`

returns `1`

, and for any value `n`

it
returns the (recursively defined) sum of the two previous values. This isn’t infinite
at all…

\[ \begin{align*} {\rm fib}(4) &= {\rm fib}(3) + {\rm fib}(2)\\ &= ({\rm fib}(2) + {\rm fib}(1)) + ({\rm fib}(1) + {\rm fib}(0))\\ &= {\rm fib}(1) + {\rm fib}(0) + {\rm fib}(1) + {\rm fib}(1) + {\rm fib}(0)\\ &= 1 + 0 + 1 + 1 + 0\\ &= 3 \end{align*} \]

We could write that in R as

fib <- function(n) { if (n == 0) return(0) if (n == 1) return(1) fib(n - 1) + fib(n - 2) } fib(4) ## [1] 3 fib(8) ## [1] 21

This may come as a surprise to some - the function is defined *in terms of itself* and
some base cases. This is entirely fine in R and Haskell as they’re lazily evaluated -
nothing happens until a value is actually used. In R, this means that if an argument
to a function isn’t used, it’s not evaluated at all

loud_func <- function() { cat("HELLLOOOOO!!!\n") } stays_quiet <- function(f, g = loud_func()) { f(c(1, 2, 3, 4)) } stays_quiet(f = mean) ## [1] 2.5

Notice that although the argument `g`

is the *invocation* of `loud_func()`

, it’s never
evaluated because we don’t use it. If we did use it…

noisy <- function(f, g = loud_func()) { g f(c(1, 2, 3, 4)) } noisy(sum) ## HELLLOOOOO!!! ## [1] 10

So, in these languages, we can define a function recursively. Given the base case(s), these will eventually return just a number, so the computation will complete.

Instead of just adding the values together, we can create a sequence of values by concatenating the iterations together. Starting with data and working down to a base case is called “recursion”, while starting from a base case and building up a data structure is “corecursion”.

If we want a sequence of values that represents the Fibonacci numbers, we can use

fibs <- function(n) { if (n == 0) return(0) if (n == 1) return(1) prev <- fibs(n - 1) c(prev, sum(tail(prev, 2))) } fibs(10) ## [1] 1 1 2 3 5 8 13 21 34 55

What’s blown my mind recently is the concept of *infinite* data structures. If I
defined some function that, instead of working *down* to a base case, just kept
expanding, say, by concatenating with a *larger* number (corecursion), such as

inf_series <- function(x) { c(x, inf_series(x + 1)) }

Then, if I tried to evaluate `inf_series(5)`

this would produce

c(5, inf_series(6))

which would expand to

c(5, 6, inf_series(7))

and so on… forever. Of course, R can’t keep going forever

inf_series(5) Error: C stack usage 7971732 is too close to the limit

This error comes about because each function in R is a “closure” which “encloses” its environment. The way R keeps track of that (and where it needs to return after returning from a function) is by adding “stack frames” each time it dives deeper into a function calling a function. We’ve asked it to add infinity of these, so at some point it says “too many”.

Okay, so, not possible, right?

In Haskell I can define

λ> fibs = 0 : scanl (+) 1 fibs

which is a concatenation (`:`

) of `0`

with the result of `scanl (+) 1 fibs`

. Note carefully,
this *isn’t* a function - it’s a vector of values defined recursively 🤯

To explain the definition a little more: `scanl`

is similar to `reduce`

in that it takes a starting value, a vector, and a
binary operator, but rather than reducing the vector to a value, it creates a new
vector with the successively reduced values. In this example, the values `1..5`

are
successively added (`+`

) to `0`

, so the second entry is `0+1=1`

, the next is `1+2=3`

, the
next is `3+3=6`

, then `6+4=10`

, then `10+5=15`

λ> scanl (+) 0 [1..5] [0,1,3,6,10,15]

In the Fibonacci case, the operator is still addition, but the starting value is `1`

and the vector is … the entire vector we’re defining. Writing out some of the
terms makes this easier to understand

λ> scanl (+) 1 [0, 1, 1, 2, 3, 5, 8] [1,1,2,3,5,8,13,21]

The first terms in the sequence, after concatenating with `0`

will be

λ> [0, 1, 1, 2, 3, 5, 8]

so, by defining `fibs`

in *terms of* `fibs`

, the sequence can go on forever. So,
what if you try to print this? In `GHCI`

, the output will just stream forever, which
isn’t particularly useful. Instead, we can ask for some number of values, say, ten

λ> take 10 fibs [0,1,1,2,3,5,8,13,21,34]

Due to the laziness of Haskell, nothing is computed until it’s needed, so asking for any number of values is fast, despite the list being “infinite”

λ> take 100 fibs [0,1,1,2,3,5,8,13,21,34,55,89,144,233,377,610,987,1597,2584,4181,6765,10946, 17711,28657,46368,75025,121393,196418,317811,514229,832040,1346269,2178309, 3524578,5702887,9227465,14930352,24157817,39088169,63245986,102334155, 165580141,267914296,433494437,701408733,1134903170,1836311903,2971215073, 4807526976,7778742049,12586269025,20365011074,32951280099,53316291173, 86267571272,139583862445,225851433717,365435296162,591286729879,956722026041, 1548008755920,2504730781961,4052739537881,6557470319842,10610209857723, 17167680177565,27777890035288,44945570212853,72723460248141,117669030460994, 190392490709135,308061521170129,498454011879264,806515533049393, 1304969544928657,2111485077978050,3416454622906707,5527939700884757, 8944394323791464,14472334024676221,23416728348467685,37889062373143906, 61305790721611591,99194853094755497,160500643816367088,259695496911122585, 420196140727489673,679891637638612258,1100087778366101931,1779979416004714189, 2880067194370816120,4660046610375530309,7540113804746346429,12200160415121876738, 19740274219868223167,31940434634990099905,51680708854858323072, 83621143489848422977,135301852344706746049,218922995834555169026]

The idea of taking some values from an iterator shows up in other languages.

In Rust, I can create an infinite iterator of the value `1`

use std::iter; let ones = iter::repeat(1);

and I can take some number of these, say, five, collected into a vector

ones.take(5).collect::<Vec<_>>() [1, 1, 1, 1, 1]

Python also has a notion of infinite sequences, and they’re likely even more common.

When you work with a (regular, finite) list of values in a range, you get a `range`

object

numbers = range(1, 9) numbers ## range(1, 9) type(numbers) ## <class 'range'>

You can convert this to a regular list with

list(numbers) ## [1, 2, 3, 4, 5, 6, 7, 8] type(list(numbers)) ## <class 'list'>

If you `filter`

the numbers (which works on a `range`

or a `list`

) you get a `filter`

object

even_numbers = filter(lambda x: x % 2 == 0, numbers) even_numbers ## <filter object at 0x7f5ac61da470> type(even_numbers) ## <class 'filter'>

which you can also convert to a list to see the values

list(even_numbers) ## [2, 4, 6, 8]

But, you can use this as an *iterable*, so you can get the ‘next’ value as
many times as you need (defined here again to restart the iterator)

even_numbers = filter(lambda x: x % 2 == 0, numbers) next(even_numbers) ## 2 next(even_numbers) ## 4 next(even_numbers) ## 6 next(even_numbers) ## 8

until there’s none left

next(even_numbers) ## Error in py_call_impl(callable, dots$args, dots$keywords): StopIteration

As before, you can convert these to a fixed-length list, if desired

list(filter(lambda x: x % 2 == 0, numbers)) ## [2, 4, 6, 8]

Still with me? Great. We can create an *infinite* list, if we want to, because
it isn’t evaluated until we ask for elements

def infinitenumbers(): count = 0 while True: yield count count += 1 nums = infinitenumbers() nums ## <generator object infinitenumbers at 0x7f5ac615d310> type(nums) ## <class 'generator'>

This is a *generator* which means it’s capable of generating values. We can ask
for as many as we want, now

next(nums) ## 0 next(nums) ## 1 next(nums) ## 2 next(nums) ## 3 next(nums) ## 4

If we want to convert some number of these to a list, we need a new function, roughly
the equivalent of Haskell’s `take`

, in order to extract these

from itertools import islice nums = infinitenumbers() list(islice(nums, 10)) ## [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

and as before, if we ask for more *now*, we get the next batch

list(islice(nums, 10)) ## [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

So, can we do this at all in R? We can’t build an infinite data structure *per se*,
but what if we use a recursive definition and just… stop when it’s recursed enough
times?

I initially thought about hacking into functions with `body()`

to keep track of this,
but we’ve already discussed something we can use - the stack frames! R keeps track of
how deep it’s recursed using these, and we can access that information with `sys.calls()`

,
the help for which describes

.GlobalEnv is given number 0 in the list of frames. Each subsequent function evaluation increases the frame stack by 1.

So, we can tell from inside a recursive function how deeply nested we currently are.

I wrote the following helper, named for the Haskell inspiration

take <- function(f, n, x = 1) { current_depth <- length(sys.calls()) # Subtract 1 to exclude the current call, # but add 1 to start at 1 # uncomment this line to watch the magic happen! # cat("Current Stack Depth: ", current_depth, "\n") if (current_depth >= n) { return(f(x)) } else { return(c(f(x), take(f, n, x + 1))) } }

this checks `length(sys.calls())`

which starts at 1 the first time it’s called, and adds
1 every time we go deeper. So long as we haven’t reached the requested depth, it
combines the passed-in function evaluated at \(x + i\) with a new evaluation one level deeper.

When that reaches the requested depth, it returns the evaluated function, bubbling up the returned values so that we end up with a vector of \(n\) values

\[ [f(x), f(x+1), f(x+2), f(x+3), \dots, f(x+n-1)] \]

Neat idea, but does it work? We can’t pass in an actual infinite data structure,
but we *can* pass a function that defines one

A (trivial) function that produces a number at each value of `x`

is

numbers <- function(x) { x }

If we pretend that’s a generator for every number, we can “take” some values from it

take(numbers, 5) [1] 1 2 3 4 5 take(numbers, 10) [1] 1 2 3 4 5 6 7 8 9 10

A more complicated recipe for an infinite list of numbers could be

g <- function(x) { x + 1 } take(g, 7) [1] 2 3 4 5 6 7 8 take(g, 10) [1] 2 3 4 5 6 7 8 9 10 11

Dealing with non-sequential numbers might be trickier… what if we want all the even numbers?

evens <- function(x) { if(x %% 2 == 0) x else NULL } take(evens, 10) [1] 2 4 6 8 10

No, that only checks the first 10 numbers. Instead,

evens <- function(x) { x * 2 } take(evens, 7) [1] 2 4 6 8 10 12 14 take(evens, 10) [1] 2 4 6 8 10 12 14 16 18 20

What about our original example?

fibs <- function(x) { if (x == 0) return(0) if (x == 1) return(1) fibs(x - 1) + fibs(x - 2) } # 10 values, starting at 0 take(fibs, 10, x = 0) [1] 0 1 1 2 3 5 8 13 21 34 # 10 values, starting at 1 take(fibs, 10, x = 1) [1] 1 1 2 3 5 8 13 21 34 55

Or, if we want 12 values, starting at the 10th

take(fibs, 12, x = 10) [1] 55 89 144 233 377 610 987 1597 2584 4181 6765 10946

I’d say that’s working quite nicely!!!

Some caveats to keep in mind, though…

Since we’re relying on a count of stack frames on top of `.GlobalEnv`

, this `take()`

implementation won’t work nicely inside another function. In fact, since
{knitr} is already a few functions deep, it also doesn’t work in an Rmd file (including
this blog which is Rmd via {blogdown}). Not for use in production, but a fun
exercise to figure it out at all.

Is there a better way to achieve this `take()`

functionality? Where do you use
infinite iterators/generators in R or another language? Spot an improvement that
I can make? I can be found on Mastodon or
use the comments below.

##
`devtools::session_info()`

## ─ Session info ─────────────────────────────────────────────────────────────── ## setting value ## version R version 4.1.2 (2021-11-01) ## os Pop!_OS 22.04 LTS ## system x86_64, linux-gnu ## ui X11 ## language (EN) ## collate en_AU.UTF-8 ## ctype en_AU.UTF-8 ## tz Australia/Adelaide ## date 2023-08-18 ## pandoc 3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown) ## ## ─ Packages ─────────────────────────────────────────────────────────────────── ## package * version date (UTC) lib source ## blogdown 1.17 2023-05-16 [1] CRAN (R 4.1.2) ## bookdown 0.29 2022-09-12 [1] CRAN (R 4.1.2) ## bslib 0.5.0 2023-06-09 [3] CRAN (R 4.3.1) ## cachem 1.0.8 2023-05-01 [3] CRAN (R 4.3.0) ## callr 3.7.3 2022-11-02 [3] CRAN (R 4.2.2) ## cli 3.6.1 2023-03-23 [3] CRAN (R 4.2.3) ## crayon 1.5.2 2022-09-29 [3] CRAN (R 4.2.1) ## devtools 2.4.5 2022-10-11 [1] CRAN (R 4.1.2) ## digest 0.6.33 2023-07-07 [3] CRAN (R 4.3.1) ## ellipsis 0.3.2 2021-04-29 [3] CRAN (R 4.1.1) ## evaluate 0.21 2023-05-05 [3] CRAN (R 4.3.0) ## fastmap 1.1.1 2023-02-24 [3] CRAN (R 4.2.2) ## fs 1.6.3 2023-07-20 [3] CRAN (R 4.3.1) ## glue 1.6.2 2022-02-24 [3] CRAN (R 4.2.0) ## here 1.0.1 2020-12-13 [1] CRAN (R 4.1.2) ## htmltools 0.5.6 2023-08-10 [3] CRAN (R 4.3.1) ## htmlwidgets 1.5.4 2021-09-08 [1] CRAN (R 4.1.2) ## httpuv 1.6.6 2022-09-08 [1] CRAN (R 4.1.2) ## jquerylib 0.1.4 2021-04-26 [3] CRAN (R 4.1.2) ## jsonlite 1.8.7 2023-06-29 [3] CRAN (R 4.3.1) ## knitr 1.43 2023-05-25 [3] CRAN (R 4.3.0) ## later 1.3.0 2021-08-18 [1] CRAN (R 4.1.2) ## lattice 0.21-8 2023-04-05 [4] CRAN (R 4.3.0) ## lifecycle 1.0.3 2022-10-07 [3] CRAN (R 4.2.1) ## magrittr 2.0.3 2022-03-30 [3] CRAN (R 4.2.0) ## Matrix 1.6-0 2023-07-08 [4] CRAN (R 4.3.1) ## memoise 2.0.1 2021-11-26 [3] CRAN (R 4.2.0) ## mime 0.12 2021-09-28 [3] CRAN (R 4.2.0) ## miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.1.2) ## pkgbuild 1.4.0 2022-11-27 [1] CRAN (R 4.1.2) ## pkgload 1.3.0 2022-06-27 [1] CRAN (R 4.1.2) ## png 0.1-7 2013-12-03 [1] CRAN (R 4.1.2) ## prettyunits 1.1.1 2020-01-24 [3] CRAN (R 4.0.1) ## processx 3.8.2 2023-06-30 [3] CRAN (R 4.3.1) ## profvis 0.3.7 2020-11-02 [1] CRAN (R 4.1.2) ## promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.1.2) ## ps 1.7.5 2023-04-18 [3] CRAN (R 4.3.0) ## purrr 1.0.1 2023-01-10 [1] CRAN (R 4.1.2) ## R6 2.5.1 2021-08-19 [3] CRAN (R 4.2.0) ## Rcpp 1.0.9 2022-07-08 [1] CRAN (R 4.1.2) ## remotes 2.4.2 2021-11-30 [1] CRAN (R 4.1.2) ## reticulate 1.26 2022-08-31 [1] CRAN (R 4.1.2) ## rlang 1.1.1 2023-04-28 [1] CRAN (R 4.1.2) ## rmarkdown 2.23 2023-07-01 [3] CRAN (R 4.3.1) ## rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.1.2) ## rstudioapi 0.15.0 2023-07-07 [3] CRAN (R 4.3.1) ## sass 0.4.7 2023-07-15 [3] CRAN (R 4.3.1) ## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.2) ## shiny 1.7.2 2022-07-19 [1] CRAN (R 4.1.2) ## stringi 1.7.12 2023-01-11 [3] CRAN (R 4.2.2) ## stringr 1.5.0 2022-12-02 [1] CRAN (R 4.1.2) ## urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.1.2) ## usethis 2.1.6 2022-05-25 [1] CRAN (R 4.1.2) ## vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.1.2) ## xfun 0.40 2023-08-09 [3] CRAN (R 4.3.1) ## xtable 1.8-4 2019-04-21 [1] CRAN (R 4.1.2) ## yaml 2.3.7 2023-01-23 [3] CRAN (R 4.2.2) ## ## [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.1 ## [2] /usr/local/lib/R/site-library ## [3] /usr/lib/R/site-library ## [4] /usr/lib/R/library ## ## ─ Python configuration ─────────────────────────────────────────────────────── ## python: /usr/bin/python3 ## libpython: /usr/lib/python3.10/config-3.10-x86_64-linux-gnu/libpython3.10.so ## pythonhome: //usr://usr ## version: 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] ## numpy: /home/jono/.local/lib/python3.10/site-packages/numpy ## numpy_version: 1.24.1 ## ## NOTE: Python version was forced by RETICULATE_PYTHON_FALLBACK ## ## ──────────────────────────────────────────────────────────────────────────────

**leave a comment**for the author, please follow the link and comment on their blog:

**rstats on Irregularly Scheduled Programming**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.