Creating List with Iterator

[This article was first published on S+/R – Yet Another Blog in Statistical Computing, and kindly contributed to R-bloggers.]

In the post (https://statcompute.wordpress.com/2018/11/17/growing-list-vs-growing-queue), it is shown how to grow a list or a list-like queue from a dataframe. The code snippet in that example relied heavily on a FOR loop to do the assignment item by item, which made me wonder about potential alternatives afterwards. For instance, is there an implementation that would let us traverse a dataframe without knowing its dimensions in advance, or even without using a loop?

In the previous example, if we want to fetch rows from a dataframe, we need to know the number of rows in advance by calling the nrow() function. As shown below, we generate a sequence of row indices and then fetch the rows by indexing.


lapply(seq_len(nrow(iris)), function(idx) as.list(iris[idx, ]))
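
To make the shape of the result concrete, here is a small base-R sketch that inspects the list built by the indexing approach. The name `rows` is introduced here for illustration only, and seq_len() is used as a defensive way to build the row index.

```r
# Base R only; `rows` is an illustrative name, not part of the original post.
rows <- lapply(seq_len(nrow(iris)), function(idx) as.list(iris[idx, ]))

length(rows)        # one list element per row of iris
names(rows[[1]])    # column names are carried over as list names
rows[[1]]$Species   # still a factor, keeping all of its levels
```

Each element is a plain named list with one entry per column, which is the structure the later approaches are expected to reproduce.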

If we would rather not fetch rows from a dataframe by indexing, a workaround is to split the dataframe into rows with the split() function. The additional unname() call does nothing but remove the redundant list names. However, this solution still requires knowing the number of rows.


unname(lapply(split(iris, seq_len(nrow(iris))), function(row) as.list(row)))
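
The role of unname() can be checked with a short base-R sketch. The names `by_index` and `by_split` are illustrative and do not appear in the original post.

```r
# Base R only; `by_index` and `by_split` are illustrative names.
by_index <- lapply(seq_len(nrow(iris)), function(idx) as.list(iris[idx, ]))
by_split <- lapply(split(iris, seq_len(nrow(iris))), function(row) as.list(row))

head(names(by_split), 3)               # names come from the levels of the split index
is.null(names(by_index))               # the indexing approach produces no names
identical(unname(by_split), by_index)  # compare the two once the names are dropped
```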

With the iterators package, the coding logic can be slightly cleaner and more generic by wrapping the dataframe into a row-wise iterator object, as demonstrated below.


lapply(iterators::iter(iris, by = 'row'), function(row) as.list(row))
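
To see what the iterator does under the hood, the sketch below pulls rows one at a time with nextElem() and never asks for the row count up front. It assumes the iterators package is installed. The variable names and the choice to treat any error from nextElem() as the end of the data are simplifications made for this illustration.

```r
library(iterators)

it <- iter(iris, by = "row")   # row-wise iterator over the data frame

first  <- nextElem(it)         # a one-row data frame
second <- nextElem(it)         # the iterator keeps track of its position

# Drain the remaining rows. nextElem() raises an error once the iterator is
# exhausted; this sketch treats any error as that stop signal.
n_rest <- 0
repeat {
  row <- tryCatch(nextElem(it), error = function(e) NULL)
  if (is.null(row)) break
  n_rest <- n_rest + 1
}

n_rest + 2 == nrow(iris)       # every row was visited without calling nrow() first
```

The explicit loop is only for demonstration; lapply() hides the same stepping when it is handed the iterator directly.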

In addition, the iterator object is customizable. For instance, we can easily apply a filter function to the iterator.


lapply(iterators::iter(iris, by = 'row', checkFunc = function(x) x$Species == "setosa" & x$Petal.Width > 0.4), function(x) as.list(x))
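
One way to sanity-check the filtered iterator is to compare it with an ordinary base-R subset. This is a sketch that assumes the iterators package is installed; `keep`, `filtered`, `sub`, and `expected` are illustrative names.

```r
# `keep` is the same predicate as in the post, named here for reuse.
keep <- function(x) x$Species == "setosa" & x$Petal.Width > 0.4

filtered <- lapply(iterators::iter(iris, by = "row", checkFunc = keep),
                   function(x) as.list(x))

# Base R counterpart: subset first, then convert row by row.
sub <- iris[iris$Species == "setosa" & iris$Petal.Width > 0.4, ]
expected <- lapply(seq_len(nrow(sub)), function(i) as.list(sub[i, ]))

length(filtered) == nrow(sub)   # same number of rows survive the filter
identical(filtered, expected)   # compare the contents element by element
```

The iterator version applies the predicate one row at a time, so the full set of matching rows does not have to be materialized before conversion.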

If the use case is not creating a list, as discussed above, but growing an empty list by insertion, then a simple iterator might not be sufficient. In such a case, we might need to tweak it a little by enumerating the iterator with the ienum() function in the itertools2 package. Alternatively, we can use the itertools2::izip() function to construct the enumeration manually. Note that, because the function called within lapply() assigns values to a list that lives in the parent environment, the superassignment operator (<<-) should be used instead of the regular assignment.


# l1 is created in the calling environment, then filled by index with <<-
with(l1 <- list(), 
     invisible(lapply(itertools2::ienum(iterators::iter(iris, by = 'row')),
                      function(x) l1[[x$index]] <<- as.list(x$value))))

### CHECK THE EQUALITY ###
identical(l1, lapply(iterators::iter(iris, by = 'row'), function(row) as.list(row)))
# TRUE

with(l2 <- list(), 
     invisible(lapply(itertools2::izip(i = itertools2::icount(start = 1),
                                       v = iterators::iter(iris, by = 'row')),
                      function(x) l2[[x$i]] <<- as.list(x$v))))

### CHECK THE EQUALITY ###
identical(l2, lapply(iterators::iter(iris, by = 'row'), function(row) as.list(row)))
# TRUE
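
The need for the superassignment operator in the snippets above can be illustrated with a minimal base-R sketch that needs no extra packages. The names `grown_local` and `grown_super` are made up for this example.

```r
# With `<-`, each function call modifies its own local copy of the list.
grown_local <- list()
invisible(lapply(seq_len(3), function(i) grown_local[[i]] <- i * 10))
length(grown_local)   # the outer list is untouched and still empty

# With `<<-`, the assignment reaches the list in the enclosing environment.
grown_super <- list()
invisible(lapply(seq_len(3), function(i) grown_super[[i]] <<- i * 10))
length(grown_super)   # the outer list now holds the three inserted values
```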
