Experiments with by_row()

[This article was first published on R – Jocelyn Ireson-Paine's Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I’ve been experimenting with by_row() from the R purrrlyr package. It’s a function that maps over the rows of a data frame, which I thought might be handy for processing our economic model’s parameter files. I didn’t find the documentation for by_row() told me everything I wanted to know, so I made up and ran some examples. As these might be useful to others, I’m blogging them here.

purrrlyr was written mainly by Hadley Wickham. Regular readers of my posts will know that Hadley wrote the suite of programs known as the Tidyverse, which can be installed with the command install.packages("tidyverse"). However, purrrlyr is not quite part of this, and has to be installed separately, via install.packages("purrrlyr").

Why am I interested in by_rows()? Our economic model reads what we call parameter files. These are spreadsheets describing the taxes and benefits it has to apply, plus various controls. Each spreadsheet contains several worksheets, most of which start with three columns making up a hierarchical parameter name, followed by one or more values. Here’s a fragment from one such sheet:

        Name1              Name2    Name3 Allowance_Spouse PrivatePension BankAccount NationalSavings
1 IncomeLists      IncomeSupport                     1              1           0               0               
2 IncomeLists      PensionCredit     Main                1              1           0               0               
3 IncomeLists      PensionCredit  Savings                0              1           0               0               
4 IncomeLists NonContributoryESA   Income                1              1           0               0               
5 IncomeLists NonContributoryESA Earnings                0              0           0               0               
6 IncomeLists              HBCTB                     1              0           0               0               

And here’s a screen shot of it:

I need to process these sheets a row at a time, converting each row into an entry in a lookup table that maps parameter names to their values. Because by_row() maps over rows, it looked worth learning about. So the code below shows the calls I tried. In it, t is the tibble I passed to by_rows(), and tbr is the result of the calls. I’ve commented each call explaining what it tells me, and also listed the things I wanted to know. One was what the documentation meant by saying that the .collate argument makes by_row() collate by row or by column. “Collate” is a very general verb, and didn’t tell me exactly how by_row()‘s results were arranged. I was also puzzled by the .labels argument, in which if TRUE “the returned data frame is prepended with the labels of the slices (the
columns in .d used to define the slices)”.

# try_by_row.R
# Some experiments with 
# purrrlyr's by_row() function.
# This may be useful when 
# converting segments of tibble
# rows to lists that I can
# store in a single cell.

library( tidyverse )
library( purrrlyr )


To leave a comment for the author, please follow the link and comment on their blog: R – Jocelyn Ireson-Paine's Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)