Changing names in the tidyverse: An example for many regressions

Posted on March 8, 2017 by R on Abhijit Dasgupta in R bloggers | 0 Comments

[This article was first published on R on Abhijit Dasgupta, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

A collaborator posed an interesting R question to me today. She wanted to do several regressions using different outcomes, with models being computed on different strata defined by a combination of experimental design variables. She then just wanted to extract the p-values for the slopes for each of the models, and then filter the strata based on p-value levels.

This seems straighforward, right? Let’s set up a toy example:

library(tidyverse)

dat <- as_tibble(expand.grid(letters[1:4], 1:5))
d <- vector('list', nrow(dat))
set.seed(102)
for(i in 1:nrow(dat)){
  x <- rnorm(100)
  d[[i]] <- tibble(x = x, y1 = 3 - 2*x + rnorm(100), y2 = -4+5*x+rnorm(100))
}
dat <- as_tibble(bind_cols(dat, tibble(dat=d))) %>% unnest()
knitr::kable(head(dat), digits=3, format='html')

Var1	Var2	x	y1	y2
a	1	0.181	4.260	-3.005
a	1	0.785	0.002	-2.105
a	1	-1.353	3.171	-9.157
a	1	1.983	-0.714	5.966
a	1	1.238	0.352	2.131
a	1	1.201	0.627	1.752

Now we’re going to perform two regressions, one using y1 and one using y2 as the dependent variables, for each stratum defined by Var1 and Var2.

out <- dat %>%
  nest(-Var1, -Var2) %>%
  mutate(model1 = map(data, ~lm(y1~x, data=.)),
         model2 = map(data, ~lm(y2~x, data=.)))

Now conceptually, all we do is tidy up the output for the models using the broom package, filter on the rows containg the slope information, and extract the p-values, right? Not quite….

library(broom)
out_problem <- out %>% mutate(output1 = map(model1, ~tidy(.)),
                              output2 = map(model2, ~tidy(.))) %>%
  select(-data, -model1, -model2) %>%
  unnest()
names(out_problem)
##  [1] "Var1"       "Var2"       "term"       "estimate"   "std.error" 
##  [6] "statistic"  "p.value"    "term1"      "estimate1"  "std.error1"
## [11] "statistic1" "p.value1"

We’ve got two sets of output, but with the same column names!!! This is a problem! An easy solution would be to preface the column names with the name of the response variable. I struggled with this today until I discovered the secret function.

out_nice <- out %>% mutate(output1 = map(model1, ~tidy(.)),
                           output2 = map(model2, ~tidy(.)),
                           output1 = map(output1, ~setNames(., paste('y1', names(.), sep='_'))),
                           output2 = map(output2, ~setNames(., paste('y2', names(.), sep='_')))) %>%
  select(-data, -model1, -model2) %>%
  unnest()

This is a compact representation of the results of both regressions by strata, and we can extract the information we would like very easily. For example, to extract the stratum-specific slope estimates:

out_nice %>% filter(y1_term=='x') %>%
  select(Var1, Var2, ends_with('estimate')) %>%
  knitr::kable(digits=3, format='html')

Var1	Var2	y1_estimate	y2_estimate
a	1	-1.897	5.036
b	1	-2.000	5.022
c	1	-1.988	4.888
d	1	-2.089	5.089
a	2	-2.052	5.015
b	2	-1.922	5.004
c	2	-1.936	4.969
d	2	-1.961	4.959
a	3	-2.043	5.017
b	3	-2.045	4.860
c	3	-1.996	5.009
d	3	-1.922	4.894
a	4	-2.000	4.942
b	4	-2.000	4.932
c	4	-2.033	5.042
d	4	-2.165	5.049
a	5	-2.094	5.010
b	5	-1.961	5.122
c	5	-2.106	5.153
d	5	-1.974	5.009

To leave a comment for the author, please follow the link and comment on their blog: R on Abhijit Dasgupta.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

Changing names in the tidyverse: An example for many regressions

Related

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)