# New formatting features in the parameters package

This article was first published on **R on easystats** and kindly contributed to R-bloggers.

You have probably already heard of the *parameters* package, a lightweight package to **extract, compute and explore the parameters of statistical models** using R (if not, there is a related publication introducing the package’s main features).

In this post, we would like to introduce a new feature that produces nicely rendered output in **markdown** or **HTML** format (including **PDFs**). This allows you to easily create pretty tables of model summaries for a large variety of models.

The *parameters* package, together with the *insight* package, provides the tools to format the layout and style of tables of model parameters. The easy way is to use the `model_parameters()` function, where you usually don’t have to take care of formatting and layout, at least not for simple purposes like printing to the console or inside rmarkdown documents. However, sometimes you may want to do the formatting steps manually. This blog post introduces the various functions that are used for formatting parameters tables.

## An Example Model

We start with a model that does not make much sense, but it is useful for demonstrating the formatting functions.

```r
data(iris)
iris$Petlen <- cut(iris$Petal.Length, breaks = c(0, 3, 7))
model <- lm(Sepal.Width ~ poly(Sepal.Length, 2) + Species + Petlen, data = iris)

summary(model)
##
## Call:
## lm(formula = Sepal.Width ~ poly(Sepal.Length, 2) + Species +
##     Petlen, data = iris)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -0.7742 -0.1490 -0.0056  0.1666  0.6973
##
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)
## (Intercept)              3.8127     0.0582   65.50  < 2e-16 ***
## poly(Sepal.Length, 2)1   4.0602     0.4668    8.70    7e-15 ***
## poly(Sepal.Length, 2)2  -1.3024     0.3149   -4.14    6e-05 ***
## Speciesversicolor       -1.0056     0.2781   -3.62  0.00041 ***
## Speciesvirginica        -0.9913     0.2851   -3.48  0.00067 ***
## Petlen(3,7]             -0.1360     0.2818   -0.48  0.63019
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.28 on 144 degrees of freedom
## Multiple R-squared:  0.615,  Adjusted R-squared:  0.602
## F-statistic:   46 on 5 and 144 DF,  p-value: <2e-16
```

## Formatting Parameter Names

As we can see, in such cases, the standard R output looks a bit cryptic, although all necessary and important information is included in the summary. The coefficients for polynomial transformations are difficult to read. Factors grouped with `cut()` always require a moment of thought to find out which of the bounds (in this case, 3 and 7 in `Petlen(3,7]`) is included in the range. And names of factor levels are directly concatenated to the name of the factor variable.

Thus, the first step would be to format the parameter names, which can be done with `format_parameters()` from the *parameters* package:

```r
library(parameters)
format_parameters(model)
##                 (Intercept)      poly(Sepal.Length, 2)1
##               "(Intercept)" "Sepal.Length [1st degree]"
##      poly(Sepal.Length, 2)2           Speciesversicolor
## "Sepal.Length [2nd degree]"      "Species [versicolor]"
##            Speciesvirginica                 Petlen(3,7]
##       "Species [virginica]"              "Petlen [4-7]"
```

`format_parameters()` returns a (named) character vector with the original coefficient names as the *names* of the elements, and the formatted coefficient names as the values. Let’s look at the results again:

```r
cat(format_parameters(model), sep = "\n")
## (Intercept)
## Sepal.Length [1st degree]
## Sepal.Length [2nd degree]
## Species [versicolor]
## Species [virginica]
## Petlen [4-7]
```
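Because the return value is a named vector, it can also serve as a lookup table from raw to formatted names. The following is a minimal sketch of that idea, assuming the example model from above; the variables `pretty_names` and `coefs` are introduced here for illustration only.

```r
library(parameters)

# Rebuild the example model from above
data(iris)
iris$Petlen <- cut(iris$Petal.Length, breaks = c(0, 3, 7))
model <- lm(Sepal.Width ~ poly(Sepal.Length, 2) + Species + Petlen, data = iris)

# Named character vector: names = raw coefficient names, values = formatted names
pretty_names <- format_parameters(model)

# Use it as a lookup table to relabel the coefficient vector
coefs <- coef(model)
names(coefs) <- unname(pretty_names[names(coefs)])
round(coefs, 2)
```

Indexing by the raw names rather than by position keeps the relabelling correct even if the order of the coefficients differs.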

Now variable names and factor levels, but also polynomial terms and even factors grouped with `cut()`, are much more readable. Factor levels are separated from the variable name and placed inside brackets. The same holds for the coefficients of the different polynomial degrees. And the exact range for `cut()`-factors is also clearer now.

## Standardizing Column Names of Parameter Tables

As seen above, `summary()` returns columns named `Estimate`, `t value` or `Pr(>|t|)`. While `Estimate` is not specific to certain models, `t value` is. For logistic regression models, you would get `z value`. Some packages alter the names, so you get just `t` or `t-value` etc.

`model_parameters()` also uses context-specific column names, where applicable:

```r
colnames(model_parameters(model))
## [1] "Parameter"   "Coefficient" "SE"          "CI_low"      "CI_high"
## [6] "t"           "df_error"    "p"
```

For Bayesian models, `Coefficient` is usually named `Median` etc. While this makes sense from a user perspective, because you instantly know which type of statistic or coefficient you have, it becomes difficult when you need a generic naming scheme to access model parameters when the input model is unknown. Such a generic scheme is the typical approach of the *broom* package, where you get “standardized” column names:

```r
library(broom)
colnames(tidy(model))
## [1] "term"      "estimate"  "std.error" "statistic" "p.value"
```

To deal with such situations, the *insight* package provides a `standardize_names()` function, which does exactly that: it standardizes the column names of the input. In the following example, you see that the statistic column is no longer named `t`, but `Statistic`. `df_error` or `df_residuals` will be renamed to `df`.

```r
library(insight)
library(magrittr)
model %>%
  model_parameters() %>%
  standardize_names() %>%
  colnames()
## [1] "Parameter"   "Coefficient" "SE"          "CI_low"      "CI_high"
## [6] "Statistic"   "df"          "p"
```

Furthermore, you can request “broom”-style column names:

```r
model %>%
  model_parameters() %>%
  standardize_names(style = "broom") %>%
  colnames()
## [1] "term"      "estimate"  "std.error" "conf.low"  "conf.high" "statistic"
## [7] "df.error"  "p.value"
```
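To illustrate why a generic naming scheme helps when the input model is unknown, here is a small sketch. The helper `get_statistic()` is hypothetical and not part of any package; it relies only on `model_parameters()` and `standardize_names()`, and the two models are fitted on `mtcars` purely for demonstration.

```r
library(parameters)
library(insight)

# Hypothetical helper: return the test statistic of a model without
# knowing whether the original column is called "t", "z", ...
get_statistic <- function(model) {
  params <- standardize_names(model_parameters(model))
  params$Statistic
}

m_linear <- lm(mpg ~ wt, data = mtcars)                          # statistic column "t"
m_logistic <- glm(am ~ wt, data = mtcars, family = binomial())   # statistic column "z"

get_statistic(m_linear)
get_statistic(m_logistic)
```

Without the standardizing step, the helper would need a separate branch for every possible name of the statistic column.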

## Formatting Column Names and Columns

Besides formatting parameter names (coefficient names) using `format_parameters()`, we can do even more to make the output more readable. Let’s look at an example that includes confidence intervals.

```r
cbind(summary(model)$coefficients, confint(model))
##                        Estimate Std. Error t value Pr(>|t|) 2.5 % 97.5 %
## (Intercept)                3.81      0.058   65.50 4.6e-109  3.70   3.93
## poly(Sepal.Length, 2)1     4.06      0.467    8.70  7.0e-15  3.14   4.98
## poly(Sepal.Length, 2)2    -1.30      0.315   -4.14  6.0e-05 -1.92  -0.68
## Speciesversicolor         -1.01      0.278   -3.62  4.1e-04 -1.56  -0.46
## Speciesvirginica          -0.99      0.285   -3.48  6.7e-04 -1.55  -0.43
## Petlen(3,7]               -0.14      0.282   -0.48  6.3e-01 -0.69   0.42
```

We can get a similar tabular output using *broom*.

```r
tidy(model, conf.int = TRUE)
## # A tibble: 6 x 7
##   term                 estimate std.error statistic   p.value conf.low conf.high
##   <chr>                   <dbl>     <dbl>     <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)             3.81     0.0582    65.5   4.61e-109    3.70      3.93
## 2 poly(Sepal.Length, ~    4.06     0.467      8.70  7.00e- 15    3.14      4.98
## 3 poly(Sepal.Length, ~   -1.30     0.315     -4.14  5.98e-  5   -1.92     -0.680
## 4 Speciesversicolor      -1.01     0.278     -3.62  4.12e-  4   -1.56     -0.456
## 5 Speciesvirginica       -0.991    0.285     -3.48  6.72e-  4   -1.55     -0.428
## 6 Petlen(3,7]            -0.136    0.282     -0.482 6.30e-  1   -0.693     0.421
```

Some improvements to readability would be collapsing and formatting the confidence intervals, and maybe the p-values. Doing this by hand requires some effort, for instance, formatting the values of the lower and upper confidence limits and collapsing them into one column. However, `format_table()` is a convenient function that does all the work for you.

`format_table()` requires a data frame with model parameters as input. However, there are some requirements to make `format_table()` work. In particular, the column names must follow a certain pattern to be recognized, and this pattern may be either the naming convention from *broom* or that of the *easystats* packages.

```r
model %>%
  tidy(conf.int = TRUE) %>%
  format_table()
##                     term estimate std.error statistic p.value       conf.int
## 1            (Intercept)     3.81      0.06     65.50  < .001  [ 3.70, 3.93]
## 2 poly(Sepal.Length, 2)1     4.06      0.47      8.70  < .001  [ 3.14, 4.98]
## 3 poly(Sepal.Length, 2)2    -1.30      0.31     -4.14  < .001 [-1.92, -0.68]
## 4      Speciesversicolor    -1.01      0.28     -3.62  < .001 [-1.56, -0.46]
## 5       Speciesvirginica    -0.99      0.29     -3.48  < .001 [-1.55, -0.43]
## 6            Petlen(3,7]    -0.14      0.28     -0.48   0.630  [-0.69, 0.42]
```
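The column-name requirement also means that `format_table()` is not tied to a fitted model. As a rough sketch, the following builds a small parameters table by hand using easystats-style column names; the data frame `params` and all numbers in it are made up for illustration.

```r
library(insight)

# Hypothetical, hand-made parameters table (values are invented);
# the column names follow the easystats naming convention
params <- data.frame(
  Parameter = c("(Intercept)", "x"),
  Coefficient = c(1.2345, -0.5678),
  SE = c(0.1012, 0.2034),
  CI_low = c(1.0361, -0.9665),
  CI_high = c(1.4329, -0.1691),
  p = c(0.00001, 0.0312)
)

# CI_low and CI_high are recognized by name and collapsed into one column
formatted <- format_table(params)
formatted
```

If the columns were named differently (say, `lower` and `upper`), they would presumably not be recognized as confidence limits and would not be collapsed.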

When the parameters table also includes degrees of freedom, and the degrees of freedom are the same for each parameter, then this information is included in the statistic-column. This is usually the default for `model_parameters()`:

```r
model %>%
  model_parameters() %>%
  format_table()
##                   Parameter Coefficient   SE         95% CI t(144)      p
## 1               (Intercept)        3.81 0.06  [ 3.70, 3.93]  65.50 < .001
## 2 Sepal.Length [1st degree]        4.06 0.47  [ 3.14, 4.98]   8.70 < .001
## 3 Sepal.Length [2nd degree]       -1.30 0.31 [-1.92, -0.68]  -4.14 < .001
## 4      Species [versicolor]       -1.01 0.28 [-1.56, -0.46]  -3.62 < .001
## 5       Species [virginica]       -0.99 0.29 [-1.55, -0.43]  -3.48 < .001
## 6              Petlen [4-7]       -0.14 0.28  [-0.69, 0.42]  -0.48  0.630
```

## Exporting the Parameters Table

Finally, `export_table()` from *insight* formats the data frame and returns a character vector that can be printed to the console or inside rmarkdown documents. The data frame then looks more “table-like”.

```r
data(mtcars)
cat(export_table(mtcars[1:8, 1:5]))
##   mpg | cyl |   disp |  hp | drat
## ---------------------------------
## 21.00 |   6 | 160.00 | 110 | 3.90
## 21.00 |   6 | 160.00 | 110 | 3.90
## 22.80 |   4 | 108.00 |  93 | 3.85
## 21.40 |   6 | 258.00 | 110 | 3.08
## 18.70 |   8 | 360.00 | 175 | 3.15
## 18.10 |   6 | 225.00 | 105 | 2.76
## 14.30 |   8 | 360.00 | 245 | 3.21
## 24.40 |   4 | 146.70 |  62 | 3.69
```

Putting all this together allows us to create nice tabular outputs of parameters tables. This can be done using *broom*:

```r
model %>%
  tidy(conf.int = TRUE) %>%
  format_table() %>%
  export_table() %>%
  cat()
## term                   | estimate | std.error | statistic | p.value |       conf.int
## ------------------------------------------------------------------------------------
## (Intercept)            |     3.81 |      0.06 |     65.50 |  < .001 |  [ 3.70, 3.93]
## poly(Sepal.Length, 2)1 |     4.06 |      0.47 |      8.70 |  < .001 |  [ 3.14, 4.98]
## poly(Sepal.Length, 2)2 |    -1.30 |      0.31 |     -4.14 |  < .001 | [-1.92, -0.68]
## Speciesversicolor      |    -1.01 |      0.28 |     -3.62 |  < .001 | [-1.56, -0.46]
## Speciesvirginica       |    -0.99 |      0.29 |     -3.48 |  < .001 | [-1.55, -0.43]
## Petlen(3,7]            |    -0.14 |      0.28 |     -0.48 |   0.630 |  [-0.69, 0.42]
```

Or, in a simpler way and with many more options (like standardizing, robust standard errors, bootstrapping, …), using `model_parameters()`, whose `print()`-method does all these steps automatically:

```r
model_parameters(model)
## Parameter                 | Coefficient |   SE |         95% CI | t(144) |      p
## ---------------------------------------------------------------------------------
## (Intercept)               |        3.81 | 0.06 |  [ 3.70, 3.93] |  65.50 | < .001
## Sepal.Length [1st degree] |        4.06 | 0.47 |  [ 3.14, 4.98] |   8.70 | < .001
## Sepal.Length [2nd degree] |       -1.30 | 0.31 | [-1.92, -0.68] |  -4.14 | < .001
## Species [versicolor]      |       -1.01 | 0.28 | [-1.56, -0.46] |  -3.62 | < .001
## Species [virginica]       |       -0.99 | 0.29 | [-1.55, -0.43] |  -3.48 | < .001
## Petlen [4-7]              |       -0.14 | 0.28 |  [-0.69, 0.42] |  -0.48 |  0.630
```

## Formatting the Parameters Table in Markdown

`export_table()` provides a few options to generate tables in markdown format. This makes it easy to render nice-looking tables inside markdown documents. First of all, use `format = "markdown"` to activate the markdown formatting. `caption` can be used to add a table caption. Furthermore, `align` allows you to choose an alignment for all table columns, or to specify the alignment for each column individually.

The following table has six columns. Using `align = "lcccrr"` would left-align the first column, center columns two to four, and right-align the last two columns.

```r
model %>%
  tidy(conf.int = TRUE) %>%
  # parentheses look better in markdown tables, so we use them as "brackets" here
  format_table(ci_brackets = c("(", ")")) %>%
  export_table(format = "markdown", caption = "My Table", align = "lcccrr")
```

term | estimate | std.error | statistic | p.value | conf.int |
:---|:---:|:---:|:---:|---:|---:|
(Intercept) | 3.81 | 0.06 | 65.50 | < .001 | ( 3.70, 3.93) |
poly(Sepal.Length, 2)1 | 4.06 | 0.47 | 8.70 | < .001 | ( 3.14, 4.98) |
poly(Sepal.Length, 2)2 | -1.30 | 0.31 | -4.14 | < .001 | (-1.92, -0.68) |
Speciesversicolor | -1.01 | 0.28 | -3.62 | < .001 | (-1.56, -0.46) |
Speciesvirginica | -0.99 | 0.29 | -3.48 | < .001 | (-1.55, -0.43) |
Petlen(3,7] | -0.14 | 0.28 | -0.48 | 0.630 | (-0.69, 0.42) |
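The per-column alignment string can be tried out on any data frame. The following is a minimal sketch using three columns of `mtcars`; the object names `small` and `md_table` and the caption text are placeholders chosen for this example.

```r
library(insight)

# A small data frame with three columns
small <- head(mtcars[, c("mpg", "cyl", "hp")], 3)

# One alignment character per column: left, center, right
md_table <- export_table(
  small,
  format = "markdown",
  caption = "First rows of mtcars",
  align = "lcr"
)
md_table
```

The alignment string has one character per column here, so a three-column table takes a three-character string.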

`print_md()` is a convenient wrapper around `format_table()` and `export_table(format = "markdown")`, and allows you to directly format the output of functions like `model_parameters()`, `simulate_parameters()` or other *parameters* functions in markdown format.

These tables are also nicely formatted when knitting markdown documents into Word or PDF. `print_md()` applies some default settings that have proven to work well for markdown, PDF or Word tables.

```r
model_parameters(model) %>% print_md()
```

Parameter | Coefficient | SE | 95% CI | t(144) | p |
---|---|---|---|---|---|
(Intercept) | 3.81 | 0.06 | (3.70, 3.93) | 65.50 | < .001 |
Sepal.Length (1st degree) | 4.06 | 0.47 | (3.14, 4.98) | 8.70 | < .001 |
Sepal.Length (2nd degree) | -1.30 | 0.31 | (-1.92, -0.68) | -4.14 | < .001 |
Species (versicolor) | -1.01 | 0.28 | (-1.56, -0.46) | -3.62 | < .001 |
Species (virginica) | -0.99 | 0.29 | (-1.55, -0.43) | -3.48 | < .001 |
Petlen (4-7) | -0.14 | 0.28 | (-0.69, 0.42) | -0.48 | 0.630 |

A similar option is `print_html()`, which is a convenient wrapper for `format_table()` and `export_table(format = "html")`. Using HTML in markdown has the advantage that it will be properly rendered when exporting to PDF.

```r
model_parameters(model) %>% print_html()
```

Regression Model

Parameter | Coefficient | SE | 95% CI | t(144) | p |
---|---|---|---|---|---|
(Intercept) | 3.81 | 0.06 | (3.70, 3.93) | 65.50 | < .001 |
Sepal.Length (1st degree) | 4.06 | 0.47 | (3.14, 4.98) | 8.70 | < .001 |
Sepal.Length (2nd degree) | -1.30 | 0.31 | (-1.92, -0.68) | -4.14 | < .001 |
Species (versicolor) | -1.01 | 0.28 | (-1.56, -0.46) | -3.62 | < .001 |
Species (virginica) | -0.99 | 0.29 | (-1.55, -0.43) | -3.48 | < .001 |
Petlen (4-7) | -0.14 | 0.28 | (-0.69, 0.42) | -0.48 | 0.630 |

`print_md()` and `print_html()` are the main functions for users who want to generate nicely rendered tables inside markdown documents. A wrapper around both is `display()`, which calls either `print_md()` or `print_html()`.

```r
model_parameters(model) %>% display(format = "html")
```

Regression Model

Parameter | Coefficient | SE | 95% CI | t(144) | p |
---|---|---|---|---|---|
(Intercept) | 3.81 | 0.06 | (3.70, 3.93) | 65.50 | < .001 |
Sepal.Length (1st degree) | 4.06 | 0.47 | (3.14, 4.98) | 8.70 | < .001 |
Sepal.Length (2nd degree) | -1.30 | 0.31 | (-1.92, -0.68) | -4.14 | < .001 |
Species (versicolor) | -1.01 | 0.28 | (-1.56, -0.46) | -3.62 | < .001 |
Species (virginica) | -0.99 | 0.29 | (-1.55, -0.43) | -3.48 | < .001 |
Petlen (4-7) | -0.14 | 0.28 | (-0.69, 0.42) | -0.48 | 0.630 |
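Since `display()` takes the target as its `format` argument, the choice can be made at knit time. The sketch below is one possible way to do this, assuming the *knitr* package is installed (and *gt* for the HTML branch); `out_format` and `tab` are names introduced only for this example, and the model is a simple stand-in fitted on `mtcars`.

```r
library(parameters)

model <- lm(mpg ~ wt + hp, data = mtcars)

# Pick the table format from the current knitr output target;
# outside of knitr, is_html_output() is FALSE, so markdown is used
out_format <- if (knitr::is_html_output()) "html" else "markdown"

tab <- display(model_parameters(model), format = out_format)
tab
```

This keeps a single code chunk usable for HTML as well as for Word or PDF output.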

## Get Involved

*easystats* is a new project in active development, looking for contributors and supporters. Thus, do not hesitate to contact us if **you want to get involved 🙂**

**Check out our other blog posts!**

## Stay tuned

To be updated about the *upcoming features* and cool R or data science stuff, you can **follow the packages on GitHub** (click on one of the easystats packages, then on the **Watch** button in the top right corner), as well as the **easystats team on Twitter and online**.
