Publication-ready tables with flextable and own theme in R

[This article was first published on R | Michael Dismorr, MD, and kindly contributed to R-bloggers].

In this post, I will show you how to make publication-ready tables using a combination of the flextable package and a function I’ve written to customize them according to my default layout. This minimizes the need to edit in MS Word, or to copy and paste from Excel to Word, when exporting tables from R. In this example, I will use the tableone package to show how my custom_tab() function can be used to export Table 1.

The flextable package is a great package for generating Word tables from R output. Together with the officer package from the same creator, it allows you to do most of the needed customization directly in R, and thus lets you skip this potentially annoying step in MS Word. The tableone package is a great package for, you guessed it, generating Table 1s. It supports p-value calculations, SMDs, and generally just the output that you need.

For demonstration, I will use the gbsg dataset from the survival package. If you just want to try out the custom_tab() function, it can be downloaded here.

OK, let’s begin!

library(survival) # only needed for the dataset in this example
library(dplyr) # to modify the needed dataframe
library(tibble) # for rownames_to_column() function
library(stringr) # for str_squish()
library(flextable)
library(officer)
library(forcats)
library(tableone)
source("customtab.R")

Load the dataset:

breast <- survival::gbsg

Running ?survival::gbsg gives:

The gbsg data set contains patient records from a 1984-1989 trial conducted by the German Breast Cancer Study Group (GBSG) of 720 patients with node positive cancer…

Variable   Explanation
pid        Patient ID
age        Age in years
meno       Menopausal status: 0 = premenopausal, 1 = postmenopausal
size       Tumor size
grade      Tumor grade
nodes      Number of positive lymph nodes
pgr        Progesterone receptors (fmol/L)
er         Estrogen receptors (fmol/L)
hormon     Hormone therapy: 0 = no, 1 = yes
rfstime    Recurrence-free survival time in days
status     0 = alive without recurrence, 1 = recurrence or death

Select variables included in Table 1.

variables <- names(breast)
breast <- breast %>%
  select(all_of(variables)) %>%
  select(-pid, -rfstime, -status)
breast <- breast %>%
  mutate(meno = as_factor(meno),
         grade = as_factor(grade),
         hormon = as_factor(hormon))

Rename the variables as they should appear in the table. While tableone can add explanations for you, I generally prefer to type them out myself to keep track of what I actually want to get (e.g. is it the mean or the median?).

breast <- breast %>%
  rename(`Age, years (mean (SD))` = age,
         Postmenopausal = meno,
         `Tumor size, mm (mean (SD))` = size,
         `Tumor grade` = grade,
         `Positive lymph nodes, (n)` = nodes,
         `Progesterone receptors, fmol/L (median [IQR])` = pgr,
         `Estrogen receptors, fmol/L (median [IQR])` = er,
         `Hormone treatment` = hormon)

Relabel the hormon variable with more text-friendly labels

breast$`Hormone treatment` <- breast$`Hormone treatment` %>%
  fct_recode(Treated = "1", Placebo = "0")
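
The new = "old" argument order of fct_recode() is easy to get backwards, so here is a minimal sketch on a toy factor (the vector x is made up for illustration and is not part of the gbsg data):

```r
library(forcats)

# Toy factor standing in for the 0/1 coded hormon column (illustrative only)
x <- factor(c(0, 1, 1, 0))

# fct_recode() takes new_label = "old_level" pairs
y <- fct_recode(x, Treated = "1", Placebo = "0")

levels(y)        # level order follows the original levels, "0" then "1"
as.character(y)
```

The order in which the pairs are written does not change the level order; the levels keep the positions they had in the original factor.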

Select all variables and specify categorical variables for tableone

# All variables excluding the group variable
myVars <- breast %>% select(-`Hormone treatment`) %>% names()
# All categorical variables
catVars <- breast %>% select(where(is.factor)) %>%
  dplyr::select(-`Hormone treatment`) %>% names()

Create Table 1 object

tab1 <- breast %>% CreateTableOne(vars = myVars,
                                  data = . ,
                                  factorVars = catVars,
                                  strata = "Hormone treatment",
                                  addOverall = TRUE,
                                  test = TRUE)

Print Table 1 object to control output, e.g. remove missing, print non-normal variables correctly etc.

tab1_word <- print(tab1,
                   nonnormal = c("Progesterone receptors, fmol/L (median [IQR])",
                                 "Estrogen receptors, fmol/L (median [IQR])"),
                   quote = FALSE,
                   noSpaces = TRUE,
                   # smd = TRUE,
                   # missing = TRUE,
                   test = FALSE,
                   contDigits = 1,
                   printToggle = FALSE,
                   dropEqual = TRUE,
                   explain = FALSE)

custom_tab() needs a data frame as its argument, so first we convert the printed tableone output

# Convert to dataframe
tab1_df <- as.data.frame(tab1_word) %>% rownames_to_column(var = "Variable")
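
The reason for the rownames_to_column() step is that print() returns a character matrix whose row names hold the variable labels, and flextable does not display row names. The same conversion can be sketched in base R on a toy matrix (the values below are made up and only mimic the shape of the real output):

```r
# Toy character matrix mimicking the shape of the printed tableone output
# (values are illustrative, not taken from the real table)
toy <- matrix(c("686", "53.1 (10.1)", "440", "52.1 (10.0)"),
              nrow = 2,
              dimnames = list(c("n", "Age"), c("Overall", "Placebo")))

# Base R equivalent of as.data.frame() %>% rownames_to_column(var = "Variable")
toy_df <- as.data.frame(toy, stringsAsFactors = FALSE)
toy_df <- cbind(Variable = rownames(toy_df), toy_df)
rownames(toy_df) <- NULL

names(toy_df)
```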

Use custom_tab() to export the MS Word table.
Important: run customtab_defaults() before using the custom_tab() function to get the correct formatting.

# Rename first variable from n to No.
tab1_df$Variable[1] <- "No."
# Set Table header
header <- str_squish(str_remove("Table 1. Baseline characteristics of 686
patients enrolled in the German Breast Cancer Study Group
between 1984 and 1989", "\n"))
# Set Table footer
footer <- str_squish(str_remove("Numbers are No. (%) unless otherwise noted.
SD = standard deviation, fmol/L = femtomole per liter,
IQR = interquartile range", "\n"))
# Set custom_tab() defaults
customtab_defaults()
# Create the flextable object
flextable_1 <- custom_tab(tab1_df, header, footer)
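
A side note on the header and footer strings: str_remove() only removes the first match, while str_squish() on its own already collapses every run of whitespace, newlines included. A minimal sketch with a made-up two-line string shows that the two variants give the same result:

```r
library(stringr)

# Made-up multi-line string, written the same way as the header above
raw <- "Table 1. Baseline characteristics
        of 686 patients"

# str_squish() alone trims the ends and collapses inner whitespace
squished <- str_squish(raw)

# Same result as the longer form used for the header and footer
same <- identical(squished, str_squish(str_remove(raw, "\n")))
```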

Save the resulting table as .docx

# Save as word .docx
save_as_docx(flextable_1, path = "flextab_1.docx",
             pr_section =
               prop_section(page_size = page_size(orient = "portrait"),
                            type = "continuous"))
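
For wider tables, the same call can be pointed at a landscape page. The following is only a sketch: it uses head(mtcars) as a stand-in table and writes to a temporary file, both of which are assumptions made for the example.

```r
library(flextable)
library(officer)

# Stand-in table; any flextable object works here
wide_tab <- flextable(head(mtcars))

# Same section settings as above, but with landscape orientation
landscape <- prop_section(page_size = page_size(orient = "landscape"),
                          type = "continuous")

# Temporary file so the example does not write into the working directory
out <- tempfile(fileext = ".docx")
save_as_docx(wide_tab, path = out, pr_section = landscape)
```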

This will generate the following MS Word Table:

[Image: MS Word Table 1]

The customtab_defaults() function simply sets some defaults for your tables, and can be changed to match the requirements of the journal you are submitting to.

customtab_defaults <- function(){
  set_flextable_defaults(font.family = "Calibri",
                         font.size = 10,
                         border.color = "black")
}
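
As an illustration of adapting the defaults, here is a sketch of a hypothetical journal_defaults() variant. The function name and the font settings are assumptions chosen for the example, not requirements of any particular journal.

```r
library(flextable)

# Hypothetical variant of customtab_defaults(); the values are illustrative
journal_defaults <- function() {
  set_flextable_defaults(font.family = "Times New Roman",
                         font.size = 11,
                         border.color = "black")
}

journal_defaults()

# Inspect the defaults that are now in effect
current <- get_flextable_defaults()
current$font.family
current$font.size

# Restore flextable's own package defaults when done
init_flextable_defaults()
```

Because the defaults are global for the session, resetting them with init_flextable_defaults() avoids surprises in tables created later.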

The custom_tab() function looks like this:

######### Create default BioAVR table from dataframe
#
# Dependencies : dplyr, flextable, officer
#
custom_tab <- function(df, header, footer){
  flextable(df) %>%
    add_header_lines(header) %>%
    add_footer_lines(footer) %>%
    bold(i = 1, part = "header") %>%
    hline_top(part = "header",
              border = fp_border(color = "red",
                                 width = 3,
                                 style = "solid")) %>%
    hline(i = 1,
          part = "header",
          border = fp_border(color = "black",
                             width = 0.25,
                             style = "solid")) %>%
    hline_top(part = "body",
              border = fp_border(color = "black",
                                 width = 0.25,
                                 style = "solid")) %>%
    hline_bottom(part = "body",
                 border = fp_border(color = "black",
                                    width = 0.25,
                                    style = "solid")) %>%
    hline_bottom(part = "footer",
                 border = fp_border(color = "black",
                                    width = 0.25,
                                    style = "solid")) %>%
    border_inner_h(part = "body",
                   border = fp_border(color = "black",
                                      width = 0.25,
                                      style = "dotted")) %>%
    autofit(part = "body") %>%
    bg(part = "body", bg = "#f5f5f5") %>%
    align(part = "all", align = "center") %>%
    align(j = 1, part = "all", align = "left")
}

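What it does is:
1. Adds the header and footer lines
2. Makes the first header row bold
3. Adds a red top border and a black border below the first row of the header
4. Adds black borders around the body and below the footer
5. Adds dotted borders between the body rows
6. Auto-fits the body column widths
7. Adds a gray background to the body
8. Centers all cells, then left-aligns the first column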
This is a standard design I go for in my tables, but it can of course be changed to your own or journal-specific requirements. While flextable allows output to other formats as well, I generally only use this for MS Word output. The reason is that I’ve noticed problems with the borders when outputting to PDF or image formats.
