# Shiny: Fast Data Loading with fst

First published on **Philipp Probst** and kindly contributed to R-bloggers.


I had several projects where I had to load a big dataset for my Shiny app. The loading was usually done at startup and took more than 3 minutes. My goal was to reduce this time. I started thinking about the problem and realized that the whole dataset is not required when the app starts.

## Fast and flexible data loading with **fst**

My first idea was to use a database, for example RSQLite. I also liked MonetDB a lot (much faster than RSQLite), but I could not get it to run on the system where it was needed, so I searched for alternatives.

Then I discovered the **fst** R-package.
It is a package to save datasets (data.frame/data.table) in the **fst**-format.
Data loading is reasonably fast. Moreover one can access rows and columns of the
dataset without loading the whole dataset. So it basically provides some functionalities
of a database.
The partial loading of the dataset is much faster than loading the whole dataset.

So this was the functionality I needed for speeding up the data loading process.
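
As a minimal sketch of this partial loading (the toy dataset and its column names are made up for illustration and are not part of the original project), the following writes a data.frame to disk with write_fst and then reads back only two columns and the first 100 rows with read_fst:

```r
library(fst)

# Toy dataset standing in for the large dataset (hypothetical columns)
df <- data.frame(
  ID      = 1:1000,
  year    = rep(2001:2010, each = 100),
  outcome = rnorm(1000),
  extra   = runif(1000)
)

path <- tempfile(fileext = ".fst")
write_fst(df, path)

# Read only two columns and rows 1 to 100; everything else stays on disk
part <- read_fst(path, columns = c("ID", "year"), from = 1, to = 100)

dim(part)
names(part)
```

How much time this saves compared to a full read depends on the size and shape of the real dataset.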

My strategy in the Shiny App was then the following:

- At startup, load only the data that the dashboard really needs at that point.
- Data transformations that are always required are done once beforehand in a data preparation script.
- Meta-data, such as the possible choices for inputs, is saved in a separate .RData file called meta_data.RData.
- Updating this meta-data is part of the data preparation process. The meta-data is small and is loaded at every start of the app.
- If further columns or rows are needed later, they are loaded into the app at that point and added to the existing dataset.
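
The preparation step in this list could look roughly like the sketch below. The raw data, the derived column outcome_log, the file name data.fst and the use of tempdir() are assumptions made for illustration; only the file name meta_data.RData comes from the list above.

```r
library(fst)

# Hypothetical raw data; in practice this would be the large source dataset
raw <- data.frame(
  ID      = 1:6,
  year    = c(2018, 2018, 2019, 2019, 2020, 2020),
  outcome = c(1.2, 0.7, 3.1, 2.2, 0.4, 1.9)
)

# Transformations that are always needed are done once, here
raw$outcome_log <- log(raw$outcome)

# Assumption: tempdir() stands in for the data folder of the app
data_dir <- tempdir()
write_fst(raw, file.path(data_dir, "data.fst"))

# Small meta-data, e.g. the possible choices for inputs
meta_data <- list(
  years     = sort(unique(raw$year)),
  variables = colnames(raw)
)
save(meta_data, file = file.path(data_dir, "meta_data.RData"))

# At app start, only the small meta-data file is loaded right away
load(file.path(data_dir, "meta_data.RData"))
meta_data$years
```

Keeping the meta-data in its own small file means the input choices are available without touching the large fst file.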

All in all, this reduced the loading time at the start of the app from 3 minutes to 5 seconds. Loading data later on (usually only 1 or 2 columns at once) had no noticeable impact on the performance of the app.

## Some technical details

Here are some technical details on how I implemented this in my app. The following packages are required:

    library(shiny)
    library(fst)
    library(data.table)

I initialize the data as an empty reactive value, together with
reactive values for the **fst** file and for the selected rows and columns of the **fst** file.

    data = reactiveVal(NULL)
    tmp_all = reactiveValues(fst = NULL, rows_fst = NULL, cols_fst = NULL)

Then I create a reference to the **fst** file that I saved beforehand with write_fst.
Note that this command does not load the dataset yet.

    tmp_fst = fst(my_path)
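
To illustrate what such a reference allows, here is a small sketch on a toy file (the file and its columns are invented for this example). The object returned by fst() can be queried for its dimensions and column names, and individual columns or row/column subsets can be read on demand:

```r
library(fst)

# Toy file, only for illustration
path <- tempfile(fileext = ".fst")
write_fst(
  data.frame(ID = 1:5, year = 2016:2020, outcome = c(3, 1, 4, 1, 5)),
  path
)

ft <- fst(path)   # reference to the file, the data itself is not loaded

class(ft)
dim(ft)           # number of rows and columns
names(ft)         # column names

yrs <- ft$year                        # reads only the column "year"
sub <- ft[2:3, c("ID", "outcome")]    # reads only these rows and columns
sub
```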

I specify the rows and columns I want to load and save them in tmp_all:

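    # bis ("until") is the upper bound for the year; it is not defined in this post
    rows_fst = tmp_fst$year <= bis
    cols_fst = c("ID", "year", "outcome")
    tmp_all$rows_fst = rows_fst
    tmp_all$cols_fst = cols_fst
    tmp_all$fst = tmp_fst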

Then I load the dataset as a data.table and save it in the reactive value:

    # read only the selected rows and the columns in cols_fst, then convert to data.table
    tmp = setDT(tmp_fst[rows_fst, cols_fst])
    data(tmp)

I wrote a function to add variables afterwards. I test first if they are already available and only add new variables:

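    add_variable <- function(tmp, tmp_all, new_vars) {
      # keep only the variables that are not yet in the dataset
      inputs = new_vars[!(new_vars %in% colnames(tmp))]
      if(length(inputs) > 0) {
        tmp_fst = tmp_all$fst
        rows_fst = tmp_all$rows_fst
        # read only the missing columns for the rows that were loaded initially
        tmp_calc = tmp_fst[rows_fst, inputs, drop = FALSE] #%>% setDT()
        # add the new columns by reference
        tmp = tmp[, (inputs) := tmp_calc]
      }
      return(tmp)
    }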
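
The idea behind this function can be tried outside of Shiny. The following sketch uses a toy file with a made-up extra column called region and plain variables instead of reactive values; it is an illustration of the pattern, not the code from the app.

```r
library(fst)
library(data.table)

# Toy file, only for illustration
path <- tempfile(fileext = ".fst")
write_fst(
  data.frame(
    ID      = 1:6,
    year    = c(2018, 2018, 2019, 2019, 2020, 2020),
    outcome = c(1.2, 0.7, 3.1, 2.2, 0.4, 1.9),
    region  = c("a", "b", "a", "b", "a", "b"),
    stringsAsFactors = FALSE
  ),
  path
)

ft   <- fst(path)
rows <- ft$year <= 2019                       # reads only the column "year"
dt   <- setDT(ft[rows, c("ID", "year", "outcome")])

# Later: "outcome" is already there, so only "region" has to be read
new_vars <- setdiff(c("outcome", "region"), colnames(dt))
if (length(new_vars) > 0) {
  new_cols <- ft[rows, new_vars]              # same rows as the initial load
  dt[, (new_vars) := new_cols]                # added by reference
}

colnames(dt)
```

Reusing the same row filter for the later read keeps the new columns aligned with the rows that are already in memory.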

Finally the variables are added to the dataset and are now readily available for the Shiny app.

    new_vars = c("the_new_variable")
    tmp <- add_variable(tmp = data(), tmp_all, new_vars)
    # data_filtered is presumably a second reactiveVal that is not shown in this post
    data_filtered(tmp)
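
Put together, a complete app following this approach might look like the minimal sketch below. The toy file, the input extra and the output tab are hypothetical. One detail is an assumption rather than something taken from the code above: the data.table is copied before columns are added, because `:=` modifies the object in place and a reactiveVal that is handed the very same object again may not notify its dependents.

```r
library(shiny)
library(fst)
library(data.table)

# Toy file, only for illustration
path <- tempfile(fileext = ".fst")
write_fst(
  data.frame(
    ID      = 1:6,
    year    = c(2018, 2018, 2019, 2019, 2020, 2020),
    outcome = c(1.2, 0.7, 3.1, 2.2, 0.4, 1.9),
    region  = c("a", "b", "a", "b", "a", "b"),
    stringsAsFactors = FALSE
  ),
  path
)

ui <- fluidPage(
  selectInput("extra", "Additional variable", choices = c("none", "region")),
  tableOutput("tab")
)

server <- function(input, output, session) {
  ft   <- fst(path)            # reference only, nothing is loaded yet
  rows <- ft$year <= 2019      # reads only the column "year"

  # initial load: only the rows and columns needed at startup
  data <- reactiveVal(setDT(ft[rows, c("ID", "year", "outcome")]))

  observeEvent(input$extra, {
    dt <- copy(data())         # copy so that the reactiveVal sees a new object
    new_vars <- setdiff(input$extra, c("none", colnames(dt)))
    if (length(new_vars) > 0) {
      new_cols <- ft[rows, new_vars]
      dt[, (new_vars) := new_cols]
      data(dt)
    }
  })

  output$tab <- renderTable(data())
}

app <- shinyApp(ui, server)
# runApp(app)  # uncomment to start the app interactively
```

For a very large dataset the copy has a cost of its own, so writing the result to a separate reactive value is an alternative worth considering.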

## Feedback

Let me know if you have had similar problems and how you solved them. Maybe you have some ideas for improvement.
