Top Tip: Don’t keep your data prep in the same project as your Shiny app

[This article was first published on Mango Solutions, and kindly contributed to R-bloggers].

Mark Sellors, Head of Data Engineering

If you use RStudio Connect to publish your Shiny app (and even if you don’t), take care with how you arrange your projects. If a single project includes both your data prep and your Shiny app, packrat (which RStudio Connect uses to resolve your project’s package dependencies) will assume that the packages used by both parts are required on the RStudio Connect server and will try to install them all.

This means that if your Shiny app uses three packages and your data prep uses six, packrat and RStudio Connect will attempt to install all nine on the server. This can be time-consuming, because packages are often built from source in Connect-based environments, so deployment time increases considerably. Furthermore, some packages may require your server admin to resolve system-level dependency issues, possibly for packages that your app doesn’t even use while it’s running.
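The three-plus-six arithmetic can be illustrated with a small base R sketch. It is only a rough stand-in for dependency discovery: it scans a project directory for `library()` calls with a regular expression, which is not how packrat actually works. The file names, the package names and the `find_library_calls()` helper are all hypothetical, chosen only to mirror the example above.

```r
# Hypothetical single-project layout: app and data prep side by side.
project <- file.path(tempdir(), "single_project")
dir.create(project, showWarnings = FALSE)

# Illustrative package names only; three for the app, six for the data prep.
writeLines(c("library(shiny)", "library(ggplot2)", "library(DT)"),
           file.path(project, "app.R"))
writeLines(c("library(dplyr)", "library(tidyr)", "library(readr)",
             "library(lubridate)", "library(stringr)", "library(DBI)"),
           file.path(project, "data_prep.R"))

# Crude stand-in for dependency discovery (an assumption, not packrat's
# algorithm): collect every package named in a library() call in any .R file
# under the given directory.
find_library_calls <- function(dir) {
  files <- list.files(dir, pattern = "\\.R$", full.names = TRUE,
                      recursive = TRUE)
  lines <- unlist(lapply(files, readLines))
  hits <- regmatches(lines,
                     regexpr("(?<=library\\()[A-Za-z0-9.]+", lines,
                             perl = TRUE))
  sort(unique(hits))
}

# Scanning the combined project picks up the packages from both scripts.
combined_deps <- find_library_calls(project)
length(combined_deps)  # 9

# A separate, app-only project containing just app.R.
app_project <- file.path(tempdir(), "app_project")
dir.create(app_project, showWarnings = FALSE)
file.copy(file.path(project, "app.R"), app_project, overwrite = TRUE)

app_deps <- find_library_calls(app_project)
length(app_deps)  # 3
```

To see what a real deployment would pick up, the rsconnect package provides `rsconnect::appDependencies()`, which can be pointed at an app directory before publishing.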

Keeping data prep and your app within a single project can also confuse people who join your project as collaborators later in the development process, since the scope of the project will be less clear. Documenting the pieces separately also improves clarity.

Lastly, separating the two will make your life easier if you ever want to start automating parts of your workflow, as the data prep stage will already be separate from the rest of the project.
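One way to picture the separated workflow is a simple hand-off: the data prep project writes a prepared file, and the app project only reads it. The sketch below uses base R alone and assumes a shared location that both projects can reach; the directory, file names and example data are hypothetical, and a temporary directory stands in for the shared location so the snippet runs anywhere.

```r
# Assumed hand-off location; in practice this might be a shared drive,
# a database or some other store that both projects can reach.
shared_dir <- file.path(tempdir(), "shared_data")
dir.create(shared_dir, showWarnings = FALSE)

# --- Data prep project (hypothetical script, e.g. prepare_data.R) ---
# Made-up example data; real prep would load and clean the actual sources.
raw <- data.frame(
  region = c("north", "south", "north", "south"),
  sales  = c(10, 20, 30, 40)
)
sales_summary <- aggregate(sales ~ region, data = raw, FUN = sum)
saveRDS(sales_summary, file.path(shared_dir, "sales_summary.rds"))

# --- App project (hypothetical, e.g. near the top of app.R) ---
# The app needs none of the prep packages; it only reads the prepared file.
app_data <- readRDS(file.path(shared_dir, "sales_summary.rds"))
print(app_data)
```

Because the prep step ends by writing a file, it can later be scheduled or moved into a pipeline without touching the app project.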

Clear separation of individual projects (and by extension, source code repositories) may cause some short-term pain, but the long-term benefits are hard to overstate:

  • Smoother and faster RStudio Connect deployments
  • Easier collaboration
  • More straightforward automation (easier to build out into a pipeline)
  • Simpler to document – one set for the app, another for your data prep

Of course, if your Shiny app actually does data prep as part of the app’s internal processing, then all bets are off!
