Alternative approaches to scaling Shiny with RStudio Shiny Server, ShinyProxy, or a custom architecture.


Shiny is a great tool for fast prototyping. When a data science team creates a Shiny app, it sometimes becomes very popular. From that point on, the app is a production tool used by many people, and it needs to be reliable and fast for many concurrent users. There are many ways to optimize a Shiny app, such as using promises for non-blocking access, profvis for finding bottlenecks, and Shiny modules for well-structured code. Regardless of these, the main aspect you should focus on is how you serve the application.
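
For instance, profvis can wrap an entire app run and show where the time goes. Here is a minimal sketch using one of Shiny's built-in example apps: interact with the app, close it, and the profile opens.

```r
library(shiny)
library(profvis)

# Profile a whole Shiny session; any call that launches your own app
# can replace runExample() here.
profvis({
  runExample("06_tabsets")
})
```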

Why is scaling Shiny tricky?

R is single-threaded

This means that all users connected to one R process block each other. Multithreading is what keeps an application responsive, and R does not offer it by design, so you have to rely on workarounds.

For example, you can do this by serving multiple instances of the app. The promises package can also be used to improve responsiveness, but it is still fundamentally different from promises in JavaScript, where the event loop mechanism provides fully asynchronous I/O and worker threads.
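
As a minimal sketch of that workaround, the promises and future packages can push a slow computation into a background R process so the main process stays free for other users. The five-second sleep below simply stands in for real work:

```r
library(shiny)
library(promises)
library(future)

plan(multisession)  # heavy work runs in separate background R processes

ui <- fluidPage(
  textOutput("result")
)

server <- function(input, output, session) {
  output$result <- renderText({
    # The slow part runs off the main process, so this process can keep
    # serving other users while the promise is pending.
    future_promise({
      Sys.sleep(5)  # stand-in for an expensive computation
      "Heavy computation finished"
    })
  })
}

shinyApp(ui, server)
```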

R is slow. It was designed for convenient data analysis, not for web apps.

The R language was created to make data analysis faster. It has proved its power, which is why it is the tool of choice for many data scientists. However, faster analysis doesn't mean better runtime performance.

Data analysis in R is fast thanks to its convenient syntax and the vast number of useful statistical packages, but execution of the language itself is slow. Hadley Wickham gives a detailed explanation in this article about R performance, and there are many benchmarks of R's speed.

Personally, I am excited by alternative implementations of the R language that aim to improve performance. One of them is Oracle's FastR, based on GraalVM; another is the JVM-based Renjin. They are still evolving, but I believe their time will come soon.

Four ways you can serve Shiny

Many people ask us what the effective options are for serving a Shiny app, especially to a large number of users. The industry-standard tool for enterprises is RStudio Shiny Server Pro, which costs $9,995 per year for 20 concurrent users. It is a mature, powerful solution that we often recommend to our clients.

However, this is not the only possibility.

Before you make a decision on how to scale your app to multiple users, it is important to understand what your needs are:

  • What is the app initialization time?
  • Does it load a lot of data into memory on startup?
  • What is the app complexity?
  • How heavy are the calculations?
  • What is the expected behavior of users?

Answering these questions will help you choose the best fit for your use case.

Now, let’s see what options are available to you:

(1) Shiny Server Open Source

Shiny Server Open Source is limited to one R process per app, which can serve multiple user sessions (connections to the app). This is totally fine for apps with few users, but it doesn't work well for apps with many: all users are served by a single process and inevitably block each other.

This architecture is visualized by the following diagram:

[Diagram: Shiny Server Open Source scaling]

(2) Shiny Server Pro

In contrast, Shiny Server Pro can run multiple R processes per app. This means that many concurrent users can be distributed across separate processes and served more efficiently. As there is no limit on the number of processes, you can make use of all of your machine's resources.

On Shiny Server Pro, you can configure a strategy for how resources are handled with the `utilization_scheduler` parameter; a sample configuration is sketched after the list below.

For example, you can set:

  • The maximum R process capacity, i.e. the number of concurrent users per single R process;
  • The maximum number of R processes per single app;
  • When the server should spawn a new R process, e.g. when existing processes reach 90% of their capacity.
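
To make this concrete, here is a sketch of what such a configuration could look like in shiny-server.conf. The values are illustrative only; consult the Shiny Server Pro Administrator's Guide for the authoritative syntax.

```
# Sketch of /etc/shiny-server/shiny-server.conf for Shiny Server Pro
run_as shiny;

server {
  listen 3838;

  location / {
    site_dir /srv/shiny-server;
    log_dir  /var/log/shiny-server;

    # Up to 20 concurrent sessions per R process, spawn a new process once
    # existing ones reach 90% of capacity, and run at most 3 processes per app.
    utilization_scheduler 20 .9 3;
  }
}
```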

Below you can see an example situation:
[Diagram: Shiny Server Pro scaling]

In this scenario, five users want to access the app. Shiny Server Pro initializes three worker processes, and users are distributed between them.

(3) ShinyProxy

ShinyProxy from OpenAnalytics is an open-source alternative for serving Shiny apps. It provides many features required by enterprise applications. Its architecture is based on Docker containers, which isolate the app's environment, and it can serve not only Shiny apps but other apps too (such as Dash).

The key difference is that ShinyProxy starts a new app instance for each new user. The architecture is straightforward, convenient, and easy to maintain. This is how it looks on a diagram:

[Diagram: ShinyProxy scaling]
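
For reference, a minimal configuration might look like the sketch below, assuming ShinyProxy 2.x conventions; the Docker image name is hypothetical.

```yaml
# application.yml sketch: ShinyProxy starts one container per user session.
proxy:
  port: 8080
  authentication: none   # use LDAP, OpenID, etc. in a real deployment
  specs:
    - id: my-shiny-app
      display-name: My Shiny App
      container-image: myorg/my-shiny-app:latest   # hypothetical image with the app and its R environment
      port: 3838                                    # port the app listens on inside the container
```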

(4) Custom architecture like Appsilon Revolver

Sometimes deploying a Shiny app requires a custom solution. Let's say we have a Shiny app that takes 60 seconds to initialize, and that is deemed an unacceptable wait time for a user. In this situation, starting a new app instance for each user is not the way to go.

At Appsilon, we create custom architectures for Shiny apps. In many cases, we recommend Shiny Server or ShinyProxy because they are often the right tool for a given task. However, there are exceptions.

One of our custom architectures is a project code-named “Revolver.” We wanted to create a scalable architecture for a Shiny app that takes a long time to initialize and performs a lot of heavy computations. It is not a standalone product, and it was first designed for one of our clients. Since then, we’ve used it to deploy several production apps.

Here is how it looks:

[Diagram: Appsilon Revolver scaling]

This approach uses Docker containers that serve the application with Shiny Server Open Source. In front of the application instances, there is a load balancer that distributes the traffic.

The difference between our approach and ShinyProxy is that there are N pre-initialized containers waiting for user connections. The number of containers is configured by the app admin and can be adjusted automatically. The advantage of this approach is that the app is served instantly: users do not have to wait for it to initialize.
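
As a generic illustration of this pattern only (not Appsilon's actual Revolver configuration), a load balancer such as nginx can distribute users across pre-started containers while keeping each session pinned to one container. The container names and ports below are hypothetical.

```
# nginx.conf sketch: balance traffic over N pre-started Shiny Server containers.
upstream shiny_pool {
  ip_hash;                    # sticky sessions: keep each user on one container
  server shiny-app-1:3838;
  server shiny-app-2:3838;
  server shiny-app-3:3838;
}

server {
  listen 80;

  location / {
    proxy_pass http://shiny_pool;

    # Shiny relies on WebSockets, so the proxy must support connection upgrades.
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 600s;
  }
}
```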

It is also worth noting that this custom architecture supports SSL connections, authentication, and other enterprise requirements.

Cheat Sheet

I hope this article on the possible solutions for scaling Shiny apps to many concurrent users was helpful to you. If you’d like to chat about a specific use case, you can contact us through https://appsilon.com/shiny

Below, we have compiled the information shared above into a comparison cheat sheet diagram. Again, I hope it helps!

[Cheat sheet: Scaling Shiny]
