Scaling Shiny Apps for Python and R: Sticky Sessions on Heroku


Shiny for R and Python can be deployed in conventional ways, using RStudio Connect, Shiny Server Open Source, and Shinyapps.io. These hosting solutions are designed with scaling options included and are the officially recommended hosting options for Shiny apps.

When it comes to alternative options, the docs tell you that the host needs to support WebSockets and use sticky load balancing. WebSocket support is quite common, but load balancing with session affinity is not a trivial matter, as this quote illustrates:

We had high hopes for Heroku, as they have a documented option for session affinity. However, for reasons we don’t yet understand, the test application consistently fails to pass. We’ll update this page as we find out more. – Shiny for Python docs

In this post and the associated video tutorial, we are going to investigate what is happening on Heroku and whether we can make sticky load balancing work.

This tutorial is a written version of the accompanying video.

Prerequisites

We will use the analythium/shiny-load-balancing GitHub repository:

GitHub – analythium/shiny-load-balancing: Scaling and Load Balancing Shiny for R and Python Apps with Docker

You'll need Git and the Heroku CLI. We will implement scaling, which requires at least the Standard dyno type because the cheaper dyno types cannot scale beyond a single dyno.

⚠️
Starting November 28, 2022, free Heroku Dynos, free Heroku Postgres, and free Heroku Data for Redis will no longer be available – see this FAQ for details.

Test applications

The Shiny for Python docs propose a test to make sure that your deployment has sticky sessions configured. The application sends repeated requests to the server, and the test only succeeds if every request reaches the same server process that the page was loaded from.

We built a Python and an R version of the test application to see how load balancing behaves. The Python version is based on this Shiny for Python test application.

Shiny for Python

This is how the test application looks in Python (see the load-balancing/app.py file, written by Joe Cheng):

from shiny import *
import starlette.responses

app_ui = ui.page_fluid(
    ui.markdown(
        """
        ## Sticky load balancing test - Shiny for Python

        The purpose of this app is to determine if HTTP requests made by the client are
        correctly routed back to the same Python process where the session resides. It
        is only useful for testing deployments that load balance traffic across more
        than one Python process.

        If this test fails, it means that sticky load balancing is not working, and
        certain Shiny functionality (like file upload/download or server-side selectize)
        are likely to randomly fail.
        """
    ),
    ui.tags.div(
        {"class": "card"},
        ui.tags.div(
            {"class": "card-body font-monospace"},
            ui.tags.div("Attempts: ", ui.tags.span("0", id="count")),
            ui.tags.div("Status: ", ui.tags.span(id="status")),
            ui.output_ui("out"),
        ),
    ),
)


def server(input: Inputs, output: Outputs, session: Session):
    @output
    @render.ui
    def out():
        # Register a dynamic route for the client to try to connect to.
        # It does nothing, just the 200 status code is all that the client
        # will care about.
        url = session.dynamic_route(
            "test",
            lambda req: starlette.responses.PlainTextResponse(
                "OK", headers={"Cache-Control": "no-cache"}
            ),
        )

        # Send JS code to the client to repeatedly hit the dynamic route.
        # It will succeed if and only if we reach the correct Python
        # process.
        return ui.tags.script(
            f"""
            const url = "{url}";
            const count_el = document.getElementById("count");
            const status_el = document.getElementById("status");
            let count = 0;
            async function check_url() {{
                count_el.innerHTML = ++count;
                try {{
                    const resp = await fetch(url);
                    if (!resp.ok) {{
                        status_el.innerHTML = "Failure!";
                        return;
                    }} else {{
                        status_el.innerHTML = "In progress";
                    }}
                }} catch(e) {{
                    status_el.innerHTML = "Failure!";
                    return;
                }}

                if (count === 100) {{
                    status_el.innerHTML = "Test complete";
                    return;
                }}

                setTimeout(check_url, 10);
            }}
            check_url();
            """
        )


app = App(app_ui, server)

The UI is nothing but some text and a few placeholders inside a card. The server function does two things. First, it registers a dynamic endpoint that the client will try to connect to repeatedly; the endpoint does nothing except respond with a 200 (OK) status code.

Second, the endpoint's URL is hard-coded into a JavaScript snippet that is sent to the client. This JavaScript function makes a request to the URL 100 times, and the test succeeds only if the client hits the same dynamic route on every attempt.

It is important to note that we set the "Cache-Control" header to "no-cache" because browsers tend to cache responses from the same URL; cached responses would defeat the purpose of this test.

When load balancing is introduced, the dynamic URL will be different for each instance. If the load balancing is “sticky”, the same client always reconnects to the same server process and the test passes. If not, some requests end up on a process that does not know about the URL, the test fails, and we will know.

The test application is also built with Docker so that we can deploy the same app to multiple hosting providers without too much hassle.

See also: Containerizing Shiny for Python and Shinylive Applications – containerizing Py-Shiny apps is the next step towards deployment to a wide array of hosting options, including static hosting.

We use Dockerfile.lb and the app in the load-balancing folder, which contains the app.py and requirements.txt files:

FROM python:3.9

RUN addgroup --system app && adduser --system --ingroup app app
WORKDIR /home/app
RUN chown app:app -R /home/app

COPY load-balancing/requirements.txt .
RUN pip install --no-cache-dir --upgrade -r requirements.txt

COPY load-balancing .

USER app
EXPOSE 8080

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080"]

Build the image, do a test run, and push it to Docker Hub:

# build
docker build -f Dockerfile.lb -t analythium/python-shiny-lb:0.1 .

# run: open http://127.0.0.1:8080
docker run -p 8080:8080 analythium/python-shiny-lb:0.1

# push
docker push analythium/python-shiny-lb:0.1

You can either build the image yourself or pull it from Docker Hub with docker pull analythium/python-shiny-lb:0.1.

When you run the container and visit http://127.0.0.1:8080 in your browser, you'll see the counter increase while the backend logs the requests.

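If you prefer the command line, you can also confirm that the container answers by requesting the index page with curl (a quick sanity check, not part of the sticky-session test):

# request the index page and print only the response headers
curl -sS -D - -o /dev/null http://127.0.0.1:8080/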

Shinylive

Shinylive is an experimental feature (Shiny + WebAssembly) for Shiny for Python that allows applications to run entirely in a web browser, without the need for a separate server running Python. You can build the load-balancing test application as a fully static app and containerize it based on the Dockerfile.lb-live file.

The docs folder in the repository contains the exported Shinylive site with the static HTML. The app is also deployed to GitHub Pages. When users load the app from a static site, there is no need to do any kind of load balancing, because after your browser downloads the HTML and other static assets, everything happens on the client side.

Shinylive test app.
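
If you want to preview the exported Shinylive site locally, any static file server will do; as a minimal sketch, you could serve the docs folder with Python's built-in server from the repository root:

# serve the static export: open http://127.0.0.1:8000
python3 -m http.server 8000 --directory docs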

Shiny for R

The R version is a port of the Python app (see the load-balancing-r/app.R file):

library(shiny)
library(bslib)

ui <- fixedPage(
    theme = bs_theme(version = 5), # force BS v5
    markdown("
## Sticky load balancing test in R-Shiny

The purpose of this app is to determine if HTTP requests made by the client are
correctly routed back to the same R process where the session resides. It
is only useful for testing deployments that load balance traffic across more
than one R process.

If this test fails, it means that sticky load balancing is not working, and
certain Shiny functionality (like file upload/download or server-side selectize)
are likely to randomly fail.
    "),
    tags$div(
        class = "card",
        tags$div(
            class = "card-body font-monospace",
            tags$div("Attempts: ", tags$span(id="count", "0")),
            tags$div("Status: ", tags$span(id="status")),
            uiOutput("out")
        )
    )
)

server <- function(input, output, session) {

    url <- session$registerDataObj(
        name   = "test",
        data   = list(),
        filter = function(data, req) {
            message("INFO: ",
                req$REMOTE_ADDR, ":",
                req$REMOTE_PORT,
                " - ",
                req$REQUEST_METHOD,
                " /session/",
                session$token,
                req$PATH_INFO,
                req$QUERY_STRING)
            shiny:::httpResponse(
                status = 200L,
                content_type = "text/html; charset=UTF-8",
                content = "OK",
                headers = list("Cache-Control" = "no-cache"))
        }
    )
    output$out <- renderUI({
        message("Incoming connection")
        tags$script(
            sprintf('
    const url = "%s";
    const count_el = document.getElementById("count");
    const status_el = document.getElementById("status");
    let count = 0;
    async function check_url() {
        count_el.innerHTML = ++count;
        try {
            const resp = await fetch(url);
            if (!resp.ok) {
                status_el.innerHTML = "Failure!";
                return;
            } else {
                status_el.innerHTML = "In progress";
            }
        } catch(e) {
            status_el.innerHTML = "Failure!";
            return;
        }

        if (count === 100) {
            status_el.innerHTML = "Test complete";
            return;
        }

        setTimeout(check_url, 10);
    }
    check_url();
            ', url)
        )
    })

}

app <- shinyApp(ui, server)

Most of the R version is the same as, or very similar to, the Python version. The only part that might need explanation is the dynamic URL embedded in the output of the render function. This URL is set up by the session$registerDataObj() function, whose filter argument takes a function as its value.

This filter function logs some information to the console so that we can see messages similar to those in the Python program. It also responds to any request with a 200 (OK) status code, using the unexported shiny:::httpResponse() function from Shiny.

We built a Docker image based on the R Shiny app as well.

See also: Dockerized Shiny Apps with Dependencies – learn about dependency management when working with R and Docker.

Use the Dockerfile.lb-r and the app in the load-balancing-r folder containing the app.R file:

FROM eddelbuettel/r2u:22.04

RUN install.r shiny rmarkdown bslib

RUN addgroup --system app && adduser --system --ingroup app app
WORKDIR /home/app
COPY load-balancing-r .
RUN chown app:app -R /home/app
USER app

EXPOSE 8080

CMD ["R", "-e", "shiny::runApp('/home/app', port = 8080, host = '0.0.0.0')"]

Build the image, do a test run, and push it to Docker Hub:

# build
docker build -f Dockerfile.lb-r -t analythium/r-shiny-lb:0.1 .

# run: open http://127.0.0.1:8080
docker run -p 8080:8080 analythium/r-shiny-lb:0.1

# push
docker push analythium/r-shiny-lb:0.1

You can either build the image yourself or pull it from Docker Hub with docker pull analythium/r-shiny-lb:0.1.

Visit http://127.0.0.1:8080 in your browser.


Load balancing explained

With a single instance present, all users are sent to the same server instance. No load balancing is required.

When there are multiple instances of the same app running, load balancing is needed to distribute the workload among the server processes.

The simplest load-balancing option is called round robin. Requests are sent to the instance that is next in line. When there are no more instances it starts over.

Problems with this type of load balancing arise when the connection is interrupted, e.g. due to poor cell coverage, and the client has to reconnect. If preserving the user's state matters for the app to work, e.g. because the user has uploaded files, round robin won't be ideal: the reconnecting client may be routed to a different process that knows nothing about the session.

This is when load balancing with session affinity is needed. This simply means that the load balancer keeps track of the users via some mechanism and makes sure that the same user reconnects to the same server process. The sticky mechanism can be based on the user's IP address or a cookie.

Sticky load balancing among multiple server processes.

Deploying Shiny to Heroku

Install Git and the Heroku CLI and log in using heroku login. You'll be prompted to log in via your Heroku account. Our deployment will follow this guide to set up and deploy with Git.
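
If the CLI is new to you, a quick sanity check before proceeding might look like this:

# verify the CLI is installed and which account is logged in
heroku --version
heroku auth:whoami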

Single instance

Once you have logged in through the Heroku CLI and your browser, you can create the app named python-shiny:

# create the app
heroku create -a python-shiny

You will notice that Heroku is added as a new Git remote; list the remotes with git remote -v:

heroku create -a python-shiny
# Creating ⬢ python-shiny... done
# https://python-shiny.herokuapp.com/ | https://git.heroku.com/python-shiny.git

git remote -v
# heroku  https://git.heroku.com/python-shiny.git (fetch)
# heroku  https://git.heroku.com/python-shiny.git (push)
# origin  ssh://[email protected]/analythium/shiny-load-balancing.git (fetch)
# origin  ssh://[email protected]/analythium/shiny-load-balancing.git (push)

We use the heroku.yml file as our manifest:

build:
  docker:
    web: Dockerfile.lb
run:
  web: uvicorn app:app --host 0.0.0.0 --port $PORT

Set the stack of your app to container:

heroku stack:set container
# Setting stack to container... done

Commit your changes, then run git push heroku main. This will build the image and deploy your app on Heroku.
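
For completeness, the Git side of that step might look something like this (the commit message is only an example):

# commit any local changes, e.g. edits to heroku.yml
git add heroku.yml
git commit -m "Configure Heroku container deployment"

# push to the heroku remote to build and release the app
git push heroku main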

Get the app URL from heroku info, then open the app at that URL or from the link in your dashboard:

heroku info
# === python-shiny
# Auto Cert Mgmt: false
# Dynos:          web: 1
# Git URL:        https://git.heroku.com/python-shiny.git
# Owner:          [email protected]
# Region:         us
# Repo Size:      0 B
# Slug Size:      0 B
# Stack:          container
# Web URL:        https://python-shiny.herokuapp.com/

Scaling on Heroku

Enable session affinity:

heroku features:enable http-session-affinity
# Enabling http-session-affinity for ⬢ python-shiny... done
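
You can check that the Heroku router now hands out an affinity cookie by inspecting the response headers; the cookie is expected to be named heroku-session-affinity, but verify against your own output:

# look for a Set-Cookie header with the session affinity cookie
curl -s -D - -o /dev/null https://python-shiny.herokuapp.com/ | grep -i set-cookie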

Change dyno type to allow scaling to >1:

heroku ps:type web=standard-1x
# Scaling dynos on ⬢ python-shiny... done
# === Dyno Types
# type  size         qty  cost/mo
# ────  ───────────  ───  ───────
# web   Standard-1X  1    25
# === Dyno Totals
# type         total
# ───────────  ─────
# Standard-1X  1

Scale the number of web dynos to 2 or more:

heroku ps:scale web=2
# Scaling dynos... done, now running web at 2:Standard-1X
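
You can confirm that both instances are up with heroku ps:

# list the running dynos; you should see web.1 and web.2
heroku ps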

Visit the app and run the test. It should say Status: Test complete.

Notice that the logs now list requests handled by the Heroku router and by the web.2 instance of the test app.
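
To watch this while you run the test, stream the logs from the CLI:

# stream router and application logs in real time
heroku logs --tail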

Now let's disable session affinity and see if the test fails:

heroku features:disable http-session-affinity
# Disabling http-session-affinity for ⬢ python-shiny... done

Visit the app URL again, refresh the page, and you'll see Status: Failure! with 404 Not Found responses in the logs.

Cleanup

Delete the app from the dashboard (under Settings) or use heroku apps:destroy --confirm=python-shiny – this command will remove the git remote as well, no questions asked. Don't forget to do this if you want to avoid charges for your scaled Shiny test app.

heroku apps:destroy --confirm=python-shiny
# Destroying ⬢ python-shiny (including all add-ons)... done

git remote -v
# origin  ssh://[email protected]/analythium/shiny-load-balancing.git (fetch)
# origin  ssh://[email protected]/analythium/shiny-load-balancing.git (push)

Testing the R version of the app

For the R version, edit the heroku.yml file to use Dockerfile.lb-r and the R run command:

build:
  docker:
    web: Dockerfile.lb-r
run:
  web: R -e "shiny::runApp('/home/app', port = as.numeric(Sys.getenv('PORT')), host = '0.0.0.0')"

You can now repeat the steps above; the only difference is that you might name the app r-shiny, as sketched below.
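
A condensed sketch of the same sequence, assuming the app name r-shiny is available (pick another name if it is taken):

# create the app and set the container stack
heroku create -a r-shiny
heroku stack:set container

# commit the heroku.yml change and deploy
git add heroku.yml
git commit -m "Switch to the R test app"
git push heroku main

# enable sticky sessions, then scale to two Standard-1X dynos
heroku features:enable http-session-affinity
heroku ps:type web=standard-1x
heroku ps:scale web=2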

Conclusions

In this post, we deployed a containerized Shiny test application to Heroku and scaled the app to two instances. The test succeeded when session affinity was enabled, and it failed when we disabled the session affinity feature. We can conclude that scaling Shiny apps on Heroku is possible.

Subscribe to our newsletter to get notified about upcoming posts and videos about other possible ways of scaling Shiny apps via sticky load balancing.

