Approximating RStudio Server Pro (And Shiny Server Pro, and JupyterHub) for Free

[This article was first published on Blog - R. King Data Consulting, and kindly contributed to R-bloggers.]

RStudio Server in action!


Since authentication, scaling, and serving application content are all problems that can be solved with open source software, why pay $10k/year for RStudio Server Pro or Shiny Server Pro when you can do it for free?

Well, for starters, RStudio Server Pro is seamless and really easy to use. A free and open source alternative built on top of ShinyProxy, Docker, and Nginx has a few more moving pieces and is much less elegant.

I’ll detail a way to get a multi-user RStudio Server system complete with logins, mounted home directories, multiple R versions, and so on. This system can also serve and scale Shiny (and Dash) apps like Shiny Server Pro.


So how does it work?

ShinyProxy is a free and open source alternative to Shiny Server Pro that uses Docker containers to scale and serve Shiny apps. Each Shiny app gets its own Docker container. In addition, ShinyProxy comes with a whole host of authentication options, from a simple hardcoded list of users and passwords (currently implemented in my GitHub repo) to LDAP, Kerberos, Auth0, and more, all of which will work beautifully with this RStudio Server workaround.

We can exploit the fact that ShinyProxy uses Docker to run whatever app we want: not just Shiny apps, but apps written in Python, or even RStudio Server Open.

Docker, and by extension ShinyProxy, also offers ways to mount volumes, so we can mount a user home directory in the Docker container, ensuring the user only has access to the file systems we want them to, all while maintaining persistent storage across sessions.

A few adjustments to the ShinyProxy configuration and to the Dockerfile for RStudio are needed to get it all to work. These adjustments have all been made in the repo here, so it’s ready to go.

Mounting Home Directories

One of the difficulties in getting home directories to be useful is that Docker wants the container user to have the same Linux UID as whoever owns the file system that gets mounted. Using the rstudio Docker image from Rocker with no edits would mean the owner of the mounted file system must have a UID of 1000. Also, that user will be named rstudio in the container, which might be a little confusing.

The workaround assumes that the home directories mounted in RStudio or Jupyter will only ever be used for this purpose; they won’t also function as Linux user home directories. This workaround simply borrows what the fine Jupyter folks do: run hooks before the container command that let you mount a volume and then switch the user inside the Docker container. There’s a nice write-up on how it works for Jupyter Notebook here.

To get this set up, you’ll need to create user home directories at /home/users/<shinyproxyusername> and create a new user and group with a specific UID and GID. Examples further down.

Templating Docker Commands

The ShinyProxy application.yml file, where we specify the containers used to serve apps, lets us set environment variables, and we can use Spring expressions to parameterize them. Example below.

specs:

  - id: rstudio
    display-name: RStudio Server
    description: An Instance of RStudio Server
    container-network: sp-net
    container-cmd: ["/usr/local/bin/start.sh", "/usr/lib/rstudio-server/bin/rserver", "--server-daemonize=0", "--auth-none=1","--auth-minimum-user-id=0", "--auth-validate-users=0", "--www-frame-origin=same"]
    container-volumes: [ "/home/users/#{proxy.UserId}/:/home/#{proxy.UserId}" ]
    container-image: rstudio
    container-env:
      DISABLE_AUTH: true
      USER: root
      NB_USER: "#{proxy.UserId}"
      NB_UID: 1010
      NB_GID: 1020
      CHOWN_HOME: 'yes'
      CHOWN_HOME_OPTS: -R
    port: 8787

Here we’ve defined the spec for an RStudio Server instance to be served to a logged-in user. Notice that the container-cmd starts with the start.sh command. This is a shell script in the container that switches from the root user to the user defined by NB_USER after the container has spun up and the volumes are mounted, but before RStudio starts. The --auth-none=1 flag is there because we need to turn off RStudio’s authentication, since ShinyProxy handles that for us.
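To make the idea of "switch user, then start" more concrete, here is a simplified sketch. It is not the start.sh from the repo or from the Jupyter images; the function name plan_user_switch and the exact groupadd/useradd/su invocations are my own assumptions for illustration, and the function only prints the commands such a script might run as root rather than running them.

```shell
#!/bin/sh
# Hypothetical, simplified sketch of a "switch user, then start" entrypoint.
# It only PRINTS the commands an entrypoint of this kind might run as root;
# it is NOT the actual start.sh used by the repo or by the Jupyter images.
# Usage: plan_user_switch COMMAND [ARGS...]
plan_user_switch() (
  # The body runs in a subshell, so these variables do not leak to the caller.
  user=${NB_USER:?NB_USER must be set}
  uid=${NB_UID:?NB_UID must be set}
  gid=${NB_GID:?NB_GID must be set}

  # Create a group and user whose numeric IDs match the owner of the mounted directory.
  echo "groupadd -f -g $gid $user"
  echo "useradd -M -d /home/$user -u $uid -g $gid $user"

  # Optionally fix ownership of the mounted home directory.
  if [ "${CHOWN_HOME:-no}" = "yes" ]; then
    echo "chown ${CHOWN_HOME_OPTS:+$CHOWN_HOME_OPTS }$uid:$gid /home/$user"
  fi

  # Finally drop privileges and start the real server process.
  echo "exec su $user -c '$*'"
)

# Example with the values from the spec above, for the user "roz".
NB_USER=roz NB_UID=1010 NB_GID=1020 CHOWN_HOME=yes CHOWN_HOME_OPTS=-R \
  plan_user_switch /usr/lib/rstudio-server/bin/rserver --server-daemonize=0
```

Printing the plan instead of executing it keeps the sketch safe to run anywhere; the real script does the equivalent work inside the container, as root, before handing over to rserver.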

The container-volumes entry (like other places in the application.yml file) uses templating called “Spring expressions”, available because ShinyProxy is built on the Spring Boot framework. These expressions let us replace the #{proxy.UserId} values with the username of the logged-in user. As a result, the host directory we created earlier, /home/users/[name of user], gets mounted in the container as /home/[name of user] and is available to the user for persistent storage.
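To see what the volume string resolves to for a given user, the substitution can be imitated in plain shell. This is only an illustration: ShinyProxy evaluates the expression itself, and render_for_user is a hypothetical helper that does not ship with it.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: ShinyProxy performs this
# substitution itself when it launches a container.
# Usage: render_for_user TEMPLATE USERNAME
render_for_user() {
  # '#', '{' and '}' are literal in a basic regular expression; the dot is escaped.
  # Assumes the username contains no '|', '&' or backslash characters.
  printf '%s\n' "$1" | sed "s|#{proxy\.UserId}|$2|g"
}

# The container-volumes template from the spec, rendered for the user "roz".
render_for_user '/home/users/#{proxy.UserId}/:/home/#{proxy.UserId}' roz
# prints: /home/users/roz/:/home/roz
```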

Passing that same expression later in NB_USER causes the user in the RStudio container to have the correct username in addition to the correct permissions on the file system.

So how do I use it?

Easy.

1. Clone the repo

$ git clone https://github.com/rkingdc/datascience-portal

2. Build Docker Images

$ docker build -t example_shiny ./shiny
$ docker build -t example_dash ./dash
$ docker build -t rstudio ./rstudio
$ docker build -t jupyter ./jupyter

3. Add the docker_worker user

Note: if this is a multi-user system, make sure this user and group don’t conflict with existing users and groups.

$ sudo groupadd -g 1020 docker_worker
$ sudo useradd -s /bin/false -u 1010 -g 1020 docker_worker
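
Before running the two commands above, it can be worth checking that the numeric IDs are free. The sketch below is one way to do that; id_taken is a hypothetical helper name, and it only inspects local passwd/group-style files, so it will not see accounts that come from LDAP or another directory service.

```shell
#!/bin/sh
# Hypothetical helper: succeed if numeric ID appears in field 3 of a
# passwd- or group-style file (the UID in /etc/passwd, the GID in /etc/group).
# Usage: id_taken FILE ID
id_taken() {
  awk -F: -v id="$2" '$3 == id { found = 1 } END { exit !found }' "$1"
}

# Check the UID and GID used in this walkthrough against the local files.
if id_taken /etc/passwd 1010; then
  echo "UID 1010 is already in use"
fi
if id_taken /etc/group 1020; then
  echo "GID 1020 is already in use"
fi
```

If either message appears, pick different IDs and use the same values for NB_UID and NB_GID in application.yml.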

4. Give your users home directories

$ sudo mkdir /home/users
$ sudo mkdir /home/users/roz #this is me
$ sudo mkdir /home/users/mew #this is my cat
$ sudo chown -R docker_worker:docker_worker /home/users
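
Since the mount only works when the numeric owner of the host directory matches the IDs passed to the container (NB_UID 1010 and NB_GID 1020 in application.yml), a quick check of the result can save some debugging. The helpers below are a sketch; owner_ids and owned_by are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helpers for checking the numeric owner of a directory.

# Print "UID:GID" for a path, using the numeric long listing from ls.
owner_ids() {
  ls -ldn -- "$1" | awk '{ print $3 ":" $4 }'
}

# Succeed only when PATH is owned by the expected numeric UID and GID.
# Usage: owned_by PATH UID GID
owned_by() {
  [ "$(owner_ids "$1")" = "$2:$3" ]
}
```

For example, `owned_by /home/users/roz 1010 1020 && echo "roz is ready"` prints the message only if the chown in step 4 took effect.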

5. Launch it!

$ docker-compose up

If you point your web browser to localhost:8080, you’ll see a login page. Log in with one of the sets of credentials in application.yml and you’ll be able to use RStudio with the home directory you created mounted for persistent storage.

Anything else?

The GitHub repo has examples for hosting RStudio Server instances, Shiny apps, Dash apps, and Jupyter Lab. The best way to see how it all works is by cloning the repo and running it all.

But don’t forget that in a production system you’ll need to adjust the nginx container to use SSL certificates so web traffic to the machine is encrypted. And if you use ShinyProxy’s Kubernetes backend, you’ll likely want to encrypt traffic between containers as well, but that is left as an exercise for the reader…
