Shiny Server on AWS
The Shiny web framework for R is great, and it is one of my most frequently used packages. I’ve used it to develop exploratory data analysis and visualization tools for my coworkers. When I first started developing these apps I would send instructions to my coworkers explaining how to install R, RStudio, and the packages they needed, and how to run the app from GitHub. It was easy enough, albeit very tedious. Most of the first users were beta-testers checking the functionality and providing feature requests. Most importantly, all of these users were in the same office as me, which allowed them to grab the data off of our in-house server. I have several colleagues who work in remote offices, and some of these offices aren’t networked to these servers. I wouldn’t be surprised if this is the case for many small state agencies in rural states. Another potential issue is that some of the data is sensitive and must be protected. To solve these issues I’ve turned to Amazon Web Services (AWS) to deploy Shiny apps and host our data in the cloud.
A few caveats. I am not a SysAdmin or DevOps engineer, and I’m not trained in any other computer science related field. I work for a small state agency (Department of Wildlife) and we don’t have people with those skills (unfortunately). I’ve learned all of this with the help of several online resources (I’ll link to those). Depending on the sensitivity of your data and methods there may be easier options. AWS isn’t your only option; Digital Ocean, Heroku, Google, and Microsoft all have similar offerings. Your mileage may vary. As always, any input is greatly appreciated.
AWS
AWS is a huge offering of at least 55 services for managing, storing, and running things in the cloud. I started using AWS at the recommendation of my supervisor, who hosts a few ESRI related products on AWS. Getting started is complicated; however, AWS is extremely well documented and is as intuitive as possible.
This post will cover how to set up an EC2 instance from scratch and install R, Shiny Server, and other helpful libraries and packages. I relied on Dean Attali’s post about deploying RStudio and Shiny Server on Digital Ocean. AWS has a 1 year free tier for many of its products. Sign up for an account to get started. Warning: I am using a Mac, so any command line stuff might be different for Windows users. I’ll link to resources when I can find them.
Regions
You can launch many AWS services in different regions. Hosting apps or databases in multiple regions is a feature of AWS that helps with latency, fault tolerance and scaling. AWS defaults to your nearest region. I recommend you stick with the default region. If you can’t find an instance of a service you’ve launched, check that you’re in the correct region. Some regions are more expensive than others.
Elastic Compute Cloud (EC2)
EC2 is the bread and butter of AWS. It is a virtual computing environment that allows you to run applications in the cloud. You can launch instances of these computing environments with different operating systems, programs and packages quickly and easily.
Configure
To get started, find the EC2 icon and click it to go to the EC2 Dashboard. I followed this tutorial from Amazon to configure my first several instances. I highly recommend you follow along with (and bookmark) this page, as it explains the details of each option.
The configuration settings I’ve used are as follows.
- AMI: Ubuntu Server 14.04 – I use an Ubuntu Server because I’m familiar with Ubuntu and how to use the command line interface. Ubuntu plays nicely with many of the tutorials and resources I’ve found on the web for setting up Shiny Server.
- Instance Type: r3.large – R works in memory, so I need the most RAM for the buck; r3 instances are all memory optimized and have the cheapest RAM.
- Configure: all default – Skip this page if you want.
- Storage: 16 GB – Default is 8 GB; I chose more because I can (no good reason really).
- Tag: ShinyApplication – Tags can be helpful for organizing your AWS services.
- Security Group: http-ssh-anywhere – This part of the configuration tells your instance who is allowed to talk to and access it. These are the rules I use so that I can connect remotely (SSH) and serve web content over the HTTP ports.
Type | Protocol | Port Range | Source | Info |
---|---|---|---|---|
SSH | TCP | 22 | Anywhere: 0.0.0.0/0 | remote login |
HTTP | TCP | 80 | Anywhere: 0.0.0.0/0 | Use nginx to password protect and reroute |
HTTP | TCP | 3838 | Anywhere: 0.0.0.0/0 | Default shiny port |
HTTP | TCP | 8787 | Anywhere: 0.0.0.0/0 | Default RStudio port |
Before launching you’ll be prompted to download a key pair (the private key file). Be sure to save this in a secure and easily accessible location. Review your choices and launch (at the bottom of the page). After a few minutes of setup and initialization your instance is ready to use. Be sure to make a note of the Public IP and Public DNS. You’ll use these to connect to the instance. I’ll use the example IP `11.22.33.44` (whenever you see this, replace it with your public IP address) and `publicDNS` (replace it with your public DNS).
Connect
To connect to your instance, launch your terminal and use the following template. The first command makes your private key file not publicly viewable (it only needs to be run the first time you use your SSH key). The second connects to your instance. Replace the name and path with your key’s name and location, and replace publicDNS with your public DNS. You may be prompted to confirm that you want to proceed.
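The two connection commands can be sketched as a small helper function. The function name `connect_ec2` and the key file name in the usage line are placeholders for illustration; `ubuntu` is the default user on Ubuntu AMIs.

```shell
# Hypothetical helper: lock down the key file, then open an SSH session.
# $1 = path to the .pem key you downloaded, $2 = your instance's public DNS.
connect_ec2() {
  chmod 400 "$1"            # only needed the first time you use the key
  ssh -i "$1" "ubuntu@$2"   # "ubuntu" is the default user on Ubuntu AMIs
}

# Example (replace both values with your own):
# connect_ec2 ~/.ssh/shiny-key.pem publicDNS
```

Defining the function does nothing by itself; the connection only happens when you call it with your own key path and DNS name.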
You should now be connected to your server. The commands are similar to the terminal commands you use on a Mac. `pwd` returns your current directory, `ls -a` lists all the files in the current directory, and `cd` will change your directory to the specified directory. Here is a good bash cheat sheet.
Disconnect
Type `exit` into your terminal and you’ll be logged out of your EC2 instance.
Install R, then some other things
First, we need to update the Linux package lists. Then we can start installing packages. Reconnect to your instance (if you’ve logged out).
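As a rough sketch of this step, run on the instance: refresh the package lists, then install a first package. nginx is used here because it is needed later for password protection; the function name is made up for this example.

```shell
# Hypothetical helper, meant to be run on the EC2 instance.
update_and_install_nginx() {
  sudo apt-get update            # refresh the package lists
  sudo apt-get -y install nginx  # -y answers "yes" to the install prompt
}

# Run it once after connecting:
# update_and_install_nginx
```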
As a quick example, I’ve installed the nginx web server on this instance. Type your public IP into the URL bar of a browser and you should get an nginx welcome screen. It is a nice bit of validation that things are working as they should be. We will be using nginx to password protect our shiny application.
Installing R
Installing the latest version of R isn’t as easy as `apt-get R`. Sources and keys need to be added to the `sources.list` file, then the latest version of R can be installed. The first command sets the sources to the RStudio mirror, then the next commands grab keys to authenticate the installation.
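A sketch of those commands, assuming Ubuntu 14.04 (“trusty”) and the CRAN signing key ID that the CRAN Ubuntu instructions listed at the time of writing (`E084DAB9`). Check the current CRAN instructions before relying on either value; the function name is a placeholder.

```shell
# Hypothetical helper, meant to be run on the EC2 instance.
install_latest_r() {
  # 1. Add the CRAN repository (RStudio mirror) for Ubuntu 14.04 "trusty".
  sudo sh -c 'echo "deb http://cran.rstudio.com/bin/linux/ubuntu trusty/" >> /etc/apt/sources.list'
  # 2. Fetch the repository's signing key and register it with apt.
  gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
  gpg -a --export E084DAB9 | sudo apt-key add -
  # 3. Refresh the package lists and install R.
  sudo apt-get update
  sudo apt-get -y install r-base
}

# install_latest_r
```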
Warning: Be certain to type everything correctly (copy and paste) for the first command. If you receive errors while installing R, open your `/etc/apt/sources.list` with `sudo nano` to make any changes, or delete the line you added and then retry.
Install other things & Shiny Server, part 1
There are a few external libraries that are required to use Shiny Server. Let’s install those and the R package `shiny`, then we will move on to installing Shiny Server. Note: `shiny` installs 8 dependencies, so it may take a while.
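A sketch of this step, following the general pattern in the Shiny Server install instructions. The download URL below is a placeholder, not a real link: copy the current `.deb` link from the Shiny Server download page. The function and variable names are made up for this example.

```shell
# Placeholder only: replace with the current .deb link from the download page.
SHINY_SERVER_DEB_URL="https://example.com/shiny-server-VERSION-amd64.deb"

# Hypothetical helper, meant to be run on the EC2 instance.
install_shiny_server() {
  # gdebi installs a local .deb file along with its dependencies.
  sudo apt-get -y install gdebi-core
  # Install the shiny R package system-wide (as root) so every user can load it.
  sudo su - -c "R -e \"install.packages('shiny', repos='http://cran.rstudio.com/')\""
  # Download and install Shiny Server itself.
  wget -O shiny-server.deb "$SHINY_SERVER_DEB_URL"
  sudo gdebi shiny-server.deb
}

# install_shiny_server
```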
Shiny Server is served on port `3838`. Once these commands have completed, go to `11.22.33.44:3838` and you should see a default `index.html` page with a `shiny` app and an `rmarkdown` document. The `rmarkdown` document is likely showing an error. This is because `rmarkdown` isn’t installed on the server. We will fix that shortly (or not, if you won’t need `rmarkdown`). For more information on administering Shiny Server check the docs.
Read the text on this page to learn where Shiny Server is installed on the server, and where important files are located. For instance, the `index.html` is located at `/srv/shiny-server/index.html`. This can be modified or replaced with a different file. The sample apps are located at `/srv/shiny-server/sample-apps`. You can upload your own apps here (or `git clone`/`git pull`). Any app in this directory will be added to the page `11.22.33.44:3838/sample-apps/`.
BONUS: RStudio Server (not required, but useful)
I do all my R work in RStudio Server. The GUI is intuitive and easy to use. If you aren’t already using it, I highly recommend it. Working in R on the server is super easy with RStudio installed. Bonus: you can use it if you’re at an offsite meeting and don’t have your computer, or R isn’t installed on the machine you’re using. It’ll have all the packages you need (if not, installing them is as easy as `install.packages`, with some caveats). The system requirements are already in place, so this should be easy. Again, check the admin guide for more information. Warning: RStudio defaults to port `8787`, so be sure to allow traffic to this port in your security group.
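A sketch of the RStudio Server install, assuming `gdebi-core` is already installed from the Shiny Server step. As before, the URL is a placeholder that you should replace with the current `.deb` link from the RStudio Server download page, and the names are made up for this example.

```shell
# Placeholder only: replace with the current .deb link from the download page.
RSTUDIO_SERVER_DEB_URL="https://example.com/rstudio-server-VERSION-amd64.deb"

# Hypothetical helper, meant to be run on the EC2 instance.
install_rstudio_server() {
  wget -O rstudio-server.deb "$RSTUDIO_SERVER_DEB_URL"
  sudo gdebi rstudio-server.deb   # requires gdebi-core
}

# install_rstudio_server
# Then browse to 11.22.33.44:8787 and log in as a user that exists on the server.
```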
Other things, part 2
It is unlikely that a base R installation will cut it for the analyses and apps that will be on the server. Many R packages require external libraries. Here is a list (and the installation) of many of those libraries on the server. To find out whether any packages you need on the server require external libraries, check the package’s CRAN page. There is a section called SystemRequirements; if any are listed, find an appropriate installation for Ubuntu. Warning: it is possible that some of these libraries may fail to install. This is likely due to a bug in the source code. Check Debian/Ubuntu forums for help.
`gdal`, `geos`, and `proj` are all used for geographic analysis. They are required for many functions in the `rgdal`, `rgeos`, `sp`, and `maptools` packages. `v8` is a JavaScript library that is required for some of the `htmltools` and `htmlwidgets` that I use in some applications, or for converting `sp` objects to `geojson` objects.
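A sketch of installing the four libraries named above. The package names are the Ubuntu development packages I would expect for gdal, geos, proj and v8 on 14.04; treat them as an assumption and adjust them for your release. The function name is a placeholder.

```shell
# Hypothetical helper, meant to be run on the EC2 instance.
install_geo_libs() {
  # Development headers for gdal, geos, proj (geographic) and v8 (JavaScript).
  sudo apt-get -y install libgdal-dev libgeos-dev libproj-dev libv8-dev
}

# install_geo_libs
```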
Installing R packages is a little different on the server than on a local machine. For packages to be available to all users and roles, use `sudo su - -c "R ..."` when installing them. The command below installs `rmarkdown` and several useful packages. The example shiny app on the server should work now since `rmarkdown` is installed. Warning: if you are using an instance with little memory, some packages may exhaust the memory and won’t be able to compile.
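A sketch of a system-wide package install using the `sudo su - -c "R ..."` pattern. The package list is illustrative: `rmarkdown` plus the spatial packages mentioned earlier in this post, not necessarily the exact set installed on my server. The function name is a placeholder.

```shell
# Hypothetical helper, meant to be run on the EC2 instance.
install_r_packages() {
  # Running R as root installs into the site library shared by all users.
  sudo su - -c "R -e \"install.packages(c('rmarkdown', 'sp', 'rgdal', 'rgeos', 'maptools'), repos='http://cran.rstudio.com/')\""
}

# install_r_packages
```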
Password protecting your apps
The open source version of Shiny Server doesn’t have password protection as a feature; Shiny Server Pro does. If you have the resources or your application is business critical, I highly recommend going with Shiny Server Pro. I’ve used this post as a template for password protecting my application. I’ll be using nginx. You can use Apache to password protect the server as well.
There are some server configuration files that need to be changed for the password protection to take effect. First change the nginx config file located at `/etc/nginx/sites-available/default`, then change the shiny-server config file at `/etc/shiny-server/shiny-server.conf`. Use `nano` to open and change the config files. Add the following lines. Comment out any other server config settings in each file (use the hash `#`).
`nano` takes some getting used to. Be sure to use `sudo` for permission to make edits to these files. The `shiny-server.conf` file needs to have `127.0.0.1` added to its `listen` line, that’s it.
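A sketch of both changes. The nginx server block is written to a scratch file so you can review it before pasting it into `/etc/nginx/sites-available/default`. The directives are standard nginx proxy, websocket and basic-auth settings, but the exact contents are an assumption rather than a copy of my config. The scratch file name and the function name are placeholders.

```shell
# Write an example nginx server block to a scratch file for review.
cat > nginx-shiny-default.conf <<'EOF'
server {
    listen 80;

    location / {
        # Forward requests to Shiny Server, which only listens locally.
        proxy_pass http://127.0.0.1:3838/;
        proxy_redirect http://127.0.0.1:3838/ $scheme://$host/;

        # Pass websocket upgrade headers through the proxy.
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";

        # Ask for a username and password stored in the htpasswd file.
        auth_basic "Username and Password are required";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
}
EOF

# Hypothetical helper: restrict Shiny Server to local connections by changing
# "listen 3838;" to "listen 3838 127.0.0.1;". $1 = path to shiny-server.conf.
bind_shiny_to_localhost() {
  sudo sed -i 's/listen 3838;/listen 3838 127.0.0.1;/' "$1"
}

# bind_shiny_to_localhost /etc/shiny-server/shiny-server.conf
```

The `sed` call assumes the default `listen 3838;` line is still in place; if you have already edited the file, make the change by hand in `nano` instead.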
Now create users and passwords with `htpasswd`. You’ll first create a user, then be prompted to enter and confirm a password. Restart nginx and shiny-server once users are added.
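A sketch of creating the first user and restarting both services. It assumes the password file lives at `/etc/nginx/.htpasswd` (it must match the `auth_basic_user_file` setting in your nginx config) and that the instance uses upstart, as Ubuntu 14.04 does. The function name and the user name in the usage line are placeholders.

```shell
# Hypothetical helper, meant to be run on the EC2 instance.
# $1 = the username to create.
add_first_shiny_user() {
  sudo apt-get -y install apache2-utils        # provides the htpasswd command
  # -c creates the password file; leave -c off when adding more users later,
  # otherwise the existing file is overwritten.
  sudo htpasswd -c /etc/nginx/.htpasswd "$1"
  sudo service nginx restart
  # Upstart syntax (Ubuntu 14.04); on systemd releases use
  # "sudo systemctl restart shiny-server" instead.
  sudo restart shiny-server
}

# add_first_shiny_user shinyuser
```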
Now your shiny app is redirected from port `3838` to port `80`, the default HTTP port. Navigate to `11.22.33.44` to get to the login screen. You can’t sneak around authentication by going to `11.22.33.44:3838`. To completely secure the app, it is highly recommended that it be secured with SSL. This takes a little more work, including signing an SSL cert, and it is beyond the scope of this post right now. Refer to this post about securing with SSL. A future post will cover SSL in more detail.
Use GitHub to deploy apps
GitHub can be used to add shiny apps to the `/srv/shiny-server` directory. This is a great alternative to secure copy (`scp`). Dean’s post recommends creating a new git repo on the server, then pushing all your shiny apps to that repo on GitHub from your computer and pulling the repo to the server. I’ve set mine up a little differently. I clone then pull individual shiny apps to the `/srv/shiny-server/sample-apps` directory on my instance. This may be more work, but each shiny app is its own repo on GitHub, which is nice for managing issues and development. Check out Dean’s shiny server repo on GitHub; it is a very useful example.
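A sketch of the clone-then-pull workflow for a single app, assuming git is installed on the instance (`sudo apt-get -y install git`). The function name and the repository URL in the usage line are placeholders.

```shell
# Hypothetical helper, meant to be run on the EC2 instance.
# $1 = the app's GitHub repository URL.
deploy_app() {
  app_name="$(basename "$1" .git)"                    # e.g. "your-app"
  app_dir="/srv/shiny-server/sample-apps/$app_name"
  if [ -d "$app_dir/.git" ]; then
    sudo git -C "$app_dir" pull      # already cloned: fetch the latest changes
  else
    sudo git clone "$1" "$app_dir"   # first deployment: clone the repo
  fi
}

# deploy_app https://github.com/your-user/your-app.git
```

Running the same call again after you push changes to GitHub updates the app in place.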
scp data
If you have any flat file data that needs to be loaded for your shiny app you can include it in your GitHub repo. However, if the data is too large to store in GitHub you can secure copy the data to your server with `scp`. There may be some trial and error required, all dependent on file permissions on your server. If you are logged in as root you should be able to write to directories created as that user. It may be a good idea to save this procedure in a text file on your computer (to save typing). I put the data in the `/home/ubuntu` folder. You can move it wherever you need to from this folder. Be sure that your shiny app is pointing to the data on the server.
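A sketch of the copy, run from your own computer rather than from the instance. It reuses the same key file and public DNS as the SSH connection; the function name and the file names in the usage line are placeholders.

```shell
# Hypothetical helper, meant to be run on your local machine.
# $1 = path to the .pem key, $2 = local file to upload, $3 = public DNS.
copy_data_to_ec2() {
  scp -i "$1" "$2" "ubuntu@$3:/home/ubuntu/"
}

# copy_data_to_ec2 ~/.ssh/shiny-key.pem mydata.csv publicDNS
```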
That’s all she wrote!
Hopefully you have Shiny Server and RStudio up and running now. It is a bit of work, but it is worth it. This should get you going, I tried to include everything I could think of. Please email questions, comments, or improvements to me, I’ll gladly accept any. In a future post I’ll talk about using Docker to deploy shiny apps on AWS; I think Docker will be my final deployment method for apps for work.
My complete server set up
Here is all the set up code to spin up shiny server to run my apps.
Resources
- How to get your very own RStudio Server and Shiny Server with DigitalOcean – Dean Attali
- Setup Shiny Server on Ubuntu 14.04 – Huidong Tian
- Running R on AWS – Markus Schmidberger, AWS Blog
- Logging Process – Huidong Tian
- Deploying Your Very Own Shiny Server – Morgan Benton
- Running Shiny Server with a Proxy – Ian Pylvainen, RStudio Support
Updates
2016-07-09
One package I need to use frequently is `RSQLServer`. The version on CRAN depends on `dplyr 0.4.3`; `dplyr 0.5.0` has major backend updates that break the `RSQLServer` installation. To install version 0.4.3 use this code. Alternatively, I could use `devtools` to install the development version of `RSQLServer`, but I would rather install the stable version.
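One way to pin the older version is to install the source tarball straight from the CRAN archive. This is a sketch and not necessarily the code originally used here. It assumes the standard CRAN archive path for `dplyr_0.4.3.tar.gz`, and because `repos = NULL` skips dependency resolution, dplyr’s dependencies need to be installed already. The function name is a placeholder.

```shell
# Hypothetical helper, meant to be run on the EC2 instance.
install_dplyr_043() {
  # Install dplyr 0.4.3 from the CRAN archive, system-wide (as root).
  sudo su - -c "R -e \"install.packages('https://cran.r-project.org/src/contrib/Archive/dplyr/dplyr_0.4.3.tar.gz', repos = NULL, type = 'source')\""
}

# install_dplyr_043
```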
2016-07-11
My app was crashing due to some potential websocket issues when running apps behind nginx. This issue is fairly common and decently documented on GitHub. The solution was to add a few lines to the default nginx config file at `/etc/nginx/sites-enabled/default`. I’ve added those changes to the post.
2016-07-16
Finally got around to including my own `index.html` file to act as a landing page for shiny apps on my server. I’ve brushed up on Bootstrap and decided to use it for the landing page. It is a work in progress. It’s on GitHub.