In this blog post, I’ll explain how to use Singularity to make R or Python packages available to users as an image file. This becomes necessary when a specific R or Python package is difficult to install across different operating systems, which makes the installation process cumbersome. Lately, I’ve been using the reticulate package in R (it provides an interface between R and Python) and I learned first-hand how difficult it can be, in some cases, to install R and Python packages and make them work nicely together on the same operating system. This blog post by no means covers the full potential of Singularity or of containerization tools such as Docker; it is mainly restricted to package distribution / deployment.
Singularity can be installed on all three major operating systems (Linux, macOS, Windows); however, as of July 2018, macOS and Windows users have to set up Vagrant and run Singularity from there (this might change in the near future).
Singularity on Linux
In what follows I’ll use an Ubuntu cloud instance (the same steps work on an Ubuntu Desktop, with some exceptions) to explain how to download Singularity image files and run them with RStudio Server (in the case of R) or a Jupyter Notebook (in the case of Python). I use Amazon Web Services (AWS), specifically an Ubuntu Server 16.04 t2.micro instance (1 GB memory, 1 core); the same procedure can be followed on Azure or Google Cloud (the two alternative cloud services I’m aware of). I’ll skip the steps for setting up an Ubuntu cloud instance, as that’s beyond the scope of this blog post (there are many tutorials on the web for this purpose).
Working from the command line, the first thing to do is to install the system requirements (on an Ubuntu Desktop, the system upgrade step should be skipped). Once the system requirements are installed, the following folder should appear in the home directory,
R language Singularity image files
My singularity_containers GitHub repository contains R and Python Singularity recipes, which are used to build the corresponding containers. The GitHub repository is connected to my singularity-hub account, and whenever a change is triggered (for instance, a push to the repository) a new / updated container build is created. An updated build, for instance for the RGF package, can be pulled from singularity-hub in the following way,
singularity pull --name RGF_r.simg shub://mlampros/singularity_containers:rgf_r
This command creates the RGF_r.simg image file in the current working directory (the home directory in this walkthrough). One should now make sure that port 8787 is not used by another service / application by running,
sudo netstat -plnt | fgrep 8787
If this returns nothing, then one can proceed with,
singularity run RGF_r.simg
to run the image. If no errors occurred, then by opening a second command-line console and typing,
sudo netstat -plnt | fgrep 8787
one should observe that port 8787 is now open,
tcp 0 0 0.0.0.0:8787 0.0.0.0:* LISTEN 23062/rserver
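The before / after port check can also be scripted. The following is only a minimal sketch: port_is_listening is a hypothetical helper name (it is not part of Singularity or of my recipes), and it assumes the column layout of the netstat -plnt output shown above, where the local address is the 4th column and the socket state is the 6th.

```shell
# Hypothetical helper (not part of the original recipes): succeeds when the
# netstat-style listing read from stdin contains a LISTEN entry on the port
# given as the first argument.
port_is_listening() {
    local port="$1"
    # Column 4 is the local address (e.g. 0.0.0.0:8787), column 6 the state.
    # The ":" before the port and the "$" anchor prevent partial matches
    # such as port 18787 or 87870.
    awk -v port="$port" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on a live system:
#   sudo netstat -plnt | port_is_listening 8787 && echo "port 8787 is in use"
```

Reading the listing from stdin keeps the helper independent of the tool that produced it, as long as the columns are in the same positions.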
The final step is to open a web browser (Chrome, Firefox etc.) and enter one of the following addresses,
- http://Public DNS (IPv4):8787 ( where “Public DNS (IPv4)” is specific to the Cloud instance you launched )
- http://0.0.0.0:8787 ( in case that someone uses Singularity locally )
to launch RStudio Server and use the RGF package, pre-installed with all its requirements (to stop the service, press CTRL + C in the command line). I used RGF as an example here because I personally found it somewhat cumbersome to install on my Windows machine.
The same applies to the other two R singularity recipe files included in my singularity-hub account, i.e. mlampros/singularity_containers:nmslib_r and mlampros/singularity_containers:fuzzywuzzy_r.
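As a rough sketch, the pull commands for all three R images can be generated from their singularity-hub tags. The helper name print_pull_commands and the local file names (<tag>.simg) are assumptions made for this example only; the commands are printed rather than executed, so they can be reviewed first.

```shell
# Hypothetical helper: print one "singularity pull" command per tag.
# The output file name <tag>.simg is an arbitrary choice for this sketch.
print_pull_commands() {
    local tag
    for tag in "$@"; do
        printf 'singularity pull --name %s.simg shub://mlampros/singularity_containers:%s\n' \
            "$tag" "$tag"
    done
}

# Tags of the three R recipes mentioned in this post.
print_pull_commands rgf_r nmslib_r fuzzywuzzy_r

# To actually download the images, pipe the printed commands to a shell:
#   print_pull_commands rgf_r nmslib_r fuzzywuzzy_r | sh
```

Since each of these R images serves on port 8787, only one of them should be run at a time.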
Python language Singularity image files
The Python Singularity recipe files, which are also included in the same GitHub repository, use port 8888 and follow the same logic as the R files. The only difference is that running the image requires sudo (otherwise a permission error is raised),
singularity pull --name RGF_py.simg shub://mlampros/singularity_containers:rgf_python
sudo singularity run RGF_py.simg
The latter command will produce the following (example) output,
The web-browser runs on localhost:8888
[I 09:56:03.427 NotebookApp] Writing notebook server cookie secret to /root/.local/share/jupyter/runtime/notebook_cookie_secret
[W 09:56:03.779 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[I 09:56:03.789 NotebookApp] Serving notebooks from local directory: /root
[I 09:56:03.790 NotebookApp] The Jupyter Notebook is running at:
[I 09:56:03.790 NotebookApp] http://(ip-172-31-21-76 or 127.0.0.1):8888/?token=1fc90f01247498dac8d24ac918fe8da57fa46ee9e98eea4f
[I 09:56:03.790 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 09:56:03.790 NotebookApp] Copy/paste this URL into your browser when you connect for the first time, to login with a token:
    http://(ip-172-31-21-76 or 127.0.0.1):8888/?token=1fc90f01247498dac8d24ac918fe8da57fa46ee9e98eea4f
.......
As before, the user should open a web browser and enter either,
- http://Public DNS (IPv4):8888 ( where “Public DNS (IPv4)” is specific to the Cloud instance you launched )
- http://127.0.0.1:8888 ( in case that someone uses Singularity locally )
When connecting to the Jupyter notebook for the first time, one has to enter the token from the output as the password. For instance, based on the previous example output, the token would be 1fc90f01247498dac8d24ac918fe8da57fa46ee9e98eea4f.
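Instead of copying the token by hand, it can be extracted from the startup output. This is just a sketch: extract_jupyter_token is a hypothetical helper name, and it assumes the token appears as a hexadecimal string after "?token=" or "&token=", as in the example output above.

```shell
# Hypothetical helper: print the first token=... value found in the Jupyter
# startup output read from stdin (prints nothing if no token is present).
extract_jupyter_token() {
    # Keep only lines containing "?token=<hex>" or "&token=<hex>", reduce
    # each of them to the hex string, then take the first match.
    sed -n 's/.*[?&]token=\([0-9a-f][0-9a-f]*\).*/\1/p' | head -n 1
}

# Usage, assuming the output of the run command was saved to a file named
# jupyter.log (an arbitrary name chosen for this sketch):
#   sudo singularity run RGF_py.simg 2>&1 | tee jupyter.log
#   extract_jupyter_token < jupyter.log     # from a second console
```

The redirection 2>&1 is there as a precaution so that both output streams end up in the saved file.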
The same applies to the other two Python singularity recipe files included in my singularity-hub account, i.e. mlampros/singularity_containers:nmslib_python and mlampros/singularity_containers:fuzzywuzzy_python.
If someone intends to add authentication to the Singularity recipe files, valuable resources can be found in the https://github.com/nickjer/singularity-rstudio GitHub repository, on which my RStudio Server recipes are heavily based.
An updated version of singularity_containers can be found in my GitHub repository. To report bugs / issues, please use the following link, https://github.com/mlampros/singularity_containers/issues.