pocker: A docker container to integrate R and Python in CI/CD frameworks
Genesis
I started using continuous integration with gitlab a few weeks ago and, until a few days ago, was really happy with the rocker image (basically docker + R).
I became ambitious and started to write a markdown document comparing the speed of R and Python on simple operations. It worked fine on my laptop, where anaconda is installed. However, because anaconda is not available in the rocker image, the markdown compilation naturally failed. I thus started this project to create a docker image that would do the job, i.e. that would integrate Python and R together. The container I propose is not well suited to Python-only repositories; its goal is to ease the pass-through between Python and R.
Since I am a beginner in the docker ecosystem, it has not been an easy path. When I thought the solution would be trivial to implement, I was planning to keep the repository private. However, I now think the solution can help other people, so I decided to make it public. To make the project as reproducible as possible, I ended up with the following rather complex workflow:
- github connected to dockerhub to build the base image from DockerFile
- gitlab with continuous integration, using the /gitlabCI/.simple_configuration.yml example file as a reproducible workflow
- dockerhub, which automatically builds the docker image from the github repository
This is not the most natural workflow. If you look into the project history, you will see that I did not adopt it initially: I adopted it after merging branches from two separate projects that were pursuing the same goal. This complex setup has one advantage for reproducibility: each time project updates are pushed, both the code used to build the pocker image and the example of its use in continuous integration are updated.
I should warn people who are used to creating docker images that I might not have created the most parsimonious image needed to run R and Python together. I would welcome pull requests to improve the pocker repository.
Some explanations
DockerFile is used to build the image. The main steps are the following:
- Start from the rocker/verse container, which avoids re-installing tidyverse each time a CI/CD job is run
- Install python 3 and anaconda
- Add the conda binary directory to the path
- Install the reticulate package
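To make the "add the conda binary directory to the path" step more concrete, here is a minimal Python sketch of what that step amounts to and of how one might check its result inside a container. It is an illustration under stated assumptions, not the code used in the pocker repository: the directory /opt/conda/bin is a hypothetical install location, and the helper names prepend_to_path and find_tools are invented for this example.

```python
import os
import shutil


def prepend_to_path(path_value, directory):
    """Return path_value with directory placed first, without duplicating it."""
    parts = [p for p in path_value.split(os.pathsep) if p and p != directory]
    return os.pathsep.join([directory] + parts)


def find_tools(names, path_value=None):
    """Map each executable name to its resolved location, or None if absent.

    When path_value is None, shutil.which searches the current PATH.
    """
    return {name: shutil.which(name, path=path_value) for name in names}


if __name__ == "__main__":
    # Hypothetical install location; the real image may use another directory.
    conda_bin = "/opt/conda/bin"
    new_path = prepend_to_path(os.environ.get("PATH", ""), conda_bin)

    # Report which of the tools expected in an R + Python image are visible.
    tools = find_tools(["R", "Rscript", "python3", "conda"], new_path)
    for tool, location in tools.items():
        print(f"{tool}: {location or 'not found'}")
```

Run inside a container, a script like this would print one line per tool, which gives a quick way to see whether both R and the conda-provided Python are reachable before a CI job tries to compile a document that needs them.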
In the gitlabCI directory, you will find scripts useful for continuous integration related to the docker project:
- complete_configuration.yml: the gitlab CI/CD configuration file I was using before building my own docker image. It starts from rocker/verse and follows the same steps as the Dockerfile presented above
- simple_configuration.yml: the gitlab CI/CD configuration I use now that the pocker container is built
The other scripts, build.R and scripts/*, are there to provide tests for the configuration obtained from gitlab CI/CD.