Articles by Jozef's Rblog

A guide to retrieval and processing of data from relational database systems using Apache Spark and JDBC with R and sparklyr

August 15, 2020 | Jozef's Rblog

Introduction: The {sparklyr} package lets us connect to and use Apache Spark for high-performance, highly parallelized, distributed computations. We can also use Spark’s capabilities to improve and streamline our data processing pipelines, as Spark supports reading from and writing to many popular sources such as Parquet, ORC, etc. and most ...

Using parallelization, multiple git repositories and setting permissions when automating R applications with Jenkins

August 10, 2019 | Jozef's Rblog

Introduction: In the previous post, we focused on setting up declarative Jenkins pipelines, with emphasis on parametrizing builds and using environment variables across pipeline stages. In this post, we look at various tips that can be useful when automating R application testing and continuous integration, with regard to orchestrating parallelization, ...
