Live COVID-19 Swiss vaccination analysis
A new live R Shiny application in our gallery: COVID-19 vaccination breakthroughs in Switzerland.
We previously wrote a couple of articles (on 20/10/2021 and 06/12/2021) about the COVID-19 vaccination breakthroughs in Switzerland, with the promise to publish them periodically on our site. We have decided instead to turn this analysis into a live dashboard article integrated in our gallery, which reads the data from BAG (Swiss Federal Office of Public Health) every day to always report the most up-to-date vaccination figures.
Compared to the previous articles, the Vaccinated group is now split into 3 categories to account for the addition of Booster vaccinations:
- Fully Vaccinated with Booster
- Fully Vaccinated without Booster
- Partially Vaccinated
The categories above are compared against the Unvaccinated group to evaluate the vaccination benefit.
Hospitalizations and Death rates within the 4 populations are compared to derive who is more at risk. The following measures are shown in the article:
- Hospitalizations and Deaths counts
- Hospitalizations and Deaths counts per 100’000 people
- ratio of the latter measure between the Unvaccinated and Vaccinated groups.
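As a minimal sketch of how the last two measures relate to the raw counts, the snippet below uses invented numbers (not BAG figures) and an illustrative `unvaccinated` label; the `entries` and `pop` column names mirror the BAG data, everything else is an assumption made for the example.

```r
# Made-up counts for illustration only; these are NOT BAG figures.
toy <- data.frame(
  vaccination_status = c("unvaccinated", "fully_vaccinated"),
  entries = c(120, 45),          # hospitalizations in the period
  pop     = c(1500000, 4500000)  # people in each group
)

# Counts per 100'000 people in each group
toy$per_100k <- toy$entries / toy$pop * 1e5

# Ratio of the unvaccinated rate to the vaccinated rate
rate  <- setNames(toy$per_100k, toy$vaccination_status)
ratio <- rate[["unvaccinated"]] / rate[["fully_vaccinated"]]

print(toy)
ratio
```

With these invented numbers the rates are 8 and 1 per 100'000 people, so the ratio is 8: the toy unvaccinated group is hospitalized 8 times as often, relative to its size.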
Rather than focusing on the content of the article, in this post we would like to describe the process and architecture of the deployment that allows us to:
- update data constantly in a controlled way
- use interactive Shiny components in an R Markdown document
- use shinyapps.io for hosting the live version of the article
- safely deploy with a process orchestrated by a CI/CD workflow using GitHub Actions.
For a better illustration and understanding, the source code is publicly available in our GitHub repository covid19-vaccination-ch.
Reading BAG data
We are interested in collecting the weekly BAG reports about vaccination breakthroughs.
Thanks to the well maintained data documentation we can easily identify what we want to read. The R package jsonlite is all we need to read from the exposed API.
bag_api_url <- "https://www.covid19.admin.ch/api/data/context/"
bag_sources <- jsonlite::fromJSON(bag_api_url)
str(bag_sources, max.level = 2, strict.width = "cut")
## List of 3
##  $ sourceDate : chr "2022-03-08T06:04:50.000+01:00"
##  $ dataVersion: chr "20220308-cyc99ifc"
##  $ sources    :List of 6
##   ..$ comment   : chr "OpenData DCAT-AP-CH metadata is now available as well"..
##   ..$ opendata  :List of 3
##   ..$ schema    :List of 2
##   ..$ readme    : chr "https://www.covid19.admin.ch/api/data/documentation/"
##   ..$ zip       :List of 2
##   ..$ individual:List of 2
bag_sources is an R list containing all links to the JSON sources mentioned in the documentation. As an example, the code below shows how to read weekly Hospitalizations by vaccination status for different age classes, which can be found in bag_sources$sources$individual$json$weekly$byAge$hospVaccPersons.
source_weekly_by_age <- bag_sources$sources$individual$json$weekly$byAge
str(source_weekly_by_age, strict.width = "cut")
## List of 8
##  $ cases           : chr "https://www.covid19.admin.ch/api/data/20220308-cyc"..
##  $ casesVaccPersons: chr "https://www.covid19.admin.ch/api/data/20220308-cyc"..
##  $ hosp            : chr "https://www.covid19.admin.ch/api/data/20220308-cyc"..
##  $ hospReason      : chr "https://www.covid19.admin.ch/api/data/20220308-cyc"..
##  $ hospVaccPersons : chr "https://www.covid19.admin.ch/api/data/20220308-cyc"..
##  $ death           : chr "https://www.covid19.admin.ch/api/data/20220308-cyc"..
##  $ deathVaccPersons: chr "https://www.covid19.admin.ch/api/data/20220308-cyc"..
##  $ test            : chr "https://www.covid19.admin.ch/api/data/20220308-cyc"..
source_weekly_hosp_by_age_vacc <- source_weekly_by_age$hospVaccPersons
weekly_hosp_by_age_vacc <- jsonlite::fromJSON(source_weekly_hosp_by_age_vacc)
str(weekly_hosp_by_age_vacc, strict.width = "cut")
## 'data.frame': 3828 obs. of 14 variables:
##  $ date                : int 202104 202104 202104 202104 202104 202104 20210..
##  $ altersklasse_covid19: chr "0 - 9" "0 - 9" "0 - 9" "0 - 9" ...
##  $ vaccination_status  : chr "fully_vaccinated" "partially_vaccinated" "not"..
##  $ entries             : int 0 0 6 2 0 0 1 1 0 0 ...
##  $ sumTotal            : int 0 0 6 2 0 0 1 1 0 0 ...
##  $ pop                 : int 6 10 880571 NA 51 988 852336 NA 419 7590 ...
##  $ inz_entries         : num 0 0 0.68 NA 0 0 0.12 NA 0 0 ...
##  $ geoRegion           : chr "CHFL" "CHFL" "CHFL" "CHFL" ...
##  $ type                : chr "COVID19Hosp" "COVID19Hosp" "COVID19Hosp" "COV"..
##  $ type_variant        : chr "vaccine" "vaccine" "vaccine" "vaccine" ...
##  $ vaccine             : chr "all" "all" "all" "all" ...
##  $ data_completeness   : chr "limited" "limited" "limited" "limited" ...
##  $ version             : chr "2022-03-08_06-04-50" "2022-03-08_06-04-50" "2"..
##  $ timeframe_all       : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
For our purposes we must also read Infections per age group and Deaths entries per vaccination status and age group. They are available from other elements of the source_weekly_by_age list.
The article should show the data from the latest weekly reports from BAG, updated as of today and then aggregated over the past 4 weeks.
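The 4-week aggregation can be sketched as follows. This is an illustration under assumptions, not the package's actual code: the data are invented, the `unvaccinated` label is illustrative, and only the year-week integer format of `date` (e.g. 202104) is taken from the BAG output.

```r
# Toy weekly data; `date` uses the year-week integer format seen in the
# BAG output (e.g. 202104). Values are invented for illustration.
weekly <- data.frame(
  date = rep(c(202205L, 202206L, 202207L, 202208L, 202209L), each = 2),
  vaccination_status = rep(c("fully_vaccinated", "unvaccinated"), times = 5),
  entries = c(3, 10, 4, 12, 2, 9, 5, 11, 1, 8)
)

# Keep the 4 most recent weeks present in the data
last_weeks <- tail(sort(unique(weekly$date)), 4)
recent <- weekly[weekly$date %in% last_weeks, ]

# Sum the entries per vaccination status over those weeks
agg <- aggregate(entries ~ vaccination_status, data = recent, FUN = sum)
agg
```

Selecting the weeks from the data itself, rather than computing them from today's date, keeps the result well defined even when the most recent week has not been published yet.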
The R package
The source code is structured as an R package called covid19vaccinationch. The R Markdown article (inst/report/index.Rmd) is part of the installed package and utilizes its functions.
The package can be installed locally and exposes a function run_report() that renders the article with rmarkdown::run() and generates the HTML report.
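For illustration, a function of this kind could look roughly like the sketch below. The name `run_report_sketch` is hypothetical and the real run_report() may well differ; the sketch only assumes that the article is installed as report/index.Rmd within the package.

```r
# Hypothetical sketch; the real run_report() may differ.
run_report_sketch <- function(...) {
  rmd <- system.file(
    "report", "index.Rmd",
    package = "covid19vaccinationch",
    mustWork = TRUE  # error if the package or the file is not installed
  )
  # rmarkdown::run() serves a Shiny-enabled R Markdown document
  rmarkdown::run(rmd, ...)
}
```

Files under inst/ are copied to the top level of the installed package, which is why the lookup uses "report" rather than "inst/report".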
The data are stored in 3 RDS files in the inst/bag_data source folder, and installed alongside the R Markdown article as part of the package.
The project uses renv to control the set of package dependencies.
Data update and CI/CD
BAG releases new data every day around 1:30pm CET/CEST. The daily update may also report, with some delay, older cases from the past weeks, and therefore change the results of our article. For this reason, the data must be queried from the source every day so that the report always shows the most up-to-date figures. Furthermore, we would like to avoid repeating the data reading and processing steps every time the page is opened, so that the report loads faster for the users.
The package contains a function build_data() that constructs the 3 main data sets required by the article and stores them in inst/bag_data as RDS files.
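One way to rewrite an RDS file only when its content has actually changed is sketched below. The helper `save_if_changed()` and the file name are hypothetical, and a temporary folder stands in for inst/bag_data; the actual build_data() may simply overwrite the files and leave change detection to git.

```r
# Hypothetical helper: write `new_data` to `path` only if it differs
# from what is already stored there. Returns TRUE when the file was
# (re)written, FALSE otherwise.
save_if_changed <- function(new_data, path) {
  if (file.exists(path) && identical(readRDS(path), new_data)) {
    return(FALSE)
  }
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  saveRDS(new_data, path)
  TRUE
}

# Usage with a temporary folder standing in for inst/bag_data
data_dir <- file.path(tempdir(), "bag_data")
path <- file.path(data_dir, "example.rds")  # file name is made up
unlink(path)  # make sure the example starts from a clean state

first  <- save_if_changed(data.frame(week = 202209L, entries = 5L), path)
second <- save_if_changed(data.frame(week = 202209L, entries = 5L), path)
third  <- save_if_changed(data.frame(week = 202209L, entries = 7L), path)
c(first, second, third)
```

The first and third calls write the file, while the second one finds identical content and leaves it untouched.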
The GitHub Actions workflow runs build_data() on the main branch every weekday at 1PM UTC (GitHub Actions scheduling is based on UTC time). If new data from the past weeks are found, the updated RDS files are pushed to the repository, making the latest data available to the deployed application. We must also consider that a non-backwards-compatible change to the BAG data structure may compromise the rendering of the article. For this reason, whenever new data are introduced, the package must be checked as part of the Continuous Integration / Deployment GitHub Actions workflow before the data are pushed to the repository. If the new data break the package, the “R CMD check” step of the workflow fails and prevents any deployment to shinyapps.io, so the report keeps showing the latest working data until the package has been made compatible with the new data structure.
The main steps executed sequentially by the workflow are:
- Run covid19vaccinationch::build_data() on schedule to fetch and build updated data
- Continuous Integration: tests via R CMD check, verifying that new data are compatible and work as expected
- Continuous Deployment upon successful R CMD check:
  - Commit and push RDS files if changes are found
  - Deploy to shinyapps.io
Going into more detail, the steps “Fetch and rebuild latest BAG data” and “Commit and push updated BAG data” of the GitHub Actions workflow react to a schedule event:
on:
  schedule:
    - cron: "0 13 * * 1-5" # 13 because UTC, it corresponds to 14 CET
The 5 required entries of cron define the minutes, hours, days of month, months and days of week of the scheduled event, where * indicates no constraint on that entry. Our schedule triggers the workflow at 1PM UTC every day excluding Saturday and Sunday (1-5 in the last entry of cron), when BAG provides no update. More patterns can be created with the schedule event, see the corresponding guide.
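To make the schedule more concrete, the short sketch below labels the five fields of the cron expression and converts the 1PM UTC trigger into Swiss local time. The two dates are arbitrary examples, chosen to fall in winter (CET) and summer (CEST).

```r
# Label the five fields of the cron expression used in the workflow
cron <- "0 13 * * 1-5"
fields <- setNames(
  strsplit(cron, " ", fixed = TRUE)[[1]],
  c("minute", "hour", "day_of_month", "month", "day_of_week")
)
fields

# 13:00 UTC expressed in Swiss local time, on a winter and a summer day
winter <- as.POSIXct("2022-03-08 13:00:00", tz = "UTC")
summer <- as.POSIXct("2022-07-05 13:00:00", tz = "UTC")
local_winter <- format(winter, "%H:%M", tz = "Europe/Zurich")
local_summer <- format(summer, "%H:%M", tz = "Europe/Zurich")
c(winter = local_winter, summer = local_summer)
```

Since the schedule is fixed in UTC, the job runs at 14:00 local time during CET and at 15:00 during CEST.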
Rendering R Markdown
The article contains both plotly graphs and shiny interactive charts (the line plots). R Markdown allows using Shiny widgets to create an interactive report using runtime: shiny. However, this requires the full re-rendering of the document (including the non-interactive parts) for each user session, and can therefore result in slow performance.
runtime: shiny_prerendered is available in rmarkdown and has major performance advantages compared to runtime: shiny. Using shiny_prerendered makes it possible to separate the rendering of UI elements and the loading / manipulation of data from the interactive server logic for end users. As a result, most of the code is run only once when the document is (pre-)rendered (R Markdown, UI elements, data caching) and only some code is run for every user interaction (Shiny server logic).
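The split between pre-rendered and per-session code is expressed through the `context` chunk option. The sketch below writes a small stand-alone example document to a temporary file; its content (title, toy data, input and output names) is invented for illustration and is not taken from the article's index.Rmd.

```r
# Chunk delimiter, built programmatically so that this example stays
# within a single code block
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",
  'title: "Prerendered sketch"',
  "output: html_document",
  "runtime: shiny_prerendered",
  "---",
  "",
  paste0(fence, '{r, context = "data"}'),
  "# Run once at pre-render time; objects are cached for the server",
  "weekly <- data.frame(week = 1:4, entries = c(5, 3, 4, 2))",
  fence,
  "",
  paste0(fence, '{r, context = "render"}'),
  "# UI elements, rendered once into the static HTML",
  'shiny::sliderInput("n", "Weeks to show", min = 1, max = 4, value = 4)',
  'shiny::plotOutput("trend")',
  fence,
  "",
  paste0(fence, '{r, context = "server"}'),
  "# Executed for every user session",
  "output$trend <- shiny::renderPlot({",
  "  shown <- utils::tail(weekly, input$n)",
  '  plot(shown$week, shown$entries, type = "b")',
  "})",
  fence
)

rmd_file <- file.path(tempdir(), "prerendered_sketch.Rmd")
writeLines(rmd_lines, rmd_file)

# To try it interactively (requires rmarkdown and shiny):
# rmarkdown::run(rmd_file)
```

Only the chunk marked with the server context is evaluated for each session; the data and render chunks are evaluated when the document is pre-rendered.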
Deployment to shinyapps.io
Deploying to shinyapps.io usually requires an app.R file in the project directory that runs the Shiny app. However, it is also possible to deploy an Rmd file called index.Rmd, which is recognized by shinyapps.io and served as the default document for the directory (see documentation).
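A programmatic deployment with the rsconnect package could look roughly like the sketch below. The function name and the environment variable names are made up for this example, and the actual workflow in the repository may be organized differently.

```r
# Hypothetical sketch; account details are read from environment
# variables whose names are made up for this example.
deploy_report_sketch <- function(app_dir, app_name) {
  rsconnect::setAccountInfo(
    name   = Sys.getenv("SHINYAPPS_ACCOUNT"),
    token  = Sys.getenv("SHINYAPPS_TOKEN"),
    secret = Sys.getenv("SHINYAPPS_SECRET")
  )
  # app_dir is expected to contain index.Rmd (no app.R needed)
  rsconnect::deployApp(
    appDir = app_dir,
    appName = app_name,
    forceUpdate = TRUE  # update an existing deployment without prompting
  )
}
```

In a CI/CD setting, credentials like these would typically come from repository secrets rather than being stored in the code.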
We have provided a public repository showing an example of how to safely deploy to shinyapps.io the automated analysis of COVID-19 vaccination breakthroughs in Switzerland, by means of an R package containing an R Markdown document and up-to-date data. We have highlighted the benefits of using the shiny_prerendered runtime for R Markdown, and of programmatically fetching / updating the data as part of a GitHub Actions CI/CD workflow, with the goal of reducing the loading time of the page and of always having the latest compatible data available in a controlled fashion.
Feel free to get in touch if you have any questions or suggestions for further enhancements.