The COVID-19 pandemic has raised the profile of public health workers at all levels, from the nurses and doctors working on the front lines at our hospitals to high-level state and federal public health officials. I think it's a good bet that eighteen months ago, few of us had any clear idea about how the public health care system works, or thought much about the people charged with the awesome responsibility of keeping us safe. We are all a little bit wiser now. It strikes me as obvious that we will have a continuing need to improve our public health systems, and that this need will create opportunities for data scientists to make significant contributions in research, logistics, data management, reporting, public communication, and many more areas. The recent post by my colleague Jesse Mostipak describes how R and Shiny made a big difference in the nitty-gritty work required to roll out vaccine distribution in West Virginia. This four-minute video about how the West Virginia Army National Guard built a COVID vaccine inventory management system is inspiring.
The following are some R resources that you may find helpful if you are seeking to increase your R skills with an eye toward public health applications. Some may be useful to public health professionals seeking to learn R; others may be interesting to R users who want to investigate data science in a public health context.
If you are feeling optimistic about traveling and can make your way to Estonia this summer, you might consider Statistical Practice in Epidemiology using R.
If you hurry, you may be able to get into the Coursera course offered by Imperial College London: Statistical Analysis with R for Public Health Specialization.
There are several Coursera courses for R in a public health context.
Frank Harrell’s free online course Biostatistics for Biomedical Research, available on YouTube: BBRcourse, is an excellent introduction to the basic statistical concepts underlying all medical applications.
If you can learn on your own with the help of a good book, here are some ideas.
Modern Statistics for Modern Biology: This book is focused more on genomics than public health applications, but it is probably the best introductory statistical text available. The "modern statistics" part of the title means statistics as a data-driven, computationally based science. Every topic is illustrated with well-crafted R code and visualizations. The book is a great read.
Population Health Data Science with R: An introduction to R by authors who see population health as a systems framework for studying and improving the health of populations through collective action and learning.
Epidemiology with R: This is a new, reasonably priced book that covers the basics. It emphasizes reproducibility with R Markdown. Code examples are written in a base R style that matches the extensively used Epi package.
If working through a book on your own seems too much of a stretch, then digest an R package or two. Here are a few examples of packages on public health themes with sufficient documentation to make interesting self-learning projects.
EpiModel: An R Package for Mathematical Modeling of Infectious Disease over Networks
PHEindicators: Common Public Health Statistics and their Confidence Intervals
SimInf: An R Package for Data-Driven Stochastic Disease Simulations
SPARSEMODr: Construct spatial, stochastic disease models that show how parameter values fluctuate in response to public health interventions
Finally, if you find yourself in a situation similar to that of the West Virginia Army National Guard team featured in the video, you may just want to teach yourself Shiny. The RStudio Shiny Tutorial, along with Hadley Wickham's book Mastering Shiny, is a very good place to start. If you need more structure, you might check out the Udemy courses, or work through the online workshops from Duke University or the University of Manchester. And by all means, immerse yourself in the examples, posts, and podcasts of the Shiny Developer Series.