In times of a pandemic, the data community can help in many ways, including by building tools to track and break down the data on the spread of the dreaded coronavirus disease.
The COVID-19 Canada Data Explorer app was built with R, including Shiny and Leaflet, to process the official dataset available from the Government of Canada. The government does have its own data visualization tool, but it is very basic. You can do so much more with the available data!
So without further ado, just click on the picture below, and it will take you to the dashboard.
I hope that my app will help public health professionals, policymakers, and really anyone to stay informed about the course of the SARS-CoV-2 epidemic in Canada. I am planning to keep improving the app’s functionality over time, possibly adding a more in-depth breakdown by province (if I can find the data) and/or some basic statistical modeling options. Since the app is in active development, I will publish the source code a bit later, as soon as I get closer to the app’s final version.
Some Things to Keep in Mind (Technical Stuff Here)
The app runs on a free version of Shiny Server, which does not support serving Shiny applications over SSL/TLS. Hence the HTTP connection instead of HTTPS: your browser may warn that the “connection is not secure”. I am aware that it may be possible to force HTTPS by putting a reverse proxy in front of the server, but I have not been able to figure it out. If you know how to do this on Apache (not Nginx), I’d very much appreciate your advice (please use this contact form to reach me, or just comment on the post).
The app checks Canada.ca every two hours and downloads the data only if it carries a more recent timestamp than the copy on my server. This means that there may be a delay of up to two hours between the moment Canada.ca updates its data and the moment the update appears in the app.
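To make the "up to two hours" delay concrete, here is a minimal sketch of the refresh rule in Python (the app itself is written in R, and its source is not published yet, so this is an illustration rather than the actual implementation). The function names `needs_update` and `first_visible` and the example times are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Polling interval described in the post; everything else here is assumed.
CHECK_INTERVAL = timedelta(hours=2)


def needs_update(local_latest: Optional[datetime], remote_latest: datetime) -> bool:
    """Return True when the remote data is newer than the local copy."""
    return local_latest is None or remote_latest > local_latest


def first_visible(published: datetime, last_check: datetime,
                  interval: timedelta = CHECK_INTERVAL) -> datetime:
    """Return the time of the first scheduled check at or after `published`."""
    if published <= last_check:
        return last_check
    elapsed = published - last_check
    steps = -(-elapsed // interval)  # ceiling division: timedelta // timedelta is an int
    return last_check + steps * interval


if __name__ == "__main__":
    last_check = datetime(2020, 4, 20, 12, 0)
    published = datetime(2020, 4, 20, 12, 1)  # source updated just after a check
    shown = first_visible(published, last_check)
    print("visible at:", shown, "delay:", shown - published)
```

An update published one minute after a check waits almost the full two hours, while one published just before a check shows up almost immediately.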
If data for the chosen indicator (most likely, “new cases”) is not available for the selected date, an error message will flash briefly, and then the date will revert to the last day for which data is available for this indicator.
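The fallback described above can be sketched roughly as follows. This is a Python illustration under assumed data structures (a dictionary of values keyed by date, with `None` for missing values), not the app's actual R code; `resolve_date` is a hypothetical name.

```python
from datetime import date
from typing import Dict, Optional, Tuple


def resolve_date(selected: date,
                 values_by_date: Dict[date, Optional[float]]) -> Tuple[date, bool]:
    """Return (date to display, whether a fallback was needed).

    If the indicator has no value for `selected`, fall back to the latest
    date that does have one.
    """
    if values_by_date.get(selected) is not None:
        return selected, False
    available = [d for d, v in values_by_date.items() if v is not None]
    if not available:
        raise ValueError("no data available for this indicator")
    return max(available), True


if __name__ == "__main__":
    # Made-up values for illustration only.
    new_cases = {
        date(2020, 4, 18): 120.0,
        date(2020, 4, 19): 95.0,
        date(2020, 4, 20): None,  # not reported yet
    }
    shown, reverted = resolve_date(date(2020, 4, 20), new_cases)
    print(shown, "(reverted)" if reverted else "")
```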
The “Cases per 100,000” indicator shows the cumulative number of cases per 100,000 population over the course of the pandemic, not the number of people who are currently ill (i.e., it does not subtract the people who have recovered).
The “Tests done per 1,000” indicator shows exactly what it says: the number of tests per 1,000 people, not the percentage or share of people tested. Keep in mind that one person can be tested multiple times.
“Case fatality rate” is the percentage of diagnosed cases who have died, not of everyone who may have been infected or ill (a number we simply don’t know at this point). Indeed, based on the early results of antibody testing of the general population [1, 2], it seems likely that infection rates are much higher, and the actual fatality rates thus much lower, than the case fatality rate suggests. However, large-scale randomized serological testing will be required to figure out the actual fatality rates. Also note that case fatality rate should not be confused with mortality rate, which relates deaths to the whole population rather than to diagnosed cases.
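The arithmetic behind these three indicators is simple enough to sketch. The Python snippet below is an illustration only (the app itself is built in R), and every number in it is made up for the example rather than taken from the Canadian data; the function names are hypothetical.

```python
def rate_per(count: float, population: float, per: int) -> float:
    """Scale a count to a rate per `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * per / population


def case_fatality_rate(deaths: float, diagnosed_cases: float) -> float:
    """Deaths as a percentage of diagnosed cases."""
    if diagnosed_cases <= 0:
        raise ValueError("diagnosed_cases must be positive")
    return deaths * 100 / diagnosed_cases


# Made-up figures for a hypothetical region.
population = 1_000_000
cumulative_cases = 500   # everyone ever diagnosed, including the recovered
tests_done = 20_000      # number of tests, not number of people tested
deaths = 25

cases_per_100k = rate_per(cumulative_cases, population, 100_000)  # 50.0
tests_per_1k = rate_per(tests_done, population, 1_000)            # 20.0
cfr_percent = case_fatality_rate(deaths, cumulative_cases)        # 5.0

# Mortality rate uses the whole population as the denominator.
mortality_per_100k = rate_per(deaths, population, 100_000)        # 2.5

# Purely hypothetical: if true infections were 10x the diagnosed cases,
# the fatality rate among all infected would be 10x lower than the CFR.
fatality_if_10x_infections = case_fatality_rate(deaths, cumulative_cases * 10)  # 0.5
```

The same 25 deaths give 5% as a case fatality rate but only 2.5 per 100,000 as a mortality rate, which is why the two should not be confused.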