If you are looking for a place to monitor expert forecasts for United States weekly and cumulative COVID-19 deaths, you can’t do any better than the Reich Lab (University of Massachusetts) COVID-19 Forecast Hub. The same is true for data scientists who may be seeking to publish their own forecasts and compare them with the work of their peers. Every Tuesday morning, the Hub publishes four-week national and state-level forecasts from over thirty-five different groups, along with its own ensemble forecast. These forecasts can be examined via an effective D3 interactive visualization, and are also available in a public GitHub repository. The ensemble forecast and all of the submitted forecasts are also passed on to the CDC and the FiveThirtyEight forecast tracker.
The visualization lets users select from a menu of forecasts to compare. By clicking on points marking the actual observed values on past dates, users can compare the accuracy of the various forecasts.
In addition to the visualization, the COVID-19 Forecast Hub has several notable features:
- The COVID-19 Forecast Hub team encourages participation from any group that meets its standards, and provides a showcase for the Community.
- It conducts several automated tests, as well as some “human in the loop” tests, to screen forecasts for inclusion in the ensemble. For this reason, you are unlikely to see forecasts that appear to float untethered from the actual data, or take off on precipitous increases or drastic decreases that appear to be “dramatically out of line with the historical data”.
- In addition to being available in the GitHub repository mentioned above, the details of the forecasts can also be downloaded programmatically from the Zoltar API.
- The ensemble is created by algorithms that score each model and average twenty-three quantiles of the predictive distribution produced by each included forecast.
- The submitted forecasts included in the ensemble cover a wide range of modeling methodologies and model types. There are epidemiological models conditioned on various assumptions about disease transmission and public behavior, as well as unconditional time series and “curve fitting” models. There are SEIR models, deep learning models, agent-based simulations and unique hybrid models.
- In addition to the ensemble, the COVID-19 Forecast Hub team also produces a simple but surprisingly accurate baseline forecast. (Its median prediction of incidence at each future time point is the most recently observed incidence.)
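To make the ensemble and baseline ideas from the list above a little more concrete, here is a minimal Python sketch. It is not the Hub's actual code. The function names, the data layout (a dictionary of quantile levels per model), the optional weights, and the specific grid of twenty-three quantile levels are all assumptions made for illustration. The scoring step that decides which models are included is not shown.

```python
# Illustrative sketch only: names, data layout, and quantile grid are assumptions,
# not the COVID-19 Forecast Hub's implementation.

# An assumed grid of 23 quantile levels: 0.01, 0.025, 0.05, 0.10, ..., 0.95, 0.975, 0.99.
QUANTILE_LEVELS = (
    [0.01, 0.025]
    + [round(0.05 * k, 2) for k in range(1, 20)]
    + [0.975, 0.99]
)


def ensemble_quantiles(forecasts, weights=None):
    """Average each quantile level across models.

    forecasts: dict of model name -> dict of quantile level -> predicted value.
    weights:   optional dict of model name -> non-negative weight.
               Equal weights are used when this is None.
    Returns a dict of quantile level -> averaged value.
    """
    if not forecasts:
        raise ValueError("at least one forecast is required")
    names = sorted(forecasts)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    levels = sorted(forecasts[names[0]])
    for name in names:
        # Every model must report the same quantile levels to be averaged.
        if sorted(forecasts[name]) != levels:
            raise ValueError(f"model {name!r} has mismatched quantile levels")

    return {
        q: sum(weights[name] * forecasts[name][q] for name in names) / total
        for q in levels
    }


def baseline_median(observed, horizons=4):
    """Flat baseline: the median at every horizon is the last observed value."""
    if not observed:
        raise ValueError("at least one observation is required")
    return [observed[-1]] * horizons


if __name__ == "__main__":
    # Made-up toy numbers, purely to show the call shape.
    model_a = {q: 1000 + 400 * q for q in QUANTILE_LEVELS}
    model_b = {q: 1200 + 600 * q for q in QUANTILE_LEVELS}
    ensemble = ensemble_quantiles({"model_a": model_a, "model_b": model_b})
    print("ensemble median:", ensemble[0.5])
    print("baseline medians:", baseline_median([5100, 4900, 5300]))
```

Averaging quantile by quantile keeps the result a valid set of ordered quantiles as long as each input forecast is itself ordered, which is one reason this style of combination is convenient for forecasts reported as quantiles rather than full distributions.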
For a detailed overview of the statistics and data science underlying the ensemble forecast and the COVID-19 Forecast Hub, have a look at the video of the presentation Nicholas Reich recently gave at the ASA-JDS Webinar Series.
Finally, when you visit the COVID-19 Forecast Hub, please plan to spend some time examining the individual forecasts. The statistical imagination and technique on display are astounding, and there is quite a bit of data science to be learned. This effort is representative of the thousands of statisticians, data scientists, programmers and researchers worldwide who are giving their best to help control this pandemic.