The Delphi epidemiological forecasting group at Carnegie Mellon University is undertaking a massive effort to develop leading indicators for COVID-19 outbreaks, and if you are an R or Python developer you can help. Delphi is working with both Facebook and Google to analyze the data from daily surveys that ask respondents if they (or people they know) are experiencing COVID-like symptoms. The responses permit Delphi to construct a % CLI (COVID-like illness) in-community signal at the county level across the United States that is being used to improve forecasts and inform public health officials. The Facebook survey reaches approximately 74,000 people each day, and at its peak, over 1.2 million people responded to the Google survey in a single day.
The aggregated data is publicly available daily through Delphi’s COVIDcast API, and fully de-identified individual survey responses are available to researchers who agree to the data use terms. Note also that Facebook never sees the data from the Facebook survey: the survey is advertised through Facebook but hosted on a Delphi platform.
You can view the dashboard for the COVIDcast real-time COVID-19 indicators.
This is open data science at its best. It is a sophisticated project for the public good: conceived and managed by experts, informed by big data, careful about data privacy, and committed to making its work publicly available. And you can become a part of it. The COVIDcast API is easily accessible through the covidcast packages for both R and Python, and the Delphi team would welcome your help working through the issues on their GitHub repo. Issues are categorized as being relevant to either R or Python, and several are flagged as good first issues.
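To give a feel for what working with the API looks like, here is a minimal Python sketch that pulls one week of the county-level survey signal and averages it per county. It assumes the covidcast package is installed (`pip install covidcast`) and that the Facebook survey signal is published under the source name `"fb-survey"` and signal name `"smoothed_cli"`; check the COVIDcast signal documentation for the current names. The helpers `fetch_county_cli` and `mean_by_county` are my own hypothetical names for illustration, not part of the package.

```python
from datetime import date


def fetch_county_cli(start_day, end_day):
    """Fetch the county-level % CLI survey signal for a date range."""
    # Imported here so the rest of the sketch loads without the package.
    import covidcast

    # Assumed source/signal names; see the COVIDcast signal documentation.
    return covidcast.signal(
        "fb-survey", "smoothed_cli", start_day, end_day, geo_type="county"
    )


def mean_by_county(records):
    """Average the 'value' field per 'geo_value' (county FIPS code)."""
    totals = {}
    for row in records:
        total, count = totals.get(row["geo_value"], (0.0, 0))
        totals[row["geo_value"]] = (total + row["value"], count + 1)
    return {geo: total / count for geo, (total, count) in totals.items()}


if __name__ == "__main__":
    df = fetch_county_cli(date(2020, 5, 1), date(2020, 5, 7))
    if df is None:
        print("No data returned for that range.")
    else:
        # The result is a pandas DataFrame; keep only the columns we need.
        means = mean_by_county(df[["geo_value", "value"]].to_dict("records"))
        ranked = sorted(means.items(), key=lambda kv: kv[1], reverse=True)
        for fips, pct in ranked[:5]:
            print(fips, round(pct, 2))
```

A simple week-long mean is only a starting point for exploration; it ignores the standard errors and sample sizes that the API also returns alongside each estimate.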
One outstanding aspect of the Delphi Group is that the team really makes an effort to communicate. They not only make their data and results available, but also try to help people understand the data science. They share their ideas, delve into the underlying statistics, present the challenges, and describe what’s working and what’s not. Delphi recently started a blog, and I don’t think you will find a better chronicle of data science in action. My favorite post so far is Can Symptoms Surveys Improve COVID-19 Forecasts?