Epiverse community engagement and software sustainability for research software

[This article was first published on Epiverse-TRACE: tools for outbreak analytics, and kindly contributed to R-bloggers].

Software developed for research, or by researchers, can be difficult to maintain given the incentive and funding structures in academia. This remains true for epidemiology: a large volume of software was written during the COVID-19 pandemic, much of which is now abandonware[1]. This does not mean that the software developed to understand the COVID-19 pandemic was bad, or that it lacks utility for understanding future epidemics and pandemics; rather, the capacity to maintain and further develop these tools is no longer available now that the pandemic is not considered an acute public health emergency.

These issues around software sustainability and the academic structures that hinder software longevity were raised by Kucharski, Funk, and Eggo (2020) and were one of the leading reasons for the Epiverse-TRACE initiative. Alongside developing novel software (R packages), Epiverse is also committed to supporting the community of package developers in epidemiology and outbreak analytics. The initiative also tries to improve community collaboration and the contribution friendliness of open-source software.

This blog post highlights some recent work by Epiverse software engineers to collaborate on research software, or researchware, to help improve an R package that was initially written in the early days of the COVID-19 pandemic (January 2020 – May 2020) to assess the effectiveness of isolation and contact tracing (Hellewell et al. 2020). It built on code written for the 2014-2016 West Africa Ebola outbreak to provide insights into ring vaccination (Kucharski et al. 2016). These applications, and the general nature of the questions the package addresses, suggest that it could be of great help in future infectious disease outbreaks, but it has lacked developer resources in the absence of pandemic-related priorities.

The R package

The R package in question is {ringbp}. The package has two pieces of functionality: 1) to simulate an infectious disease outbreak using a branching process model with non-pharmaceutical interventions; and 2) to calculate the proportion of simulated outbreaks that are contained (i.e. do not cause a large sustained human-to-human epidemic). The utility of the package’s general model framework has been shown by serving as a template for other epidemiological research such as post-exposure prophylaxis, network effects on control (Firth et al. 2020) and the impact of self-reporting and isolation adherence (Davis et al. 2020).
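The two pieces of functionality described above can be illustrated with a deliberately simplified sketch in base R. This is not the {ringbp} implementation: the function `simulate_outbreak()`, its arguments and its notion of "contained" (the outbreak dies out before reaching a case cap) are assumptions made for illustration, and it ignores interventions, delays and timing entirely.

```r
# Minimal branching-process sketch (illustrative only, not {ringbp} code).
# Each case generates a negative binomial number of secondary cases.
simulate_outbreak <- function(initial_cases = 5, r0 = 2.5, disp = 0.16,
                              cap_cases = 2000, max_generations = 50) {
  active <- initial_cases  # cases in the current generation
  total <- initial_cases   # cumulative cases so far
  for (generation in seq_len(max_generations)) {
    # stop when the outbreak has died out or hit the case cap
    if (active == 0 || total >= cap_cases) break
    # draw secondary cases for every active case and sum them
    active <- sum(rnbinom(n = active, mu = r0, size = disp))
    total <- total + active
  }
  list(total_cases = total, extinct = active == 0)
}

set.seed(1)
outbreaks <- replicate(200, simulate_outbreak(), simplify = FALSE)

# proportion of simulated outbreaks that died out
prop_contained <- mean(vapply(outbreaks, \(x) x$extinct, logical(1)))
prop_contained
```

The first step produces a set of simulated outbreaks and the second summarises them into a single proportion, mirroring the simulate-then-summarise structure of the package.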

The problem

Because {ringbp} was written in haste to produce insights to inform the pandemic response, it understandably did not adhere to all software best practices. Usability, documentation, testing, code style and (computational) performance could all be improved. Certain aspects of the model code, such as parameterisations, were hard-coded, denying users the full flexibility that the model could allow.

Epiverse contribution

In recent months Epiverse has collaborated with {ringbp} developers Seb Funk (also a member of Epiverse) and Carl Pearson (external collaborator), based at the London School of Hygiene and Tropical Medicine and the University of North Carolina, respectively, to try to improve the R package, both internally and from the user's perspective. The following sections give brief summaries of some of the collaborative developments.

User interface

The user interface (API) of the package has been refactored. The main simulation function scenario_sim() remains, but its arguments have been modularised to better group model parameters and control arguments. This also makes the package easier to develop further without necessarily introducing many breaking changes, and keeps the number of top-level function arguments from growing.

Old

scenario_sim(
  n.sim = 5,
  num.initial.cases = 5,
  cap_max_days = 365,
  cap_cases = 2000,
  r0isolated = 0,
  r0community = 2.5,
  disp.iso = 1,
  disp.com = 0.16,
  k = 0.7,
  delay_shape = 2.5,
  delay_scale = 5,
  prop.asym = 0,
  prop.ascertain = 0
)

New

scenario_sim(
  n = 5,
  initial_cases = 5,
  offspring = offspring_opts(
    community = \(n) rnbinom(n = n, mu = 2.5, size = 0.16),
    isolated = \(n) rnbinom(n = n, mu = 0, size = 1),
    asymptomatic = \(n) rnbinom(n = n, mu = 2.5, size = 0.16)
  ),
  delays = delay_opts(
    incubation_period = \(n) rweibull(n = n, shape = 2.32, scale = 6.49),
    onset_to_isolation = \(n) rweibull(n = n, shape = 2.5, scale = 5)
  ),
  event_probs = event_prob_opts(
    asymptomatic = 0,
    presymptomatic_transmission = 0.3,
    symptomatic_ascertained = 0
  ),
  interventions = intervention_opts(quarantine = TRUE),
  sim = sim_opts(
    cap_max_days = 365,
    cap_cases = 2000
  )
)

The new API gives the user more control over the model’s parameterisation. The incubation period is now specified by the user instead of being set to an estimate for COVID-19. The way offspring and delay distribution functions are specified also means that any distributional or non-parametric form can be supplied, relaxing the assumption that the onset-to-isolation delay has to follow a Weibull distribution.
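As a sketch of what "any distributional or non-parametric form" could look like, the functions below follow the same shape as those in the new API example: they take `n` and return `n` random draws. The delay values and the gamma parameters are made up for illustration, and whether delay_opts() places further requirements on such functions is not covered here.

```r
# Hypothetical observed onset-to-isolation delays in days (made-up values)
observed_delays <- c(1, 2, 2, 3, 3, 3, 4, 5, 7, 10)

# Non-parametric sampler: resample the observed delays with replacement
onset_to_isolation_empirical <- \(n) {
  sample(observed_delays, size = n, replace = TRUE)
}

# Parametric alternative: a gamma distribution instead of a Weibull
onset_to_isolation_gamma <- \(n) rgamma(n = n, shape = 2, scale = 2)

set.seed(42)
empirical_draws <- onset_to_isolation_empirical(5)
gamma_draws <- onset_to_isolation_gamma(5)
```

Either function could then be supplied wherever the example above passes an anonymous `\(n) rweibull(...)` function.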

Users can now specify the proportion of presymptomatic transmission rather than having to understand the skew normal parameterisation used by the simulation model, making it easier for new users to get started with the package.

Lastly, on user-facing changes, the naming and style of function arguments have been standardised, with consistent use of snake case and abbreviations.
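The general pattern of grouping related arguments into a small options object can be sketched as follows. The names `my_sim_opts()` and `my_scenario()` and their defaults are hypothetical and are not the {ringbp} source; the sketch only shows how a constructor can validate its inputs and return a classed list that the top-level function accepts as a single argument.

```r
# Hypothetical constructor for a group of simulation control settings
my_sim_opts <- function(cap_max_days = 350, cap_cases = 5000) {
  # basic input checking: each setting must be a single positive number
  stopifnot(
    is.numeric(cap_max_days), length(cap_max_days) == 1, cap_max_days > 0,
    is.numeric(cap_cases), length(cap_cases) == 1, cap_cases > 0
  )
  structure(
    list(cap_max_days = cap_max_days, cap_cases = cap_cases),
    class = "my_sim_opts"
  )
}

# Hypothetical top-level function: one argument per group of settings
my_scenario <- function(n, sim = my_sim_opts()) {
  stopifnot(inherits(sim, "my_sim_opts"))
  sprintf(
    "%s simulations, capped at %s days and %s cases",
    n, sim$cap_max_days, sim$cap_cases
  )
}

summary_text <- my_scenario(
  n = 5,
  sim = my_sim_opts(cap_max_days = 365, cap_cases = 2000)
)
summary_text
```

With this design, a new control setting becomes a new argument of the constructor, and the signature of the top-level function does not change.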

Documentation

Function documentation already used {roxygen2}, but did not make use of inheritance or comprehensively document the function output or usage. We used @inheritParams from {roxygen2} to deduplicate parameter documentation, and added @return documentation to all functions. We also improved the function argument documentation by following a structure of: <type>: description, for example:

@param sim a `list` with class `<ringbp_sim_opts>`: the simulation control
  options for the \pkg{ringbp} model, returned by [sim_opts()]

Exported functions now have informative examples (@examples) to showcase how the functions should be used. Function examples now always run (removing \dontrun{}) to catch any errors.
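Put together, the documentation conventions described above might look like the following sketch. The functions `example_opts()` and `reached_cap()` are hypothetical and exist only to show the tags in context: `reached_cap()` documents `cases` itself and inherits the description of `cap_cases` from `example_opts()` via @inheritParams.

```r
#' Create simulation control options (hypothetical example)
#'
#' @param cap_max_days a `numeric` scalar: the maximum number of days to
#'   simulate
#' @param cap_cases a `numeric` scalar: the maximum number of cases to
#'   simulate
#'
#' @return A `list` with elements `cap_max_days` and `cap_cases`.
#' @examples
#' example_opts(cap_max_days = 365, cap_cases = 2000)
example_opts <- function(cap_max_days, cap_cases) {
  list(cap_max_days = cap_max_days, cap_cases = cap_cases)
}

#' Check whether a case count has reached the cap (hypothetical example)
#'
#' @inheritParams example_opts
#' @param cases a `numeric` scalar: the number of cases so far
#'
#' @return A `logical` scalar: `TRUE` if `cases` has reached `cap_cases`.
#' @examples
#' reached_cap(cases = 2500, cap_cases = 2000)
reached_cap <- function(cases, cap_cases) {
  cases >= cap_cases
}
```

Because the examples are not wrapped in \dontrun{}, they are executed when the package is checked, so a change that breaks them is noticed early.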

The {roxyglobals} package has been added to automate the management of global variables with the use of the @autoglobal tag.

Vignettes are useful long-form package documentation. Thus far we’ve added one vignette to the package and plan to add more where relevant.

Bug fixes

Perhaps more important than any of the software best practices and user interface is the correctness of the code. In our developments we’ve uncovered a few bugs in the previous version of {ringbp}. Errors in the timing of quarantining infected individuals, sampling from the onset-to-isolation distribution, and calculating the generation time from the incubation period have all been identified and fixed.

Testing

  • simulation correctness regression (snapshot) testing
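The idea behind regression (snapshot) testing of a stochastic simulation can be sketched in base R as below. This is a stand-in for illustration, not the {ringbp} test suite: `toy_sim()` is a hypothetical placeholder for a real model, and in a package the comparison would more typically be handled by a testing framework such as {testthat} rather than by hand.

```r
# Toy stochastic simulation standing in for a real model (hypothetical)
toy_sim <- function(n) rnbinom(n = n, mu = 2.5, size = 0.16)

# Fix the seed so that repeated runs give reproducible output
run_seeded <- function(seed = 123) {
  set.seed(seed)
  toy_sim(20)
}

# First run: store the reference output (the "snapshot")
snapshot_file <- tempfile(fileext = ".rds")
saveRDS(run_seeded(), snapshot_file)

# Later runs: compare fresh output against the stored snapshot
matches_snapshot <- identical(run_seeded(), readRDS(snapshot_file))
matches_snapshot
```

If a later code change alters the simulated output for the same seed, the comparison fails, which prompts a check of whether the change in behaviour was intended.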

Miscellaneous

There are various other changes in {ringbp} from our work. Examples include: input checking, not specifying erroneous function defaults, updating the package website, and functions that return data.table objects no longer returning silently. As noted earlier, model performance could be improved. It has been improved incrementally, but we’ve not focused on this aspect, and the package will benefit from time spent on it in the future, especially if the set and complexity of non-pharmaceutical interventions in the model expands.

Conclusion

The {ringbp} R package implements a simple but informative model for infectious disease transmission and interventions. When originally written it included many well-developed aspects, but the time constraints of real-time outbreak response meant several improvements were possible.

Epiverse-TRACE has the opportunity to not only develop new tooling for pandemic preparedness and response, but also to contribute to the ecosystem of open-source software in infectious disease epidemiology. We hope that this account of the collaborative developments of {ringbp} illustrates the benefits of bringing software up to date with best practices, so that tools are available, accessible and robust when a new epidemic or pandemic occurs, in turn hopefully removing the need to redevelop similar software in the future.

Enhancing the accessibility of software for users and developers by improving its documentation and user interface will hopefully provide a gateway for more external contributors to engage with the project. In the public health landscape of temporal surges in capacity and priorities, better enabling community contributions to open-source software should aid software sustainability.

All of the changes discussed in this blog post can be found in the {ringbp} news. For details of developments see the pull request history of {ringbp} on GitHub.

Acknowledgements

Thanks to Seb Funk and Carl Pearson for helpful feedback when drafting this post and for their collaboration on the {ringbp} project.

References

Davis, Emma L., Tim C. D. Lucas, Anna Borlase, Timothy M. Pollington, Sam Abbott, Diepreye Ayabina, Thomas Crellen, et al. 2020. “An Imperfect Tool: Contact Tracing Could Provide Valuable Reductions in COVID-19 Transmission If Good Adherence Can Be Achieved and Maintained.” https://doi.org/10.1101/2020.06.09.20124008.
Firth, Josh A., Joel Hellewell, Petra Klepac, Stephen Kissler, CMMID COVID-19 Working Group, Mark Jit, Katherine E. Atkins, et al. 2020. “Using a Real-World Network to Model Localized COVID-19 Control Strategies.” Nature Medicine 26 (10): 1616–22. https://doi.org/10.1038/s41591-020-1036-8.
Hellewell, Joel, Sam Abbott, Amy Gimma, Nikos I Bosse, Christopher I Jarvis, Timothy W Russell, James D Munday, et al. 2020. “Feasibility of Controlling COVID-19 Outbreaks by Isolation of Cases and Contacts.” The Lancet Global Health 8 (4): e488–96. https://doi.org/10.1016/S2214-109X(20)30074-7.
Kucharski, Adam J., Rosalind M. Eggo, Conall H. Watson, Anton Camacho, Sebastian Funk, and W. John Edmunds. 2016. “Effectiveness of Ring Vaccination as Control Strategy for Ebola Virus Disease.” Emerging Infectious Diseases 22 (1): 105–8. https://doi.org/10.3201/eid2201.151410.
Kucharski, Adam J., Sebastian Funk, and Rosalind M. Eggo. 2020. “The COVID-19 Response Illustrates That Traditional Academic Reward Structures and Metrics Do Not Reflect Crucial Contributions to Modern Science.” PLOS Biology 18 (10): e3000913. https://doi.org/10.1371/journal.pbio.3000913.

Footnotes

  1. Defined by Cambridge Dictionary as: “software that is no longer produced or supported by the company that originally made it”.

Citation

BibTeX citation:
@online{w._lambert2025,
  author = {W. Lambert, Joshua},
  title = {Epiverse Community Engagement and Software Sustainability for
    Research Software},
  date = {2025-08-25},
  url = {https://epiverse-trace.github.io/posts/epi-community-contrib/},
  langid = {en}
}
For attribution, please cite this work as:
W. Lambert, Joshua. 2025. “Epiverse Community Engagement and Software Sustainability for Research Software.” August 25, 2025. https://epiverse-trace.github.io/posts/epi-community-contrib/.