rOpenSci News Digest, March 2023
Dear rOpenSci friends, it’s time for our monthly news roundup!
You can read this post on our blog. Now let’s dive into the activity at and around rOpenSci!
Meeting the stars of the R-universe: Sébastien Rochette
Knowing our community’s stories helps us to learn about the people behind our software, brings us closer and offers us new opportunities. To share some of these community stories, we created the rOpenSci interview series “Meeting the stars of the R-Universe”.
The latest interview with Sébastien Rochette introduces ThinkR’s Approach to Contributing to a Growing and Friendly R Community. The post is available in Spanish and French too! Don’t miss the trilingual post and the video.
Discovering and learning everything there is to know about R packages using R-universe
Jeroen Ooms explains how to use R-universe to discover and assess new packages. He distinguishes three levels of navigation in R-universe when you go shopping for R packages:
- Search the global ecosystem: find packages, by topic, keyword, ranking, etc.
- Browse by maintainer/organization: explore all work from a given group or developer.
- The individual package: get detailed information on everything there is to know about a project and instructions for how to start using it.
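As a minimal sketch of the "start using it" step, an R-universe can be passed to `install.packages()` as an extra repository. The helper name `universe_repo()` is made up for this illustration, and the choice of tsbox and of the CRAN mirror are assumptions rather than something taken from Jeroen's post.

```r
# Hypothetical helper (not part of any package): build the repository URL
# for an R-universe owner, following the https://<owner>.r-universe.dev pattern.
universe_repo <- function(owner) {
  sprintf("https://%s.r-universe.dev", owner)
}

# Repositories to search: the rOpenSci universe first, then a CRAN mirror
# for dependencies that are not in the universe.
repos <- c(
  ropensci = universe_repo("ropensci"),
  CRAN = "https://cloud.r-project.org"
)

# Installing needs network access, so it only runs in an interactive session.
# tsbox is used as an example because it is one of the packages in this digest.
if (interactive()) {
  install.packages("tsbox", repos = repos)
}
```

Swapping `"ropensci"` for another owner name points the same call at that maintainer's or organization's universe.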
That post was also discussed on the R Weekly highlights podcast hosted by Eric Nantz and Mike Thomas!
Join us for social coworking & office hours monthly on first Tuesdays! Hosted by Steffi LaZerte and various community hosts. Everyone welcome. No RSVP needed. Consult our Events page to find your local time and how to join.
- Tuesday, Apr 4th, 14:00 European Central / 12:00 UTC “Working with taxonomic lists in R” Hosted by community host Miguel Alvarez and Steffi LaZerte
And remember, you can always cowork independently on work related to R, work on packages that tend to be neglected, or work on whatever you need to get done!
- Tuesday, May 2nd, 9:00 Americas Pacific / 16:00 UTC Tentative theme: “Spring Cleaning for R packages and scripts” Hosted by community host TBD and Steffi LaZerte
- Explore how other organizations keep their scripts/packages nice and clean
- Take a look at your R packages and scripts and give them a good spring cleaning*
- Talk to our community host and other attendees and discuss tips for keeping on top of it all.
* in the northern hemisphere at least, otherwise, give them a good fall cleaning!
The following four packages recently became a part of our software suite:
openalexR, developed by Massimo Aria together with Trang Le: A set of tools to extract bibliographic content from OpenAlex database using API https://docs.openalex.org. It is available on CRAN. It has been reviewed by Brianna Lind and Pachá (aka Mauricio Vargas Sepúlveda).
rb3, developed by Wilson Freitas together with Marcelo Perlin: Download and parse public files released by B3 and convert them into useful formats and data structures common to data analysis practitioners. It is available on CRAN. It has been reviewed by Mario Gavidia Calderón and Pachá (aka Mauricio Vargas Sepúlveda).
tsbox, developed by Christoph Sax: Time series toolkit with identical behavior for all time series classes: ts, xts, data.frame, data.table, tibble, zoo, timeSeries, tsibble, tis or irts. Also converts reliably between these classes. It is available on CRAN. It has been reviewed by Cathy Chamberlin and Nunes Matt.
waywiser, developed by Michael Mahoney: Assessing predictive models of spatial data can be challenging, both because these models are typically built for extrapolating outside the original region represented by training data and due to potential spatially structured errors, with “hot spots” of higher than expected error clustered geographically due to spatial structure in the underlying data. Methods are provided for assessing models fit to spatial data, including approaches for measuring the spatial structure of model errors, assessing model predictions at multiple spatial scales, and evaluating where predictions can be made safely. Methods are particularly useful for models fit using the tidymodels framework. It is available on CRAN. It has been reviewed by Virgilio Gómez-Rubio and Jakub Nowosad.
Discover more packages, read more about Software Peer Review.
The following fifteen packages have had an update since the last newsletter: c14bazAAR (3.4.1), dynamite (1.2.0), FedData (v3.0.3), geojsonio (v0.11.0), lingtypology (v1.1.12), mctq (v0.3.2), osmdata (v0.2.1), pathviewr (v1.1.7), qualR (v0.9.7), rredlist (v0.7.1), spocc (v1.2.1), tarchetypes (0.7.5), targets (0.14.3), webmockr (v0.9.0), and xslt.
Software Peer Review
There are fifteen recently closed and active submissions and two submissions on hold. Issues are at different stages:
Two at ‘6/approved’:
openalexR, Getting Bibliographic Records from OpenAlex Database Using DSL. Submitted by Trang Le.
tsbox, Class-Agnostic Time Series. Submitted by Christoph Sax. (Stats).
Three at ‘4/review(s)-in-awaiting-changes’:
concstats, Market Structure, Concentration and Inequality Measures. Submitted by Andreas Schneider.
wmm, World Magnetic Model. Submitted by Will Frierson.
octolog, Better Github Action Logging. Submitted by Jacob Wujciak-Jens.
Four at ‘3/reviewer(s)-assigned’:
credit, Generate CRediT Author Statements. Submitted by Josep Pueyo-Ros.
predictNMB, Evaluate Clinical Prediction Models by Net Monetary Benefit. Submitted by Rex Parsons. (Stats).
dfms, Dynamic Factor Models. Submitted by Sebastian Krantz.
waywiser, Ergonomic Methods for Assessing Spatial Models. Submitted by Michael Mahoney. (Stats).
Four at ‘2/seeking-reviewer(s)’:
birdsize, Estimate Avian Body Size Distributions. Submitted by Renata Diaz.
dwctaxon, Tools for Working with Darwin Core Taxon Data. Submitted by Joel Nitta.
ohun, Optimizing Acoustic Signal Detection. Submitted by Marcelo Araya-Salas.
bssm, Bayesian Inference of Non-Linear and Non-Gaussian State Space. Submitted by Jouni Helske. (Stats).
Two at ‘1/editor-checks’:
pangoling, Access to Large Language Model Predictions. Submitted by Bruno Nicenboim.
qualtdict, Generating Variable Dictionaries and Labelled Data Exports of Qualtrics. Submitted by lyh970817.
Find out more about Software Peer Review and how to get involved.
On the blog
rOpenSci Champions Program Kick off by Yanina Bellini Saibene. The champions program has already started the first activities of 2023. Read where the participants are from and what they will be doing.
Puntapié inicial de nuestro programa de campeonas y campeones by Yanina Bellini Saibene. El programa de campeones y campeonas ya inició las primeras actividades de este 2023. Lee de donde son los y las participantes y que van a estar haciendo.
Meeting the Stars of the R-universe: ThinkR’s Approach to Contributing to a Growing and Friendly R Community by Yanina Bellini Saibene, Sébastien Rochette, Alejandra Bellini, Lucio Casalla, and Steffi LaZerte. A new installment of our interview series “Meeting the stars of the R-Universe”. We go to France to get a closer look at the work of the people at ThinkR.
Aprender, ayudar y compartir. El método de ThinkR para crear una comunidad cada vez más grande y amigable de R by Yanina Bellini Saibene, Sébastien Rochette, Alejandra Bellini, Lucio Casalla, and Steffi LaZerte. Una nueva entrega de nuestra serie de entrevistas “Conociendo a las estrellas del universo R”. Nos vamos a Francia para conocer más de cerca el trabajo que hace la gente de ThinkR.
Enseigner, aider et partager. L’approche de ThinkR pour contribuer à la croissance d’une communauté R conviviale by Yanina Bellini Saibene, Sébastien Rochette, Alejandra Bellini, Lucio Casalla, and Steffi LaZerte. Un nouvel entretien de notre série “Meeting the stars of the R-Universe”. Nous allons en France voir de plus près le travail de ThinkR.
Discovering and learning everything there is to know about R packages using r-universe by Jeroen Ooms. The goal of r-universe is to provide a central place for browsing through the R ecosystem to discover what is out there, get a sense of the purpose and quality of individual packages, and help you get started in seconds.
Descubrir y aprender todo lo que hay que saber sobre los paquetes de R utilizando r-universe by Jeroen Ooms. El objetivo de r-universe es proporcionar un lugar central para navegar por el ecosistema de R y descubrir lo que existe; hacerse una idea de la finalidad y la calidad de cada paquete, y ayudar a empezar en cuestión de segundos.
Call for (co)maintainers
Call for maintainers
If you’re interested in maintaining any of the R packages below, you might enjoy reading our blog post What Does It Mean to Maintain a Package? (or listening to its discussion on the R Weekly highlights podcast hosted by Eric Nantz and Mike Thomas)!
rvertnet. Retrieve, map and summarize data from the VertNet.org archives (http://vertnet.org/). Functions allow searching by many parameters, including taxonomic names, places, and dates. In addition, there is an interface for conducting spatially delimited searches, and another for requesting large datasets via email. Issue for volunteering.
natserv. Interface to NatureServe (https://www.natureserve.org/). Includes methods to get data, image metadata, search taxonomic names, and make maps. Issue for volunteering.
sofa. Provides an interface to the NoSQL database CouchDB (http://couchdb.apache.org). Methods are provided for managing databases within CouchDB, including creating/deleting/updating/transferring, and managing documents within databases. One can connect with a local CouchDB instance, or a remote ‘CouchDB’ database such as Cloudant. Documents can be inserted directly from vectors, lists, data.frames, and JSON. Targeted at CouchDB v2 or greater. Issue for volunteering.
citesdb, a high-performance database of shipment-level CITES trade data. Provides convenient access to over 40 years and 20 million records of endangered wildlife trade data from the Convention on International Trade in Endangered Species of Wild Fauna and Flora, stored on a local on-disk, out-of-memory ‘DuckDB’ database for bulk analysis. Issue for volunteering.
Call for comaintainers
rtweet, which interfaces with the Twitter API, is looking for a co-maintainer.
Refer to our recent blog post to identify other packages where help is especially welcome!
Package development corner
Some useful tips for R package developers. 👀
R Consortium’s call for proposals!
The R Consortium’s Internal Steering Committee has a call for proposals open until April 1st.
- The funds can be used for different sizes of projects. The project must have a software development component.
- Proofs of concept are not funded. No scientific publications or equipment.
- The idea is that the funds can cover people’s time to develop software.
This might be relevant for your R package work so make sure to read the call, and good luck if you send a proposal! 🚀
To cache, or not to cache testthat results?
Have you ever wished you could cache testthat results? You’ll find arguments both in favor of and against that idea in this testthat issue – testthat maintainer Hadley Wickham being against the idea.
You might be interested in Kirill Müller’s experimental package lazytest that helps you rerun only the tests that have failed during the last run.
Check if an R package name is available
pak::pkg_name_check() by Gábor Csárdi can be viewed as a replacement for the available package. It has a very nice output.
(Also keep in mind that our pkgcheck::pkgcheck() function reports on potentially duplicated function names.)
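Here is a minimal sketch of how such a check might look in practice. The name `"mynewpkg"` and the helper `is_valid_pkg_name()` are placeholders invented for this example; the helper only encodes the naming rules from "Writing R Extensions" (ASCII letters, numbers and dots, at least two characters, starting with a letter and not ending in a dot), so an invalid name can be rejected locally before asking pak.

```r
# Hypothetical helper (not part of pak): check the basic syntax rules
# for R package names described in "Writing R Extensions".
is_valid_pkg_name <- function(name) {
  grepl("^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$", name)
}

# Placeholder name, chosen only for illustration.
candidate <- "mynewpkg"

# pak::pkg_name_check() needs the pak package and network access,
# so the call is only made in an interactive session.
if (is_valid_pkg_name(candidate) &&
    interactive() &&
    requireNamespace("pak", quietly = TRUE)) {
  print(pak::pkg_name_check(candidate))
}
```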
What if your httptest mock files are suddenly ignored?
Imagine you’ve set up HTTP testing in your package with httptest and all goes well until, one day, the httptest mock files are ignored. Don’t panic! Check whether the calls that are mocked are still made with httr. Maybe one of your package’s dependencies upgraded its stack? If the calls are now made with httr2, the tests need to be updated to httptest2, which thankfully isn’t too hard.
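One rough way to run that check is to look at which HTTP client a dependency declares in its DESCRIPTION. The sketch below is only a heuristic: `http_clients()` is a helper made up for this example, `"somedependency"` is a placeholder package name, and it inspects the `Imports` field only, so it will miss clients pulled in indirectly.

```r
# Hypothetical helper: given the Imports field of a DESCRIPTION file,
# return which of the HTTP clients "httr" and "httr2" it lists.
http_clients <- function(imports_field) {
  if (is.null(imports_field) || is.na(imports_field)) {
    return(character(0))
  }
  pkgs <- strsplit(imports_field, ",")[[1]]
  # Drop version requirements such as "(>= 1.0.0)" and surrounding whitespace.
  pkgs <- trimws(sub("\\(.*\\)", "", pkgs))
  intersect(c("httr", "httr2"), pkgs)
}

# "somedependency" is a placeholder; replace it with the installed package
# that actually performs the HTTP calls your tests are mocking.
if (requireNamespace("somedependency", quietly = TRUE)) {
  print(http_clients(utils::packageDescription("somedependency")$Imports))
}
```

If the result contains `"httr2"` but not `"httr"`, that is a hint that the mocks recorded with httptest no longer match the requests being made.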
Updates to package checks
We added one new check this month to our pkgcheck system, specifically for statistics packages. Standards are expected to be documented with the srr package throughout the entire code of a package, including within all or most files in the /R and /tests directories. Having documentation distributed throughout code is particularly important to enable reviewers to judge compliance with standards at the relevant locations within the code. Packages which leave a large portion of standards documentation in a default location within a single file now produce an error when checked with pkgcheck, as well as with the srr function, srr_stats_pre_submit.
Thanks for reading! If you want to get involved with rOpenSci, check out our Contributing Guide that can help direct you to the right place, whether you want to make code contributions, non-code contributions, or contribute in other ways like sharing use cases.
If you haven’t subscribed to our newsletter yet, you can do so via a form. Until it’s time for our next newsletter, you can keep in touch with us via our website and Mastodon account.