January 2026 Top 40 New CRAN Packages

[This article was first published on R Works, and kindly contributed to R-bloggers].

Two hundred forty-one of the new packages submitted to CRAN in January were still there in mid-February. Here are my Top 40 picks in nineteen categories: Artificial Intelligence, Computational Methods, Data, Dynamical Systems, Ecology, Economics, Epidemiology, Finance, Genetics, Genomics, High Performance Computing, Mathematics, Machine Learning, Medical Applications, Networks, Statistics, Time Series, Utilities, and Visualization.

Artificial Intelligence

kuzco v0.1.0: Provides functions to make computer vision tasks approachable in R by leveraging Large Language Models, including fine-tuned prompts, boilerplate functions, and input/output helpers for common computer vision workflows, such as classifying and describing images. Functions are designed to take images as input and return structured data, helping users build practical applications with minimal code. There are four vignettes, including getting started and batch image processing.

A sad puppy start with kuzco.

Computational Methods

couplr v1.0.10: Provides functions that solve optimal pairing and matching problems using linear assignment algorithms. They are designed for matching plots, sites, samples, or any pairwise optimization problem. Several algorithms are provided, including the Hungarian method Kuhn (1955), the Jonker-Volgenant shortest path algorithm Jonker and Volgenant (1987), and the auction algorithm Bertsekas (1988). There are six vignettes, including Quick Start and The Algorithm Collection.

Illustration of Hungarian method for optimal matching of two sets of points
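
To make the underlying problem concrete, here is a minimal base R sketch of a linear assignment problem solved by brute force. It is not couplr's API and not the Hungarian method: it simply enumerates every one-to-one pairing of a small, made-up cost matrix and keeps the cheapest. The matrix values and the helper name permutations() are assumptions for illustration only.

```r
# Brute-force solver for a tiny linear assignment problem (illustration only).
# cost[i, j] is the (hypothetical) cost of pairing row item i with column item j.
cost <- matrix(c(4, 1, 3,
                 2, 0, 5,
                 3, 2, 2), nrow = 3, byrow = TRUE)

# Hypothetical helper: all permutations of a vector, built recursively.
permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in permutations(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

perms <- permutations(seq_len(ncol(cost)))

# Total cost when row i is paired with column p[i].
totals <- vapply(perms, function(p) sum(cost[cbind(seq_along(p), p)]), numeric(1))

best      <- perms[[which.min(totals)]]  # column chosen for each row
best_cost <- min(totals)
```

Enumeration grows factorially with the number of items, which is why dedicated algorithms such as those listed above matter for anything beyond toy examples.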

hexify v0.3.10: Implements the ISEA discrete global grid system (Sahr, White and Kimerling (2003)). Includes a fast C++ core for projection and aperture quantization, and sf/terra-compatible R wrappers for grid generation and coordinate assignment. Output is compatible with dggridR for interoperability. There are three vignettes: Quick Start, Visualization, and Practical Workflows.

Map of Europe with hexagonal grid overlay

Data

dsBaseClient v6.3.5: Provides client-side base functions for the DataSHIELD software suite, which allows non-disclosive federated analysis on sensitive data. Functions have been designed to only share non-disclosive summary statistics, with built-in automated output checking based on statistical disclosure control with data sites setting the threshold values for the automated output checks. See README to get started.

mongolstats v0.1.1: Provides a tidyverse-friendly client for the National Statistics Office of Mongolia PXWeb API with helpers to discover tables and variables and to fetch statistical data. Also includes utilities to retrieve Mongolia administrative boundaries (ADM0-ADM2) as sf objects from open sources for mapping and spatial analysis. There are five vignettes, including Getting Started and Discovering Public Health Data.

read.abares v2.0.0: Provides functions to download and import agricultural data from the Australian Bureau of Agricultural and Resource Economics and Sciences (ABARES) and the Australian Bureau of Statistics (ABS). Data types serviced include spreadsheets, comma-separated value (CSV) files, and geospatial data, including shapefiles and GeoTIFFs, covering topics including broadacre crops, livestock, soil data, commodities, and more. See the vignettes, Setting Global Options and Working with spatial data.

Maps showing ABARES regions

Dynamical Systems

blvim v0.1.1: Provides functions to efficiently estimate Boltzmann–Lotka–Volterra (BLV) interaction models. Enables programmatic and graphical exploration of the solution space of BLV models when parameters are varied. See Wilson, A. (2008) for background and the vignettes: Systematic exploration of the BLV solution space and Theoretical Background.

Plots showing cluster variability

Ecology

mrangr v1.0.1: Implements a mechanistic and spatially explicit simulator of metacommunities and extends the rangr package by adding the ability to simulate multiple species interacting through an asymmetric matrix of pairwise relationships, allowing users to model all types of biotic interactions (competitive, facilitative, or neutral) within spatially explicit virtual environments. See the vignette.

A sequence of plots that step through a species simulation

Epidemiology

rsurvstat v0.1.4: Provides an interface to the SurvStat web service from the Robert Koch Institute, allowing downloads of disease time series stratified by pathogen type and subtype, age, and geography from notifiable disease reports in Germany. See the vignette.

Plot of weekly incidence of Enterovirus

Economics

emburden v0.6.1: Provides functions to calculate and analyze household energy burden using the Net Energy Return aggregation methodology. Functions support weighted statistical calculations across geographic and demographic cohorts, with utilities for formatting results into publication-ready tables. Methods are based on Scheier & Kittner (2022). There are three vignettes including Getting Started and Temporal Energy Burden Analysis.

Finance

cre.dcf v0.0.3: Provides utilities to build unlevered and levered discounted cash flow tables for commercial real estate assets. Functions generate bullet and amortising debt schedules, compute credit metrics such as debt coverage ratios, debt service coverage ratios, interest coverage ratios, debt yield ratios, and forward loan-to-value ratios based on net operating income. The toolkit evaluates refinancing feasibility under alternative market scenarios and supports end-to-end scenario execution. There are nine vignettes including Getting Started and Investment styles panorama.

Plot of risk return by investment style
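
As a rough illustration of the credit metrics mentioned above, the following base R sketch computes them for a single year under common textbook definitions. It does not use cre.dcf, and the package's own definitions may differ; all input values and variable names are hypothetical.

```r
# Hypothetical inputs for one stabilised year (not taken from cre.dcf).
noi        <- 500000    # net operating income
loan       <- 6000000   # outstanding loan balance
value      <- 10000000  # asset value
rate       <- 0.05      # annual interest rate
term_years <- 20        # amortisation term in years

# Bullet loan: interest only each year, principal repaid at maturity.
bullet_service <- loan * rate

# Amortising loan: constant annual payment from the standard annuity formula.
amort_service <- loan * rate / (1 - (1 + rate)^(-term_years))

# Assumed definitions: coverage ratios divide NOI by the relevant payment.
metrics <- c(
  dscr_bullet = noi / bullet_service,  # debt service coverage, bullet loan
  dscr_amort  = noi / amort_service,   # debt service coverage, amortising loan
  icr         = noi / (loan * rate),   # interest coverage (first-year interest)
  debt_yield  = noi / loan,
  ltv         = loan / value           # loan-to-value
)
round(metrics, 3)
```

The amortising schedule carries a higher annual payment than the bullet schedule, so its coverage ratio is lower for the same income.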

OptimalBinningWoE v1.0.8: Implements 36 high-performance binning algorithms for Weight of Evidence transformation in credit scoring and risk modeling, including advanced methods such as Mixed Integer Linear Programming, Genetic Algorithms, Simulated Annealing, and Monotonic Regression. Features automatic method selection based on information value maximization, strict monotonicity enforcement, and efficient handling of large datasets. Fully integrated with the tidymodels ecosystem for building robust machine learning pipelines. Based on methods described in Siddiqi (2006) and Navas-Palencia (2020). The vignette demonstrates practical applications using real-world credit data.

Plot of Scorecard ROC curve with optimal binning
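
For readers new to the terminology, this base R sketch shows how Weight of Evidence (WoE) and Information Value (IV) are commonly computed once bins are fixed. It does not call OptimalBinningWoE and performs no bin optimization; the counts are invented, and the ln(good share / bad share) sign convention is an assumption, since some references use the reverse.

```r
# Hypothetical counts of good and bad loans in three pre-defined bins.
bins <- data.frame(
  bin  = c("low", "mid", "high"),
  good = c(100, 300, 600),
  bad  = c(50, 30, 20)
)

# Share of all goods and of all bads that fall in each bin.
dist_good <- bins$good / sum(bins$good)
dist_bad  <- bins$bad / sum(bins$bad)

# Weight of Evidence per bin (assumed convention: log of good share over bad share).
bins$woe <- log(dist_good / dist_bad)

# Information Value: sum over bins of (good share - bad share) * WoE.
iv <- sum((dist_good - dist_bad) * bins$woe)

bins
iv
```

In this toy table the WoE values increase from the first bin to the last, which is the kind of monotonic pattern that binning algorithms with monotonicity enforcement aim to guarantee.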

Genetics

bifrost v0.1.3: Implements methods for detecting and visualizing cladogenic shifts in multivariate trait data on phylogenies. Implements penalized-likelihood multivariate generalized least squares models, enabling analyses of high-dimensional trait datasets and large trees. Includes a greedy step-wise shift-search algorithm following approaches developed in Smith et al. (2023), Berv et al. (2024), and methods described in Clavel et al. (2019). See the vignette.

Plot showing phylogenetic traitgram of the first principal component (PC1) of lower jaw shape in early osteichthyans (bony fishes)

visPedigree v1.0.1: Provides tools for tidying, analyzing, and visualizing animal pedigrees by modeling pedigrees as directed acyclic graphs. This ensures robust loop detection, efficient generation assignment, and optimal sub-population splitting. Key features include standardizing pedigree formats, flexible ancestry tracing, and generating legible vector-based PDF graphs. A unique compaction algorithm enables the visualization of massive pedigrees by grouping full-sib families. There are three vignettes, including How to draw a pedigree and Calculation and visualization of relationship matrix.

Pedigree plot

Genomics

OmicNetR v0.1.1: Provides an end-to-end workflow for integrative analysis of two omics layers using sparse canonical correlation analysis, including sample alignment, feature selection, network edge construction, and visualization of gene-metabolite relationships. The underlying methods are based on penalized matrix decomposition and sparse CCA Witten, Tibshirani and Hastie (2009) with design principles inspired by multivariate integrative frameworks such as mixOmics Rohart et al. (2017). See the vignette.

Example plot of bipartite omic data

High Performance Computing

futurize v0.1.0: Provides a straightforward path to scalable parallel computing via the future ecosystem. The futurize() function, which transpiles calls to sequential map-reduce functions, is combined with R’s native pipe operator to provide a way for speeding up iterative computations with minimal refactoring, e.g., lapply(xs, fcn) |> futurize(), purrr::map(xs, fcn) |> futurize(), and foreach::foreach(x = xs) %do% { fcn(x) } |> futurize(). Other map-reduce packages that can be “futurized” are BiocParallel, plyr, and crossmap. There is also support for a growing set of domain-specific packages, including boot, glmnet, mgcv, lme4, and tm. See README to get started. There are twelve vignettes, including Parallelize base-R apply functions and Parallelize purrr functions.
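
The pattern quoted above can be sketched as follows. This assumes the futurize and future packages are installed and that two local background R sessions are an acceptable backend; the inputs xs and fcn are placeholders, and the fallback branch exists only so the sketch still runs without the packages.

```r
xs  <- 1:8                  # placeholder inputs
fcn <- function(x) x^2      # placeholder function to apply

if (requireNamespace("futurize", quietly = TRUE) &&
    requireNamespace("future", quietly = TRUE)) {
  # Assumption: a multisession backend with two workers.
  old_plan <- future::plan(future::multisession, workers = 2)

  # Same call as the sequential version, piped into futurize().
  ys <- lapply(xs, fcn) |> futurize::futurize()

  future::plan(old_plan)    # restore the previous plan
} else {
  # Sequential fallback when the packages are not available.
  ys <- lapply(xs, fcn)
}

unlist(ys)
```

The appeal of this design is that the sequential call stays intact, so removing the pipe step returns the code to its original form.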

Mathematics

codyna v0.1.0: Performs analysis of complex dynamic systems with a focus on the temporal unfolding of patterns, changes, and state transitions in behavioral data. Supports both time series and sequence data and provides tools for the analysis and visualization of complexity, pattern identification, trends, regimes, sequence typology, as well as early warning signals. See the vignette.

Plot of time series with detected warnings

EmpiricalDynamics v0.1.2: Implements a comprehensive toolkit for discovering differential and difference equations from empirical time series data using symbolic regression. The package implements a complete workflow, from data preprocessing (including Total Variation Regularized differentiation for noisy economic data) through visual exploration of dynamical structure to symbolic equation discovery via genetic algorithms. Functions leverage a high-performance Julia backend, SymbolicRegression.jl, to provide industrial-grade robustness, physics-informed constraints, and rigorous out-of-sample validation. Designed for economists, physicists, and researchers studying dynamical systems from observational data. See the vignette.

Machine Learning

multiRL v0.2.3: Provides a general purpose toolbox for implementing Rescorla-Wagner models in multi-armed bandit tasks. As the successor and functional extension of the binaryRL package, multiRL modularizes the Markov Decision Process (MDP) into six core components that enable constructing custom models via intuitive if-else syntax and define latent learning rules for agents. See Wilson & Collins (2019) and look here for an overview.
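
As background, a Rescorla-Wagner learner in a two-armed bandit task can be sketched in a few lines of base R. This is not multiRL's interface; the parameter values, the softmax choice rule, and the variable names are assumptions chosen for illustration.

```r
set.seed(1)

alpha    <- 0.3          # learning rate (hypothetical value)
beta     <- 5            # softmax inverse temperature (hypothetical value)
p_reward <- c(0.2, 0.8)  # true reward probability of each arm (hypothetical)
n_trials <- 200

q       <- c(0.5, 0.5)   # initial value estimate for each arm
choices <- integer(n_trials)

for (t in seq_len(n_trials)) {
  # Softmax choice rule: arms with higher estimated value are chosen more often.
  p_choose <- exp(beta * q) / sum(exp(beta * q))
  arm      <- sample(2, size = 1, prob = p_choose)
  reward   <- rbinom(1, size = 1, prob = p_reward[arm])

  # Rescorla-Wagner update: move the estimate toward the observed reward
  # by a fraction alpha of the prediction error.
  q[arm] <- q[arm] + alpha * (reward - q[arm])
  choices[t] <- arm
}

q                # learned value estimates
table(choices)   # how often each arm was chosen
```

Because rewards are 0 or 1 and the learning rate lies between 0 and 1, the value estimates always stay within that range.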

rCISSVAE v0.0.4: Implements the Clustering-Informed Shared-Structure Variational Autoencoder, a deep learning framework for missing data imputation introduced in Khadem Charvadeh et al. (2025). The model accommodates all three types of missing data mechanisms: Missing Completely At Random, Missing At Random, and Missing Not At Random. There are seven vignettes, including a quick start guide and Handling Binary and Categorical Variables.

slideimp v0.5.4: Provides fast k-nearest neighbors (K-NN) and principal component analysis (PCA) imputation algorithms for missing values in high-dimensional numeric matrices, e.g., epigenetic data. For extremely high-dimensional data with ordered features, a sliding window approach for K-NN or PCA imputation is provided. Additional features include group-wise imputation (e.g., by chromosome), hyperparameter tuning with repeated cross-validation, multi-core parallelization, and optional subset imputation. See Josse and Husson (2016) for background and the vignette for an example.

SportMiner v0.1.0: Provides a toolkit for mining, analyzing, and visualizing scientific literature in sport science and includes functions for retrieving abstracts from Scopus, preprocessing text data, performing advanced topic modeling using Latent Dirichlet Allocation, Structural Topic Models, and Correlated Topic Models, and for creating publication-ready visualizations, including keyword co-occurrence networks and topic trends. See Blei et al. (2003), Roberts et al. (2014), and Blei and Lafferty (2007) for background. There are two vignettes: Getting Started and Text Mining and Topic Modeling for Sport Science Literature.

xplainfi v1.0.0: Provides a consistent interface for common feature importance methods as described in Ewald et al. (2024), including permutation feature importance, conditional and relative feature importance, leave one covariate out, and Shapley additive global importance as well as feature sampling mechanisms to support conditional importance methods. See the vignettes Getting Started, Feature Samplers, and Simulation Settings.

DAG for mediated effects DGP

Medical Applications

autoFlagR v1.0.0: Provides automated data quality auditing using unsupervised machine learning and AI-driven anomaly detection for data quality assessment. Primarily designed for Electronic Health Records (EHR) data, with benchmarking capabilities for validation and publication. Methods based on Liu et al. (2008) and Breunig et al. (2000). There are three vignettes, including Getting Started and Healthcare Data Quality Example.

Distribution of Anomaly Scores

repfun v0.1.2: Provides functions to mimic the style of traditional reporting macros for clinical trials. The purpose is to generate tables, listings, and figures that support clinical research. This package is well-suited for firms or individuals who wish to incorporate R without changing their ways of working, as it follows a traditional clinical research workflow. Invoke functions (instead of macros) to summarize data and produce formatted reports. This package differs from others in that it includes tools (wrappers) for both analyzing and reporting data. There are twenty-seven vignettes, including Global Reporting Setup and SAS Type Variable Expansion.

R plot with clinical trial style footnotes and formatting

Networks

flownet v0.1.2: Provides high-performance tools for transport modeling: network processing, route enumeration, and traffic assignment. The package implements the Path-Sized Logit model for traffic assignment Ben-Akiva and Bierlaire (1999), an efficient route enumeration algorithm, and provides powerful utility functions for (multimodal) network generation, consolidation/contraction, and/or simplification. See the vignette.

Visualization of assigned flows in a network

Statistics

bayesDiagnostics v0.1.0: Provides comprehensive tools for Bayesian model diagnostics and comparison, including prior sensitivity analysis, posterior predictive checks Gelman et al. (2013), advanced model comparison using Pareto-smoothed importance sampling leave-one-out cross-validation Vehtari et al. (2017), convergence diagnostics, and prior elicitation tools. Integrates with brms, rstan, and rstanarm packages. See README to get started and the vignette for an introduction.

gradLasso v0.1.1: Implements LASSO regression using gradient descent with support for Gaussian, Binomial, Negative Binomial, and Zero-Inflated Negative Binomial (ZINB) families. Features cross-validation for determining lambda, stability selection, and bootstrapping for confidence intervals. Methods described in Tibshirani (1996) and Meinshausen and Bühlmann (2010). Look here for a quick start and see the vignette for an introduction.

Stability selection plot
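
To illustrate the general idea of fitting a LASSO with gradient steps, here is a base R sketch for the Gaussian case only. It uses proximal gradient descent (ISTA) with soft-thresholding, which is one standard approach and not necessarily what gradLasso does internally; the simulated data, lambda value, and function names are all assumptions.

```r
set.seed(42)

# Simulated data (hypothetical): only predictors 1 and 4 truly matter.
n <- 200; p <- 5
X <- scale(matrix(rnorm(n * p), n, p))
beta_true <- c(2, 0, 0, -1.5, 0)
y <- as.numeric(X %*% beta_true + rnorm(n, sd = 0.5))
y <- y - mean(y)  # centre the response so no intercept is needed

# Soft-thresholding: the proximal operator of the L1 penalty.
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Minimise (1 / 2n) * ||y - X b||^2 + lambda * ||b||_1 by proximal gradient steps.
lasso_ista <- function(X, y, lambda, n_iter = 500) {
  n <- nrow(X)
  # Step size 1 / L, where L is the largest eigenvalue of X'X / n.
  L <- max(eigen(crossprod(X) / n, symmetric = TRUE, only.values = TRUE)$values)
  beta <- rep(0, ncol(X))
  for (k in seq_len(n_iter)) {
    grad <- -crossprod(X, y - X %*% beta) / n  # gradient of the squared-error term
    beta <- soft_threshold(beta - as.numeric(grad) / L, lambda / L)
  }
  beta
}

beta_hat <- lasso_ista(X, y, lambda = 0.2)
round(beta_hat, 3)
```

The soft-thresholding step is what sets small coefficients exactly to zero, which plain gradient descent on a smooth loss would not do.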

mfcurve v1.0.2: Implements multi-factor curve analysis for grouped data replicating and extending the functionality of the Stata mfcurve. See Krähmer (2023) and Simonsohn, Simmons, and Nelson (2020) for background. Functions for preprocessing, statistical testing, and visualization of results with confidence intervals are included. There is an Introduction.

Multi-factor curve analysis plot with confidence intervals

NMAR v0.1.2: Implements methods to estimate finite-population parameters under nonresponse that is not missing at random. Incorporates auxiliary information and user-specified response models, and supports independent samples and complex survey designs via objects from the survey package. See Qin, Leung and Shao (2002) and Riddles, Kim and Im (2016) for background. There are five vignettes, including exptilt nonparam theory and Empirical Likelihood.

pmrm v0.0.2: A progression model for repeated measures (PMRM) is a continuous-time nonlinear mixed-effects model for longitudinal clinical trials in progressive diseases. Unlike mixed models for repeated measures, which estimate treatment effects as linear combinations of additive effects on the outcome scale, PMRMs characterize treatment effects in terms of the underlying disease trajectory, yielding clinically interpretable quantities. See Raket (2022) and Kristensen (2016) for background. There are three vignettes: Models, Usage and Validation.

Predictions by trial arm

RSTr v1.1.4: Implements a Gibbs Sampler for Poisson or Binomial discrete spatial data for a variety of Spatiotemporal Conditional Autoregressive (CAR) models. Includes measures to prevent estimate over-smoothing through a restriction of model informativeness for select models. Also provides tools to load output and get median estimates. Methods are from Besag, York, and Mollié (1991), Gelfand and Vounatsou (2003), Quick et al. (2017), and Quick et al. (2021). There are twelve vignettes, including an Introduction and the CAR Models.

uniLasso v2.11: Fits a univariate-guided sparse regression (lasso) by a two-stage procedure. The first stage fits p separate univariate models to the response. The second stage gives more weight to the more important univariate features, and preserves their signs. It returns an object that inherits from class glmnet. See Chatterjee, Hastie and Tibshirani (2025) for details.

Time Series

rjd3toolkit v3.6.0: Implements an R interface to the JDemetra+ 3.x ecosystem of time series analysis software. It provides functions to create outlier regressors, define calendar regressors, fit Unobserved Components AutoRegressive Integrated Moving Average (UCARIMA) models, test for the presence of trading-day or seasonal effects, and set specifications for pre-adjustment and benchmarking when using rjd3x13 or rjd3tramoseats. See the JDemetra link above for details.

Utilities

automerge v0.3.1: Provides R bindings to the Automerge Conflict-free Replicated Data Type (CRDT) library, which enables automatic merging of concurrent changes without conflicts, making it ideal for distributed systems, collaborative applications, and offline-first architectures. See Kleppmann et al. (2019) for background. There are five vignettes, including Getting Started and Understanding CRDTs in Automerge.

h5lite v2.0.0.2: Implements an interface for the Hierarchical Data Format 5 (HDF5) library that bundles the necessary system libraries to ensure easy installation on all platforms. Features smart defaults that automatically map R objects (vectors, matrices, data frames) to efficient HDF5 types, removing the need to manage low-level details like data spaces or property lists. There are nine vignettes, including Getting Started and Parallel Processing.

softwareRisk v0.1.0: Provides functions that leverage the network-like architecture of scientific models together with software quality metrics to identify chains of function calls that are more prone to generating and propagating errors. Functions operate on tbl_graph objects representing call dependencies between functions (callers and callees) and compute risk scores for individual functions and for paths (sequences of function calls) based on cyclomatic complexity, in-degree, and betweenness centrality. Supports variance-based uncertainty and sensitivity analyses after Puy et al. (2022) to assess how risk scores change under alternative risk definitions. See the vignette.

Risk network for a Fortran model

Visualization

ggguides v1.1.4: Extends ggplot2 by providing one-liner functions for common legend and guide operations. Simplifies legend positioning, styling, wrapping, and collection across multi-panel plots created with patchwork or cowplot. There are five vignettes, including Getting Started and Styling & Customization.

Plot with rotated legend labels

gglycan v0.0.3: Extends ggplot2 to plot glycans following the symbol nomenclature for glycans using a standardized visual representation of glycan structures. See the vignette.

Sample glycan plot

ggsced v0.1.6: Extends ggplot2 to create publication-ready graphics with professional phase change lines, support for multiple baseline designs, and styling functions that follow Single-Case Experimental Design (SCED) visualization conventions. Key functions include adding phase change demarcation lines to existing plots and formatting axes with broken axis appearance, commonly used in single-case research. See the vignette.

Plot of Responding by Session for multiple conditions, faceted by participant

ggskewboxplots v1.0.0: Extends ggplot2 for creating skewed boxplots using several statistical methods, including those of Kimber (1990), Hubert and Vandervieren (2008), Adil et al. (2015), Babura et al. (2017), and Walker et al. (2018). See the vignette.

Boxplot using the Walker method

vbracket v1.1.0: Extends ggplot2 by adding publication-quality custom legends with vertical brackets. Designed for displaying statistical comparisons between groups, commonly used in scientific publications for showing significance levels. Features include adaptive positioning, automatic bracket spacing for overlapping comparisons, font family inheritance, and support for asterisks, p-values, or custom labels. Look here for examples.

Plots with brackets and statistical annotations
