Two hundred and twenty-nine new packages were submitted to CRAN in May. Here are my picks for the “Top 40”, organized into six categories: Data, Data Science and Machine Learning, Education, Miscellaneous, Statistics, and Utilities.
Data
datasauRus v0.1.2 (https://CRAN.R-project.org/package=datasauRus): The Datasaurus Dozen is a set of datasets that have the same summary statistics despite having radically different distributions. As well as being an engaging variant on Anscombe’s Quartet, the data is generated in a novel way through a simulated annealing process. Look here for details, and in the vignette for examples.
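The "same summary statistics, different distributions" idea is easy to check with Anscombe’s Quartet itself, which ships with base R as the `anscombe` data frame. The sketch below deliberately avoids the datasauRus package so that it runs without installing anything; the name `quartet_summary` is my own.

```r
# Anscombe's Quartet is stored in wide form: columns x1..x4 and y1..y4
x_cols <- paste0("x", 1:4)
y_cols <- paste0("y", 1:4)

# Compute the usual summary statistics for each of the four x/y pairs
quartet_summary <- data.frame(
  set    = 1:4,
  mean_x = sapply(anscombe[x_cols], mean),
  mean_y = sapply(anscombe[y_cols], mean),
  sd_x   = sapply(anscombe[x_cols], sd),
  sd_y   = sapply(anscombe[y_cols], sd),
  cor_xy = mapply(function(x, y) cor(anscombe[[x]], anscombe[[y]]),
                  x_cols, y_cols),
  row.names = NULL
)

# The four rows are nearly identical after rounding
print(round(quartet_summary, 2))
```

Plotting each pair (for example with `plot(anscombe$x1, anscombe$y1)`) shows how different the four sets really are, which is the point the Datasaurus Dozen makes with thirteen datasets.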
HURDAT v0.1.0: Provides datasets from the Hurricane Research Division’s Hurricane Re-Analysis Project, giving details for most known hurricanes and tropical storms for the Atlantic and northeastern Pacific Ocean (northwestern hemisphere). The vignette describes the datasets.
parlitools v0.0.4: Provides various tools for analyzing UK political data, including creating political cartograms and retrieving data. There is an Introduction, and vignettes on the British Election Study, Mapping Local Authorities, and Using Cartograms.
suncalc v0.1: Implements an R interface to the ‘suncalc.js’ library, part of the SunCalc.net project, for calculating sun position, sunlight phases, moon position and lunar phase for a given location and time.
Data Science and Machine Learning
openEBGM v0.1.0: Provides an implementation of DuMouchel’s Bayesian data mining method for the market basket problem. There is an Introduction, and vignettes for Processing Raw Data, Hyperparameter Estimation, Empirical Bayes Metrics, and Objects and Class Functions.
Education
learnr v0.9: Provides functions to create interactive tutorials for learning about R and R packages with R Markdown, using a combination of narrative, figures, videos, exercises, and quizzes. Look here to get started.
olsrr v0.2.0: Provides tools for teaching and learning ordinary least squares regression. There is an Introduction and vignettes on Heteroskedasticity, Measures of Influence, Collinearity Diagnostics, Residual Diagnostics and Variable Selection Methods.
rODE v0.99.4: Contains functions to show students how an ODE solver is made and how classes can be effective for constructing equations that describe natural phenomena. Have a look at the free book Computer Simulations in Physics. There are several vignettes providing brief examples, including one on the Pendulum and another on Planets.
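To give a flavor of what "how an ODE solver is made" can look like, here is a minimal base R sketch of a classical fourth-order Runge-Kutta solver applied to a simple pendulum. It does not use rODE or its class system; the function names (`rk4_step`, `pendulum`, `solve_ode`) and the parameter values are my own assumptions for illustration.

```r
# One classical fourth-order Runge-Kutta step for y' = f(t, y)
rk4_step <- function(f, t, y, h) {
  k1 <- f(t, y)
  k2 <- f(t + h / 2, y + h / 2 * k1)
  k3 <- f(t + h / 2, y + h / 2 * k2)
  k4 <- f(t + h, y + h * k3)
  y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
}

# Simple pendulum: state is c(theta, omega), theta'' = -(g / len) * sin(theta)
# g and len are illustrative defaults (m/s^2 and m)
pendulum <- function(t, y, g = 9.81, len = 1) {
  c(y[2], -(g / len) * sin(y[1]))
}

# Step from t0 to t1 with a fixed step size h, storing every state
solve_ode <- function(f, y0, t0, t1, h) {
  times <- seq(t0, t1, by = h)
  out <- matrix(NA_real_, nrow = length(times), ncol = length(y0))
  colnames(out) <- c("theta", "omega")
  out[1, ] <- y0
  for (i in seq_along(times)[-1]) {
    out[i, ] <- rk4_step(f, times[i - 1], out[i - 1, ], h)
  }
  cbind(time = times, out)
}

# Release the pendulum from rest at 0.2 radians and follow it for 5 seconds
trajectory <- solve_ode(pendulum, y0 = c(0.2, 0), t0 = 0, t1 = 5, h = 0.01)
head(trajectory)
```

A useful classroom check is that the total energy, `0.5 * omega^2 - 9.81 * cos(theta)` for unit length and mass, stays almost constant along the computed trajectory.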
Miscellaneous
atlantistools v0.4.2: Provides access to the Atlantis framework for end-to-end marine ecosystem modelling. There is a package demo and vignettes for model preprocessing, model calibration, species calibration, and model comparison.
Statistics
phylodyn v0.9.0: Provides statistical tools for reconstructing population size from genetic sequence data. There are several vignettes including a Coalescent simulation of genealogies and a case study using New York Influenza data.
RPEXE.RPEXT v0.0.1: Implements the likelihood ratio test and backward elimination procedure for the reduced piecewise exponential survival analysis technique described in Han et al. 2012 and 2016. The vignette provides examples.
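For readers who have not met a likelihood ratio test in this setting, the following base R sketch tests whether the hazard rate is the same on either side of a single cut point in a piecewise exponential model. It is not the RPEXE.RPEXT API and not the authors' full procedure; it assumes simulated, uncensored survival times and a hypothetical, fixed cut point `tau`.

```r
set.seed(1)

# Simulated, uncensored survival times (illustrative only)
times <- rexp(200, rate = 0.5)
tau <- 1  # hypothetical cut point separating the two pieces

# Number of events and total time at risk in each piece
d1 <- sum(times <= tau)
d2 <- sum(times > tau)
exposure1 <- sum(pmin(times, tau))
exposure2 <- sum(pmax(times - tau, 0))

# Exponential log-likelihood for d events over a given exposure
loglik <- function(d, exposure, rate) d * log(rate) - rate * exposure

# Alternative: separate rates per piece (MLE is events / exposure)
ll_alt <- loglik(d1, exposure1, d1 / exposure1) +
  loglik(d2, exposure2, d2 / exposure2)

# Null: one common rate across both pieces
rate0 <- (d1 + d2) / (exposure1 + exposure2)
ll_null <- loglik(d1 + d2, exposure1 + exposure2, rate0)

# Likelihood ratio statistic, compared with a chi-square on 1 df
lrt_stat <- 2 * (ll_alt - ll_null)
p_value <- pchisq(lrt_stat, df = 1, lower.tail = FALSE)
c(statistic = lrt_stat, p_value = p_value)
```

A large p-value suggests the two pieces can share one rate, which is the kind of decision a backward elimination over cut points repeats until no more pieces can be merged.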
simglm v0.5.0: Provides functions to simulate linear and generalized linear models with up to three levels of nesting. There is an Introduction and vignettes for simulating GLMs and Missing Data, performing Power Analysis, and dealing with Unbalanced Data.
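As a rough idea of what simulating nested data involves, here is a base R sketch of a two-level linear model with a random intercept per group. It does not call simglm; all names and parameter values (20 groups, slope 0.5, and so on) are assumptions chosen for illustration.

```r
set.seed(42)

n_groups <- 20
n_per_group <- 10
group <- rep(seq_len(n_groups), each = n_per_group)

# Level-1 predictor and level-2 random intercepts
x <- rnorm(n_groups * n_per_group)
group_effect <- rnorm(n_groups, sd = 1)

# Outcome: fixed intercept 2, fixed slope 0.5, group intercept, residual noise
y <- 2 + 0.5 * x + group_effect[group] + rnorm(length(x), sd = 0.5)

sim_data <- data.frame(group = factor(group), x = x, y = y)

# Quick sanity check: recover the slope while adjusting for group
# (a mixed-model fit would be the usual choice; lm() keeps this dependency-free)
fit <- lm(y ~ x + group, data = sim_data)
coef(fit)["x"]
```

Repeating a simulation like this many times and recording how often the slope is detected is the basic recipe behind a simulation-based power analysis.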
Utilities
checkarg v0.1.0: Provides utility functions that allow checking the basic validity of a function argument or any other value, including generating an error and assigning a default in a single line of code.
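The "check, error, or default in one line" pattern can be sketched in base R as follows. The helper `check_positive_number()` and the function `scale_values()` are hypothetical names of my own and are not functions from checkarg; they only illustrate the style of call the package aims to make convenient.

```r
# Hypothetical helper (not from checkarg): validate a single positive number,
# fall back to a default when the value is NULL, and stop with a clear message otherwise
check_positive_number <- function(x, default = NULL,
                                  arg_name = deparse(substitute(x))) {
  if (is.null(x) && !is.null(default)) {
    return(default)
  }
  ok <- is.numeric(x) && length(x) == 1L && !is.na(x) && x > 0
  if (!ok) {
    stop("Argument '", arg_name, "' must be a single positive number.",
         call. = FALSE)
  }
  x
}

# Validation and default assignment happen in a single line of the function body
scale_values <- function(values, factor = NULL) {
  factor <- check_positive_number(factor, default = 1)
  values * factor
}

scale_values(c(1, 2))      # uses the default factor
scale_values(c(1, 2), 3)   # uses the supplied factor
```

Calling `scale_values(1, -2)` stops with an error that names the offending argument.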
desctable v0.1.0: Provides functions to create descriptive and comparative tables that are ready to be saved as CSV, or piped to pander::pander() to integrate into reports. There is a vignette to get you started.
lifelogr v0.1.0: Provides a framework for combining self-data from multiple sources, including Fitbit and Apple Health. There is a general introduction as well as an introduction for visualization functions.
processx v2.0.0: Portable tools to run system processes in the background.
RHPCBenchmark v0.1.0: Provides microbenchmarks for determining the run-time performance of aspects of the R programming environment, and packages that are relevant to high-performance computation. There is an Introduction.
rlang v0.1.1: Provides a toolbox of functions for working with base types, core R features like the condition system, and core ‘Tidyverse’ features like tidy evaluation. The vignette explains R’s capabilities for creating Domain Specific Languages.
readtext v0.50: Provides functions for importing and handling text files and formatted text files with additional meta-data, including ‘.csv’, ‘.tab’, ‘.json’, ‘.xml’, ‘.pdf’, ‘.doc’, ‘.docx’, ‘.xls’, ‘.xlsx’ and other file types. There is a vignette.
tangram v0.2.6: Provides an extensible formula system that implements a grammar of tables for creating production-quality tables using a three-step process that involves a formula parser, statistical content generation from data, and rendering. There are vignettes introducing the Grammar, setting a Global Style for Rmd, and duplicating SAS PROC Tabulate.
mbgraphic v1.0.0: Implements a two-step process for describing univariate and bivariate behavior, similar to the cognostics measures proposed by Paul and John Tukey. First, measures describing variables are computed, and then plots are selected. The vignette describes the details.