There were so many interesting ideas among the 222 new packages that made it to CRAN in September that I found it exceptionally difficult to decide on the “Top 40” packages. In the end, I only managed to limit my selection to 40 by avoiding all packages that I would normally classify under “Data”: packages that are primarily intended to provide access to some data source. I hope to make up for this by providing a list of data packages sometime soon.
Below are my picks for September’s Top 40 in six categories: Computational Methods, Machine Learning, Science, Statistics, Utilities, and Visualizations.
Rlinsolve v0.1.1: Implements iterative solvers for sparse linear systems of equations, including basic stationary solvers (Jacobi, Gauss-Seidel, Successive Over-Relaxation, and SSOR) and non-stationary Krylov subspace methods. There is a vignette to get started. Detailed descriptions may be found in the SIAM book.
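To give a feel for what a stationary iterative solver does, here is a minimal Jacobi iteration sketched in plain Python. It is an illustration only and not the Rlinsolve API: the function name `jacobi`, its parameters, and the small dense test system are all assumptions made for this example, and a real sparse solver would avoid the dense row sums used here.

```python
def jacobi(A, b, tol=1e-10, max_iter=1000):
    """Approximate the solution of A x = b by Jacobi iteration.

    A is a list of rows and b a list of floats. Convergence is only
    guaranteed for suitable matrices, e.g. strictly diagonally dominant ones.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        # Each component is updated using only the previous iterate.
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        # Stop once successive iterates agree to within the tolerance.
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new
        x = x_new
    return x


# A small, strictly diagonally dominant system whose exact solution is (1, 1, 1).
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]
solution = jacobi(A, b)
```

Gauss-Seidel differs only in that it uses the already-updated components within the same sweep, which usually speeds up convergence.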
sdpt3r v0.1: Implements the SDPT3 method of Toh, Todd, and Tutuncu to solve Semi-Definite Linear Programming problems. There are several vignettes illustrating the use of the package in various applications, including D-Optimal Experimental Design and Distance Weighted Discrimination.
VeryLargeIntegers v0.1.4: Provides tools to work with arbitrarily large integers without loss of precision.
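The phrase "without loss of precision" is easiest to appreciate with a counterexample. The sketch below uses Python, whose built-in integers happen to be arbitrary precision, to contrast exact integer arithmetic with double-precision floats. It does not use or mimic the VeryLargeIntegers interface; the variable names are purely illustrative.

```python
# An integer that needs 101 bits, more than the 53-bit significand of a double.
big = 2 ** 100 + 1

# Exact integer arithmetic keeps the trailing +1.
exact_difference = big - 2 ** 100

# Converting to a double rounds the value to the nearest representable
# number, which is 2**100, so the trailing +1 is lost.
lossy_difference = int(float(big)) - 2 ** 100
```

Here `exact_difference` is 1 while `lossy_difference` is 0, which is the kind of silent rounding an arbitrary-precision integer type avoids.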
bnclassify v0.3.3: Implements algorithms for learning discrete Bayesian network classifiers from data, including a number of those described in Bielza & Larranaga. There is an Introduction and vignettes giving Runtime Information and additional Technical Information.
DMRnet v0.1.0: Provides model selection algorithms for regression and classification, where the predictors can be numerical and categorical and the number of regressors exceeds the number of observations. See the papers by Maj-Kańska et al. and Pokarowski and Mielniczuk for the mathematical details.
FSelectorRcpp v0.1.8: Provides an Rcpp-based implementation of the FSelector entropy-based feature selection algorithms, based on Multi-Interval Discretization and with sparse matrix support. There are vignettes on Getting Started and Benchmarks.
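As a rough sketch of what entropy-based feature selection measures, the Python snippet below computes the information gain of a discrete feature with respect to a class label, using the identity H(class) + H(feature) - H(class, feature). It is not the FSelectorRcpp API: the helper names `entropy` and `information_gain` and the toy data are assumptions for illustration, and the sketch assumes the feature has already been discretized.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """Mutual information between a discrete feature and the class labels."""
    joint = list(zip(feature, labels))
    return entropy(labels) + entropy(feature) - entropy(joint)


labels = ["yes", "yes", "no", "no"]
informative = ["a", "a", "b", "b"]      # determines the label completely
uninformative = ["a", "b", "a", "b"]    # independent of the label

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
```

Features are then ranked by their gain: the first feature scores one full bit, the second scores zero, so a selector would keep the first and drop the second.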
googleLanguageR v0.1.0: Provides an interface to Google Cloud machine-learning APIs for text and speech tasks. Call the Cloud Translation API for detection and translation of text, the Natural Language API to analyse text for sentiment, entities or syntax, and the Cloud Speech API to transcribe sound files to text. There is an Introduction and vignettes for the NLP, Speech, and Translation APIs.
leabRa v0.1.0: Implements the Leabra (local, error-driven and associative, biologically realistic algorithm) that allows for the construction of artificial neural networks that are biologically realistic, and balances supervised and unsupervised learning within a single framework. See the vignette to get started and look here for details.
lime v0.3.0: Is a port of the Python package, which attempts to explain the outcome of black-box models by fitting local models around the points of interest. Look here for details. There is a vignette to get you started.
udpipe v0.1.1: Provides a natural-language-processing toolkit for tokenization, parts-of-speech tagging, lemmatization, and dependency parsing of raw text. For details, see this paper and the vignettes on Annotating Text and Model Building.
afpt v1.0.0: Implements the aerodynamic power model described in Klein Heerenbrink et al., and allows estimation and modelling of flight costs in vertebrate animal flight. There are vignettes on Basic Usage, the underlying Aerodynamic Model, and Multiple Birds.
cr17 v0.1.0: Provides tools for analyzing competing-risks models, including testing differences between groups (Gray; Fine and Gray) and visualizations of survival and cumulative incidence curves. The vignette gives examples.
fdAnova v0.1.0: Provides functions to perform analysis of variance testing procedures for univariate and multivariate functional data. See Cuesta-Albertos and Febrero-Bande. There is a comprehensive vignette.
geex v1.0.3: Provides a general, flexible framework for estimating parameters and computing the empirical sandwich variance estimator from a set of unbiased estimating equations (M-estimation, as described in Stefanski & Boos). There is an Introduction, as well as vignettes on M-estimation, Custom root solvers, Parameter Estimation, Software Design, and more.
mosaicModel v0.3.0: Provides functions for evaluating, displaying, and interpreting statistical models with the goal of abstracting the operations on models from the particular architecture of the model. The vignette shows how to use the package.
odr v0.3.2: Provides methods for calculating the optimal sample allocation that minimizes variance of treatment effects in a multilevel randomized trial under fixed budget and cost structure, and for performing power analyses with and without accommodating costs and budget. There is a vignette.
powerlmm v0.1.0: Implements both analytical and simulation methods to calculate power for two- and three-level multilevel longitudinal studies with missing data. The analytical calculations extend the method described in Galbraith et al. to three-level models. There are tutorials on Model Evaluation via Monte Carlo Simulation, Two-level Longitudinal Power Analysis, Three-level Longitudinal Power Analysis, and a vignette on the Details of Power Calculations.
randnet v0.1: Facilitates model-selection and parameter-tuning procedures for a class of random network models. Model selection can be done with a general cross-validation framework called ECV, as well as with NCV, a likelihood ratio method, and spectral methods.
tscount v1.4.0: Implements likelihood-based methods for model fitting and assessment, prediction, and intervention analysis of count time series following generalized linear models. The vignette provides the details.
basictabler v0.1.0: Provides functions to create tables from data frames and matrices, manipulate tables row-by-row, column-by-column or cell-by-cell, and then publish them using HTML widgets or Excel. There is an Introduction and vignettes on Working with Cells, Outputs, Styling, Formatting, Shiny, and Excel.
bigstatsr v0.2.2: Uses file-backed matrices to provide scalable statistical tools.
keyring v1.0.0: Provides a platform-independent API to access the operating system’s credential store. It currently supports the Keychain on macOS, the Credential Store on Windows, the Secret Service API on Linux, and a simple, platform-independent store implemented with environment variables.
pinp v0.0.2: Offers a PNAS-like style for rmarkdown derived from the Proceedings of the National Academy of Sciences of the United States of America. The vignette shows how to get started.
re2r v0.2.0: Provides an interface to Google’s deterministic finite-automaton-based regular expression engine that is very fast at matching large amounts of text. There is an Introduction and a vignette on Syntax.
spiderbar v0.2.0: Provides a wrapper for the rep-cpp C++ library for processing robots.txt files in accordance with the Robots Exclusion Protocol, a set of standards for allowing or excluding robot/spider crawling of different areas of site content. Look in the README for an example of how to use the package.
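For readers who have not worked with the Robots Exclusion Protocol, the following sketch shows the basic idea of checking a URL against a set of rules. It uses Python's standard-library `urllib.robotparser` rather than spiderbar or rep-cpp, and the rules, the `examplebot` user agent, and the `example.com` URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A tiny, hypothetical robots.txt: every crawler is barred from /private/.
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# Parse the rules from text instead of fetching them over the network.
parser.parse(rules.splitlines())

allowed_public = parser.can_fetch("examplebot", "https://example.com/index.html")
allowed_private = parser.can_fetch("examplebot", "https://example.com/private/data.html")
```

A well-behaved crawler would run a check like this before every request and skip any URL for which the answer is false.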
tibbletime v0.0.2: Is an extension of the tibble package that allows for the creation of time-aware tibbles. Some immediate advantages include performing time-based subsetting on tibbles, quickly summarising and aggregating results by time period, and calling functions similar in spirit to the map family from purrr on time-based tibbles. There is an Introduction and vignettes on Time-based Filtering, Changing Periodicity, and Rolling Calculations.
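The idea of aggregating results by time period can be sketched in a few lines of Python. This is a language-neutral illustration, not tibbletime code: the `monthly_mean` helper and the sample observations are hypothetical, and tibbletime itself works on tibbles with a designated time index rather than on lists of tuples.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily observations as (date, value) pairs.
observations = [
    (date(2017, 9, 1), 10.0),
    (date(2017, 9, 15), 20.0),
    (date(2017, 10, 3), 5.0),
]


def monthly_mean(rows):
    """Group rows by (year, month) and return the mean value per period."""
    buckets = defaultdict(list)
    for day, value in rows:
        buckets[(day.year, day.month)].append(value)
    return {
        period: sum(values) / len(values)
        for period, values in sorted(buckets.items())
    }


monthly = monthly_mean(observations)
```

Changing the grouping key, for example to `day.isocalendar()[:2]` for ISO weeks, changes the periodicity without touching the rest of the code.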
egg v0.2.0: Provides miscellaneous functions to customize ggplot2 plots, including high-level functions to post-process layouts and allow alignment between plot panels, as well as setting panel sizes to fixed values. There is an Overview and a vignette for laying out multiple plots on a page.