June 2020: “Top 40” New CRAN Packages
Two hundred ninety new packages made it to CRAN in June. Here are my “Top 40” picks in ten categories: Computational Methods, Data, Genomics, Machine Learning, Medicine, Science, Statistics, Time Series, Utilities, and Visualization.
Computational Methods
Rfractran v1.0: Implements the esoteric, Turing-complete FRACTRAN programming language invented by John Horton Conway.
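For readers unfamiliar with FRACTRAN, here is a minimal sketch of how the language executes, written in Python rather than with the Rfractran API. A program is an ordered list of fractions. At each step, the current integer is multiplied by the first fraction that yields an integer, and the program halts when no fraction does. The function name `run_fractran` and the `max_steps` safety limit are assumptions made for this illustration.

```python
from fractions import Fraction


def run_fractran(program, n, max_steps=10_000):
    """Run a FRACTRAN program (an ordered list of Fractions) on a positive integer n.

    max_steps is a hypothetical safety limit, since FRACTRAN programs need not halt.
    """
    for _ in range(max_steps):
        for f in program:
            product = n * f
            if product.denominator == 1:
                # First fraction giving an integer: apply it and restart the scan.
                n = product.numerator
                break
        else:
            # No fraction produced an integer, so the program halts.
            return n
    raise RuntimeError("step limit reached before the program halted")


# The one-fraction program 3/2 adds exponents: 2^a * 3^b becomes 3^(a+b).
adder = [Fraction(3, 2)]
print(run_fractran(adder, 2**3 * 3**2))  # 243, which is 3^5
```

The state of the computation lives entirely in the prime factorization of the integer, which is why a single fraction is enough to express addition.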
QGameTheory v0.1.2: Provides a general purpose toolbox for simulating quantum versions of game theoretic models. See Flitney and Abbott (2002) for background. Models include the Penny Flip Game Meyer (1998), the Prisoner’s Dilemma Grabbe (2005), Two Person Duel Flitney and Abbott (2004), Battle of the Sexes Nawaz and Toor (2004), Hawk and Dove Game Nawaz and Toor (2010), Newcomb’s Paradox Piotrowski and Sladkowski (2002) and the Monty Hall Problem Flitney and Abbott (2002). Look here for an introduction to the package.
Data
covid19dbcand v0.1.0: Provides access to seventy-five Drugbank data sets containing information about possible treatments for COVID-19.
tidytuesdayR v1.0.1: Provides functions for downloading the Tidy Tuesday data sets from the R for Data Science Online Learning Community repository.
us.census.geoheader v1.0.2: Implements a simple interface to the Geographic Header information from the 2010 US Census Summary File 2. See the vignette for details.
Genomics
dnapath v0.6.4: Provides functions to integrate pathway information into the differential network analysis of two gene expression datasets as described in Grimes et al. (2019). There is an Introduction and a vignette on Datasets.
TreeDist v1.1.1: Implements measures of tree similarity, including the information-based generalized Robinson-Foulds distances Smith (2020), the Nye et al. (2006) metric, and additional metrics. There are several vignettes: Generalized Robinson-Foulds distances, Extending the Robinson-Foulds metric, Calculate tree similarity, Comparing splits using information theory, and Contextualizing tree distances.
volcano3D v1.0.1: Implements interactive plotting for three-way differential expression analysis which is useful for discovering quantitative changes in expression levels between experimental groups. See Lewis et al. (2019) for background and the vignette.
Machine Learning
boundingbox v1.0.1: Provides functions to generate bounding boxes for image classification. See Ibrahim et al. (2012) for background and the vignette for an introduction to the package.
corels v0.0.2: Implements the Certifiably Optimal RulE ListS (Corels) learner described in Angelino et al. (2017), which provides interpretable decision rules with an optimality guarantee. The README contains an example.
nntrf v0.1.0: Implements non-linear dimension reduction by means of a neural network with hidden layers which can be useful as data pre-processing for machine learning methods that do not work well with many irrelevant or redundant features. See Rumelhart et al. (1986) for background and the vignette.
permimp v1.0-0: Implements an add-on to the party package, with a faster implementation of the partial-conditional permutation importance for random forests. See Strobl et al. (2007) for background and the vignette for an introduction.
pluralize v0.2.0: Provides tools based on a JavaScript library to create plural, singular, and regular forms of English words along with tools to augment the built-in rules to fit specialized needs. See the vignette for examples.
triplot v1.3.0: Provides model agnostic tools for exploring effects of correlated features in predictive models and calculating the importance of groups of explanatory variables. See Biecek (2018) for details and look here for an example.
tfaddons v0.10.0: Provides an interface to TensorFlow Addons. See the vignette for an example.
Medicine
BayesianReasoning v0.3.2: Provides functions to plot and help understand positive and negative predictive values (PPV and NPV), and their relationship with sensitivity, specificity, and prevalence. See Akobeng (2007) for a theoretical overview and Navarrete et al. (2015) for a practical explanation. There is an Introduction and a vignette on Screening Tests.
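The relationship among these quantities follows directly from Bayes' rule. The sketch below is plain Python, not the BayesianReasoning API. The function name `predictive_values` is hypothetical, and the inputs are assumed to be proportions strictly between 0 and 1.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test, given proportions strictly between 0 and 1."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative test)
    return ppv, npv


# A test with 90% sensitivity and 90% specificity at 1% prevalence.
ppv, npv = predictive_values(0.9, 0.9, 0.01)
print(round(ppv, 3), round(npv, 3))  # 0.083 0.999
```

Even a fairly accurate test yields a low PPV when the condition is rare, because false positives from the large healthy group outnumber true positives.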
riskCommunicator v0.1.0: Provides functions to estimate flexible epidemiological effect measures, including both differences and ratios, using the parametric G-formula. See Robins (1986) and Ahern et al. (2009) for background. There is an Introduction and a vignette for newbie R users.
Science
actel v1.0.0: Designed for studies where fish tagged with acoustic tags are expected to move through receiver arrays, this package combines the advantages of automatic sorting and checking of fish movements with the possibility for user intervention on tags that deviate from expected behavior. Calculations are based on Perry et al. (2012). There are an astounding seventeen vignettes including: Preparing your data, Structuring the Study Area, and Explore.
safedata v1.0.5: Provides access to data from the SAFE Project, a large scale ecological experiment in Malaysian Borneo that explores the impact of habitat fragmentation and conversion on ecosystem function and services. There is an Overview and an Introduction.
Statistics
causact v0.3.2: Built on greta and TensorFlow, this package enables users to define probabilistic models using directed acyclic graphs. See README for examples.
frechet v0.1.0: Provides implementations of statistical methods for random objects lying in various metric spaces, which are not necessarily linear spaces, including Fréchet regression for random objects with Euclidean predictors. See Petersen and Müller (2019) for the theory.
hmclearn v0.0.3: Provides a framework for learning the intricacies of Hamiltonian Monte Carlo. See Michael (2017) and Thomas and Tu (2020) for background. There are vignettes on Linear mixed effects, linear regression, logistic mixed effects, logistic regression, and Poisson regression.
mashr v0.2.38: Implements the multivariate adaptive shrinkage (mash) method of Urbut et al. (2019) for estimating and testing large numbers of effects in many conditions (or many outcomes). There is an Introduction and vignettes on eQTL studies, Correlations, Covariances, Common Baseline, No Common Baseline, Sampling from Posteriors, and Simulation.
molic v2.0.1: Implements the method of Lindskou et al. (2019) to detect outliers in high dimensional, categorical data. There are vignettes on the Outlier Model, Detecting Skin Diseases, and Genetic Data.
multinma v0.1.3: Uses Stan to fit network meta-analysis and network meta-regression models for aggregate data, individual patient data, and mixtures of both. See Phillippo et al. (2020) for background and the vignettes for examples: Stroke prevention, BCG Vaccine for Tuberculosis, Beta Blockers, Diabetes, Dietary Fat, Parkinson’s disease, Plaque Psoriasis, Smoking Cessation, Statins, Thrombolytic Treatments, and neutropenia or neutrophil dysfunction.
SCOUTer v1.0.0: Offers a new approach to simulating outliers by generating new observations defined by the statistics: Squared Prediction Error (SPE) and Hotelling’s $T^{2}$ statistic. See the vignette.
Time Series
bootUR v0.1.0: Provides functions to perform various bootstrap unit root tests for both individual time series (including augmented Dickey-Fuller test and union tests), multiple time series and panel data. See Palm et al. (2008) for background, and the vignette for an introduction and extensive references.
ChangePointTaylor v0.1.0: Implements the change in mean detection method described in Taylor (2000). See the vignette.
LOPART: Implements the change point detection algorithm described in Hocking and Srivastava (2020).
modeltime v0.0.2: Implements a time series forecasting framework for use with the tidymodels ecosystem, with ARIMA, Exponential Smoothing, and time series models from the forecast and prophet packages. See Forecasting Principles & Practice, and Prophet: forecasting at scale for background. There is a Getting Started Guide and vignettes describing Extension and the Model List.
Utilities
knitrdata v0.5.0: Implements a data language engine for incorporating data directly in ‘rmarkdown’ documents so that they can be made completely standalone. See the vignette for details.
lazyarray v1.1.0: Implements multi-threaded, serialized, compressed arrays that fully utilize modern solid state drives and allow users to quickly store large data while using limited memory. A lazy array can be shared across multiple R sessions, and multiple R sessions can simultaneously write to the same array. For more information, look here.
officedown v0.2.0: Provides functions to produce Microsoft Word documents from R Markdown. There are vignettes on Captions and References, Lists, officer Support, Data Frame Printing, and YAML Headers.
r2dictionary v0.1: Allows users to directly search for definitions of terms from within the R environment. The source dictionary is an original work of The Online Plain Text English Dictionary (OPTED). See the vignette.
rmdpartials v0.5.8: Enables the use of rmarkdown partials (knitr child documents) for making components of HTML, PDF and Word documents. See the vignette to get started.
tidycat v0.1.1: Provides functions to create additional rows and columns on broom::tidy() output to allow for easier control on categorical parameter estimates. The vignette contains examples.
Visualization
ggdist v2.2.0: Provides primitives for visualizing distributions using ggplot2 that are tuned for visualizing uncertainty in either a frequentist or Bayesian mode. Primitives include points with multiple uncertainty intervals, eye plots Spiegelhalter (1999), density plots, gradient plots, dot plots Wilkinson (1999), quantile dot plots Kay et al. (2016), complementary cumulative distribution function barplots Fernandes et al. (2018), and fit curves with multiple uncertainty ribbons.
loon.ggplot v1.0.1: Provides a bridge between the loon and ggplot2 packages. Users can turn static ggplot2 plots into interactive loon plots and vice versa. There are vignettes on ggplots -> loon, loon -> ggplots, and on using pipes.
treeheatr v0.1.0: Provides interpretable decision tree visualizations with the data represented as a heatmap at the tree’s leaf nodes. There is a vignette.
tilemaps v0.2.0: Implements the algorithm of McNeill and Hale (2017) for generating tilemaps. See the vignette.
wrGraph v1.0.2: Provides enhancements to base R graphics for plotting high-throughput data including automatic segmenting of the current device (e.g. window) to accommodate multiple new plots, automatic checking for optimal location of legends in plots, small histograms inserted as legends, the generation of mouse-over interactive html pages and more. See the vignette.