August was a relatively slow month for new R packages; “only” 180 new packages stuck to CRAN. Here are my “Top 40” picks organized into seven categories: Data, Machine Learning, Miscellaneous, Science, Statistics, Utilities and Visualizations. Although they have been written for specialized audiences, I have included the three “Science” packages because, in my layman’s opinion, they not only seem to be useful, but they are each documented well enough to give an interested person some idea of what they do.
edgarWebR v0.1.1: Provides methods to access and parse live filing information from the U.S. Securities and Exchange Commission, including company and fund filings, along with all associated metadata. See the vignette for an introduction.
forwards v0.1.0: Provides anonymized data from surveys conducted by Forwards, the R Foundation task force on women and other under-represented groups. Currently, it contains a single data set of responses to a survey of attendees at useR! 2016. The vignette provides an overview.
Rnightlights v0.1.2: Provides an interface to extract raster and zonal statistics from satellite nightlight rasters, downloaded from the United States National Oceanic and Atmospheric Administration free data repositories.
Knoema v0.1.7: Provides an API interface to Knoema, one of the largest collections of public data and statistics on the Internet, featuring about 2.5 billion time series from thousands of sources. The README file will get you started.
vegtable v0.1.0: Provides functions to import and manipulate data from vegetation-plot databases, especially data stored in Turboveg. The package also implements import/export routines for exchanging data with Juice.
partitionComparison v0.2.2: Provides several measures (dissimilarity, distance/metric, correlation, entropy) for comparing two partitions of the same set of objects. See the paper by Marina Meilă for details.
LearnGeom v1.0: Provides functions for learning and teaching basic plane Geometry at the undergraduate level, with the aim of being helpful to young students with few programming skills. The vignette offers several examples.
PGRdup v0.2.3.2: Provides functions to aid the identification of probable/possible duplicates in Plant Genetic Resources collections using ‘passport databases’ comprising information records from each constituent sample. The vignette provides an overview.
snpReady v0.9.3: Provides functions to clean, summarize and prepare genomic data sets for Genome Selection and Genome Association analysis, and to estimate population genetic parameters. See the vignette for details.
blink v0.1.0: Implements the model in Steorts, which performs Bayesian entity resolution for categorical and text data, for any distance function defined by the user. Reproducible experiments are illustrated in the vignette.
cholera v0.2.1: Amends errors, augments data and aids analysis of John Snow’s map of the 1854 London cholera outbreak. The original data come from Rusty Dodson and Waldo Tobler’s 1992 digitization of Snow’s map. Those data are no longer available at their original source, but they are preserved in the HistData package. There are vignettes on Missing Data, Pump Neighborhoods, Roads, Time Series, and “Unstacking bars”.
drtmle v1.0.0: Provides targeted minimum loss-based estimators for counterfactual means and causal effects that are doubly robust with respect to both consistency and asymptotic normality (see van der Laan). The extensive vignette does the math.
esvis v0.1.0: Provides a variety of methods to estimate and visualize distributional differences in terms of effect sizes, with emphasis on evaluating differences between two or more distributions across the entire scale, rather than at a single point (e.g., differences in means). Look here for an example.
fuser v1.0.0: Provides functions for high-dimensional penalized regression across heterogeneous subgroups. The underlying model is described in detail in Dondelinger and Mukherjee. The vignette shows how to use the package for prediction over subgroups.
gamlss.spatial v1.3.4: Provides functions to fit Gaussian Markov Random Fields within the Generalized Additive Models for Location Scale and Shape algorithms. The vignette introduces the package and provides several examples.
INLAutils v0.0.4: Provides a number of utility functions for solving models using the Integrated Nested Laplace Approximation (INLA), a new approach to statistical inference with latent Gaussian Markov random fields (GMRF). Look here for examples and plots.
missRanger v1.0.0: Provides an implementation of the MissForest algorithm for imputing mixed-type data sets by chaining tree ensembles that was introduced by Stekhoven and Buehlmann. Look here for an example.
naniar v0.1.0: Provides data structures and functions that facilitate the plotting of missing values and examination of imputations. There is a Getting Started Guide and a Gallery of Missing Data Visualizations.
powdist v0.1.3: Provides density, distribution, and quantile functions, as well as a function for random draws from power and reversal power distributions.
RATest v0.1.0: Provides a collection of randomization tests, data sets, and examples currently focusing on permutation tests for baseline covariates in the sharp regression discontinuity design. See Canay and Kamat and the vignette.
blastula v0.1: Allows users to compose and send HTML email messages that render across a range of email clients and device sizes. Messages are composed using Markdown and a text interpolation system that allows for the injection of evaluated R code within the message body. The README file describes how to use the package.
pointblank v0.1: Provides functions to validate data in local data frames, local tibble objects, in TSV files, and in MySQL database tables. Look at the README file for an example.
reqres v0.2.0: Provides functions to facilitate parsing of HTTP requests, creation of appropriate responses, and handling of the housekeeping involved in working with HTTP exchanges. See README to get started.
spelling v1.0: Provides spell checking for common document formats including latex, markdown, manual pages, and description files.
billboarder v0.0.3: Provides an
gggenes v0.2.0: Provides a ggplot2 geom and helper functions for drawing gene arrow maps.