## Modeling pandemics (1)

March 19, 2020
The most popular model for epidemics is the so-called SIR model, also known as the Kermack-McKendrick model. Consider a population of size $N$, and let $S$ be the number of susceptible, $I$ the number of infectious, and $R$ the number of recovered (or immune) individuals, so that $S + I + R = N$, which implies that $\frac{dS}{dt} + \frac{dI}{dt} + \frac{dR}{dt} = 0$. In order to be more realistic, consider some (constant) birth rate...
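
As a rough illustration of the compartments described in this excerpt, the following base R sketch integrates the classic SIR equations (without births or deaths) using simple Euler steps. The function name `simulate_sir`, the rates `beta` and `gamma`, the step size, and all starting values are assumptions chosen for illustration; they are not taken from the original post.

```r
# Minimal SIR sketch (no births or deaths), integrated with explicit Euler steps.
# beta (transmission rate), gamma (recovery rate) and all starting values
# are illustrative assumptions, not values from the post.
simulate_sir <- function(N = 1000, I0 = 1, beta = 0.5, gamma = 0.1,
                         days = 100, dt = 0.1) {
  steps <- as.integer(round(days / dt))
  S <- numeric(steps + 1)
  I <- numeric(steps + 1)
  R <- numeric(steps + 1)
  S[1] <- N - I0
  I[1] <- I0
  R[1] <- 0
  for (k in seq_len(steps)) {
    new_inf <- beta * S[k] * I[k] / N * dt   # flow from S to I
    new_rec <- gamma * I[k] * dt             # flow from I to R
    S[k + 1] <- S[k] - new_inf
    I[k + 1] <- I[k] + new_inf - new_rec
    R[k + 1] <- R[k] + new_rec
  }
  data.frame(time = seq(0, by = dt, length.out = steps + 1),
             S = S, I = I, R = R)
}

sir <- simulate_sir()
# The three compartments should always add up to the population size N.
range(sir$S + sir$I + sir$R)
```

Because every individual leaving one compartment enters another, the sum of the three columns stays at `N` up to floating point error, which mirrors the constraint stated in the excerpt. A dedicated ODE solver would be a more accurate choice than Euler steps for real work.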

## Apply for the Google Summer of Code and Help Us Improving The R Code Optimizer

March 19, 2020
Are you a BSc, MSc or PhD student, and this summer (winter down south) you would like to contribute to open-source while earning some cash? Then you will be interested to know that with @CancuCS this year we are going to mentor for the Google Summer of...

## The significance of population size, year, and per cent women on the education level in Sweden

In twelve posts I have analysed how different factors are related to salaries in Sweden with data from Statistics Sweden. In this post, I will analyse a new dataset from Statistics Sweden: population by region, age, level of education, sex and year. Not knowing exactly what I will find, I will use a criterion-based procedure to find the model that...

## Rebalancing history

March 19, 2020
Our last post on rebalancing struck an equivocal note. We ran a thousand simulations using historical averages across different rebalancing regimes to test whether rebalancing produced better absolute or risk-adjusted returns. The results suggested it did not. But we noted many problems with the tests—namely, unrealistic return distributions and correlation scenarios. We argued that if we used actual historical...

## On model specification, identification, degrees of freedom and regularization

March 19, 2020

## RProtoBuf 0.4.16: Now with JSON

March 19, 2020
A new release 0.4.16 of RProtoBuf is now on CRAN. RProtoBuf provides R with bindings for the Google Protocol Buffers (“ProtoBuf”) data encoding and serialization library used and released by Google, and deployed very widely in numerous projects a...

## Shiny: Performance tuning with future & promises – Part 1

March 19, 2020
In our previous article about Shiny we shared our experiences with load testing and horizontal scaling of apps. We showed the design of a process from a proof of concept to a company-wide application. The second part of the blog series focuses on the R packages future & promises, which are used for optimizations within

## Why Is It Called That Way?! – Origin and Meaning of R Package Names

March 19, 2020
Function names should correspond to what they do. But have you ever wondered why whole packages are called what they are called? In this blog post Matthias leads you through the mysterious world of R package names. The post Why Is It Called That Way?! – Origin and Meaning of R Package Names first appeared on STATWORX.

## Extended floating point precision in R with Rmpfr

March 18, 2020
I learnt from a recent post on John Cook’s excellent blog that it’s really easy to do extended floating point computations in R using the Rmpfr package. Rmpfr is R’s wrapper around the C library MPFR, which stands for “Multiple …

## COVID-19 Tracker: Days since N

March 18, 2020
There’s no shortage of dashboards and data-visualizations covering some aspect of the ongoing coronavirus pandemic, but not having come across a tool that allowed me to easily compare countries myself, I developed this COVID-19 Tracker shiny app both...
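
The alignment idea behind a "days since N" comparison can be sketched in base R as follows. The data frame `cases` holds made-up numbers, and the helper name `days_since_n` is hypothetical; neither comes from the app described in this excerpt.

```r
# Hypothetical cumulative case counts (made-up numbers, for illustration only).
cases <- data.frame(
  country = rep(c("A", "B"), each = 6),
  date    = rep(as.Date("2020-03-01") + 0:5, times = 2),
  total   = c(20, 60, 110, 180, 300, 450,
              5, 40, 90, 150, 210, 330)
)

# Keep rows at or above the threshold n, then count days from the first
# such date within each country.
days_since_n <- function(df, n = 100) {
  above <- df[df$total >= n, ]
  first_day <- ave(as.numeric(above$date), above$country, FUN = min)
  above$days_since <- as.numeric(above$date) - first_day
  above
}

aligned <- days_since_n(cases, n = 100)
aligned[, c("country", "total", "days_since")]
```

Putting each country on a shared "days since" axis makes the curves comparable even when the outbreaks started on different calendar dates.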

## parzer: Parse Messy Geographic Coordinates

parzer is a new package for handling messy geographic coordinates. The first version is now on CRAN, with binaries hopefully coming soon (see note about installation below). The package recently completed rOpenSci review. The idea for this package started with a tweet from Noam Ross (https://twitter.com/noamross/status/1070733367522590721) about 15 months ago. The idea being that sometimes you have geographic coordinates in a messy format, or in many different formats,...

## Simulating COVID-19 interventions with R

March 18, 2020
Tim Churches is a Senior Research Fellow at the UNSW Medicine South Western Sydney Clinical School at Liverpool Hospital, and a health data scientist at the Ingham Institute for Applied Medical Research. This post examines simulation of COVID-19 spread using R, and how such simulations can be used to understand the effects of various public health interventions designed to...

## How to do a t-test or ANOVA for many variables at once in R and communicate the results in a better way

March 18, 2020
As part of my teaching assistant position in a Belgian university, students often ask me for some help with the statistical analyses for their master’s theses. A frequent question is how to compare groups of patients in terms of several quantitative continuous variables. Most of us...
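
One way to compare two groups on several continuous variables at once is to loop over the variables and collect the test results in a single table. The sketch below uses base R and the built-in `mtcars` data as a stand-in for patient data; the choice of variables and the Holm correction are assumptions for illustration and not necessarily the approach taken in the original post.

```r
# Compare two groups (am: 0 = automatic, 1 = manual) on several continuous
# variables from the built-in mtcars data, one Welch t-test per variable.
vars <- c("mpg", "hp", "wt", "qsec")

results <- do.call(rbind, lapply(vars, function(v) {
  auto   <- mtcars[[v]][mtcars$am == 0]
  manual <- mtcars[[v]][mtcars$am == 1]
  tt <- t.test(auto, manual)            # Welch test is t.test's default
  data.frame(variable    = v,
             mean_auto   = mean(auto),
             mean_manual = mean(manual),
             p_value     = tt$p.value)
}))

# Adjust for multiple testing across the four comparisons.
results$p_adjusted <- p.adjust(results$p_value, method = "holm")
results
```

Collecting everything into one data frame keeps the output compact, and the adjusted p-values guard against over-interpreting a single small p-value among many tests.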

## nnlib2Rcpp: a(nother) R package for Neural Networks

March 18, 2020
For anyone interested, nnlib2Rcpp is an R package containing a number of Neural Network implementations and is available on GitHub. It can be installed as follows (the usual way for packages on GitHub): `library(devtools)` followed by `install_github("VNNikolaidis/nnlib2Rcpp")`. The NNs are implemented in C++ (using the nnlib2 C++ class library) and are interfaced with R via the Rcpp package (which …

## RcppCCTZ 0.2.7

March 18, 2020
A new release 0.2.7 of RcppCCTZ is now at CRAN. RcppCCTZ uses Rcpp to bring CCTZ to R. CCTZ is a C++ library for translating between absolute and civil times using the rules of a time zone. In fact, it is two libraries. One for dealing with civil tim...

## Community of Bioinformatics Software Developers (CDSB): The story of a diversity and outreach hotspot in Mexico that hopes to empower local R developers

March 18, 2020
By Leonardo Collado Torres, Ph.D., Research Scientist, Lieber Institute for Brain Development, and co-founder of @LIBDrstats and @CDSBMexico. I have... The post Community of Bioinformatics Software Developers (CDSB): The story of a diversity and outreach hotspot in Mexico that hopes to empower local R developers appeared first on R Consortium.

## Time Series Machine Learning (and Feature Engineering) in R

March 18, 2020
Machine learning is a powerful way to analyze Time Series. With innovations in the tidyverse modeling infrastructure (tidymodels), we now have a common set of packages to perform machine learning in R. These packages include parsnip, recipes, tune, and...

## Survey Results: What Degree is Best for Data Science?

March 17, 2020
The survey What Degree is Best for Data Science? ran from February 9 through March 12, 2020, asking participants 4 questions. Answers about self: Q1: What is the highest level of school degree you have completed? Q2: Which of the following best describes the field in which you received your highest degree? Answers about best education: Q3: What level of...

## A noisy start

March 17, 2020
I was sure I had released this… Honestly, I thought the new version of ambient had landed on CRAN a year ago. What does that say about me as a developer? Probably not something very positive. One reason is probably that ambient is one of my smaller packages mostly made for myself. It generates noise patterns which is something I use extensively in...

## Shiny Contest 2020 deadline extended

March 17, 2020
The original deadline for Shiny Contest 2020 was this week, but given that many of us have had lots of unexpected changes to our schedules over the last week due to the COVID-19 outbreak, we have decided to extend the deadline by two weeks. If you’ve been planning to submit an entry for the contest this week (and if...

## The ultimate package for correlations (by easystats)

March 17, 2020
The easystats project continues to grow with its most recent addition, a package devoted to correlations. Check out its webpage here! It’s lightweight, easy to use, and allows for the computation of many different kinds of correlations, such as partial correlations, Bayesian correlations, multilevel correlations, polychoric correlations, biweight, percentage bend or Shepherd’s Pi correlations (types of robust correlation), distance...

## paletteer: Hundreds of color palettes in R

March 17, 2020
Looking for just the right colors for your data visualization? I often cover tools to pick color palettes on my website (e.g. here, here, or here) and also host a comprehensive list of color packages in my R programming resources overview. However, paletteer is by far my favorite package for customizing your colors in R! …

## Rcpp 1.0.4: Lots of goodies

March 17, 2020
The fourth maintenance release 1.0.4 of Rcpp, following up on the 10th anniversary and the 1.0.0 release sixteen months ago, arrived on CRAN this morning. This follows a few days of gestation at CRAN. To help during the wait we provided this releas...

## COVID-19: The Case of Germany

March 17, 2020
It is such a beautiful day outside, lots of sunshine, spring at last… and we are now basically all grounded and sitting here, waiting to get sick. So, why not a post from the new epicentre of the global COVID-19 pandemic, Central Europe, more exactly where I live: Germany?! Indeed, if you want to find …

## Generate names using posterior probabilities

March 17, 2020
If you are building synthetic data and need to generate people's names, this article will be a helpful guide. This article is part of a series of articles regarding the R package conjurer. You can find the first part of this series here. Steps to generate people's names: 1. Installation. Install the conjurer package by using …

## How to create decorators in R

March 16, 2020
One of the coolest features of Python is its ability to create decorators. In short, decorators allow us to modify how a function behaves without changing the function’s source code. This can often make code cleaner and easier to modify. For instance, decorators are also really useful if you have a collection of...
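
To make the idea concrete, here is a minimal base R sketch of a decorator written as a closure: a function that takes a function and returns a wrapped version of it. The names `with_logging`, `add` and `logged_add` are hypothetical, and this is not necessarily the technique used in the original post, whose code is not shown in the excerpt.

```r
# A "decorator" here is a function that takes a function f and returns a
# new function with extra behaviour wrapped around the call to f.
# with_logging is a hypothetical name used for illustration.
with_logging <- function(f, name = "f") {
  force(f)      # evaluate arguments now so the closure captures them
  force(name)
  function(...) {
    message("Calling ", name)
    result <- f(...)
    message("Finished ", name)
    result
  }
}

add <- function(x, y) x + y
logged_add <- with_logging(add, name = "add")

# Emits two messages and returns the sum of its arguments.
logged_add(2, 3)
```

The original `add` is left untouched, so the logging behaviour can be applied to any function, or dropped again, without editing the wrapped function's source code.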

## R spatial follows GDAL and PROJ development

GDAL and PROJ (formerly proj.4) are two libraries that form the basis, if ...

## LASSO regression using tidymodels and #TidyTuesday data for The Office

March 16, 2020
I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to tune more complex models. Today, I’m using this week’s #TidyTuesday dataset on The Office to show how to build a LASSO regression model and choose regularization parameters! Here is the code I used in the video, for those...