February 2020: “Top 40” New R Packages

March 25, 2020

One hundred sixty-four new packages made it to CRAN in February. Here are my “Top 40” picks in eleven categories: Computational Methods, Data, Genomics, Machine Learning, Mathematics, Medicine, Science, Statistics, Time Series, Utilities, and Visualizations.

Computational Methods

delayed v0.3.0: Implements mechanisms to parallelize dependent tasks in a manner that optimizes the computational resources. Functions produce “delayed computations” which may be parallelized...

Tuning random forest hyperparameters with #TidyTuesday trees data

March 25, 2020

I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to tune more complex models. Today, I’m using a #TidyTuesday dataset from earlier this year on trees around San Francisco to show how to tune the hyperparameters of a random forest model and then use the final best model. ...

Probability and Bayesian modeling [book review]

March 25, 2020

Probability and Bayesian modeling is a textbook by Jim Albert and Jingchen Hu that CRC Press sent me for review in CHANCE. (The book is also freely available in bookdown format.) The level of the textbook is definitely most introductory, as it dedicates its first half to probability concepts (with no measure theory involved), meaning

Beacons of Light…

March 25, 2020

Earlier today, DataIQ unveiled its list of the 100 most influential people in data-driven business, the DataIQ 100, and I’m... The post Beacons of Light… appeared first on Mango Solutions.

Data Science Courses for Economists and Epidemiologists using RTutor

March 25, 2020

Were there no coronavirus pandemic, German universities would normally start the summer semester in around a month (soon after Easter). Now, it seems likely that courses will be offered digitally and students must learn from home. If you have a cou...

How to Connect RStudio with Git (Github)

March 24, 2020

This video explains how to connect your RStudio with Git (Github) for a better R programming and software development workflow. The task could be as big as updating a package file or as simple as managing a simple repo. This video also shows how you can clo...

Shiny apps with math exercises

It is often very useful to practise mathematics with automatically generated exercises. One approach is multiple choice quizzes (MCQ), but it turns out to be fairly difficult to generate authentic wrong answers. Instead, we want the user to input an answer, which we can then parse and check for correctness. There are many fun challenges in this, e.g. to...

baRcodeR now on rOpenSci + online barcode PDF generation

March 24, 2020

Some major changes have occurred to baRcodeR since the last post on version 0.1.2 to ease the process of making printable labels like below. As of baRcodeR 0.1.5: After extremely helpful reviews, baRcodeR was accepted as part of the rOpenSci project....

New package RcppDate 0.0.1 now on CRAN!

March 24, 2020

A new small package with a new C++ header library is now on CRAN. It brings the date library by Howard Hinnant to R. This library has been in pretty wide-spread use for a while now, and adds to C++11/C++14/C++17 what will be (with minor modifications...

rOpenSci community calls

March 24, 2020

This is a short PSA about an R resource that I recently learnt about (and participated in): rOpenSci community calls. According to the website, these community calls happen quarterly and are a place where the public can learn about “best …

Collider Bias, or: Are Hot Babes Dim and Eggheads Ugly?

March 24, 2020

Correlation and its associated challenges don’t lose their fascination: most people know that correlation doesn’t imply causation, not many people know that the opposite is also true (see: Causation doesn’t imply Correlation either) and some know that correlation can just be random (so-called spurious correlation). If you want to learn about a paradoxical effect nearly …

What to study if you’re under quarantine

March 23, 2020

If you’re staying indoors more often recently because of the current COVID-19 outbreak and looking for new things to study, here are a few ideas!

Free 365 Data Science Courses

365 Data Science is making all of their courses free until April 15. They have a variety of courses across R, Python, SQL, and more. Their...

Comparing Machine Learning Algorithms for Predicting Clothing Classes: Part 4

March 23, 2020

Florianne Verkroost is a Ph.D. candidate at Nuffield College at the University of Oxford. She has a passion for data science and a background in mathematics and econometrics. She applies her interdisciplinary knowledge to computationally address societal problems of inequality. This is the fourth and final post in a series devoted to comparing different machine learning methods for predicting clothing categories...

Tidying the new Johns Hopkins Covid-19 time-series datasets

Just hours after my previous blog post about tidying the Johns Hopkins CSSE Covid-19 data, the team changed their time-series table data structure. The data from the old post is still available but won’t be updated. This new blog post is based on the new time-series data structure. Currently, they only seem to provide time series data on confirmed...

Le Monde puzzle [#1134]

March 23, 2020

A weekly Le Monde mathematical puzzle on gcd’s and scm’s: If one replaces a pair (a,b) of integers with the pair (g,s) of their greatest common divisor and smallest common multiple, how long at most before the sequence ends? Same question when considering a collection of five integers where two are selected by the pair

Faster R package installation

March 23, 2020

Faster package installation

Every few weeks or so, a tweet pops up asking how to speed up package installation in R. Depending on the luck of Twitter, the author may get a few suggestions. The bigger picture is that package installation time is starting to become more of an issue for a number of...

Visualizing a Markov Chain

March 22, 2020

A Markov Chain describes a sequence of states where the probability of transitioning between states depends only on the current state. Markov chains are useful in a variety of computer science, mathematics, and probability contexts, also featuring prominently in Bayesian computation as Markov Chain Monte Carlo. Here, we’re going to look at a relatively simple breed of Markov chain and...

Tidying the John Hopkins Covid-19 data

My guess is that by now everybody knows that the public GitHub repository maintained by the Johns Hopkins University Center for Systems Science and Engineering has developed into a standard resource for individuals interested in analyzing the spread of SARS-CoV-2. There are alternative resources and blog articles covering them. Also, this blog post features a nice collection of R...

How to create a simple Coronavirus dashboard specific to your country in R

March 22, 2020

Contents: Introduction; Top R resources on Coronavirus; Coronavirus dashboard: the case of Belgium; How to create your own Coronavirus dashboard; Additional notes; Data; Open source; Accuracy.

Introduction

The Novel COVID-19 Coronavirus is the hottest topic right now. Every day, the media and newspapers share the number of new cases and deaths in several countries, try to measure the impacts of the virus on citizens...

Seed germination: fitting hydro-time models with R

I am locked at home, due to the COVID-19 emergency in Italy. Luckily I am healthy, but there is not much to do inside. I thought it might be nice to spend some time talking about seed germination models and their connections with survival analysis. We all know that seeds need water to germinate. Indeed, the absorption of water...

Analyzing churn with chaid

This post tries to accomplish several things concisely. I’m making available a new function (chaid_table()) inside my own little CGPfunctions package, reviewing some graphing options and revisiting our old friend CHAID – Chi Squared \(\chi^2\) Automated Interaction Detection – to look at modeling a “real world” business problem. It’s based on a blog post from Learning Machines and investigates customer churn for a wireless provider. The original...

Version Control is a Time Machine That Translates Common Hindsight Into Valuable Foresight

March 22, 2020

For data science projects I recommend using source control or version control, and committing changes at a very fine level of granularity. This means checking in possibly broken code, with possibly weak commit messages (so when working in a shared project, you may want a private branch or second source control repository). Please read …

‘mustashe’ Explained

March 22, 2020

The purpose of the ‘mustashe’ R package is to save objects that result from some computation, then load the object from file the next time the computation is performed. In other words, the first time a chunk of code is evaluated, the output can be stashed for the next time the code chunk is run. This post explains how ‘mustashe’ works. See the previous...

Infectious diseases and nonlinear differential equations

March 22, 2020

Last summer, I wrote about love affairs and linear differential equations. While the topic is cheerful, linear differential equations are severely limited in the types of behaviour they can model. In this blog post, which I wrote in self-quarantine to prevent further spread of SARS-CoV-2 (take that, cheerfulness), I introduce nonlinear differential equations as a means...

Tempered MCMC for Multimodal Posteriors

March 21, 2020

Previous Posts

This is part of a sequence of posts chronicling my journey to manually implement as many MCMC samplers as I can from scratch. Code from previous posts can be found on GitHub. Also I tweet more than I should: StableMarkets. The Mult...

Deploying RMarkdown Online

March 21, 2020

RMarkdown is a great tool for creating a variety of documents with R code and it’s a natural choice for producing blog posts such as this one. However, depending on which blog software you use, you may run into some problems related to the file paths for figure images (such as ggplot charts) which will require tweaks in your...

Covid 19 Tracking

March 21, 2020

Get Your Epidemiology from Epidemiologists

The COVID-19 pandemic continues to rage. I’m strongly committed to what should be the uncontroversial view that we should listen to the recommendations of those institutions and individuals with strong expertise in the relevant fields of Public Health, Epidemiology, Disease Control, and Infection Modeling. I also think that the open availability of data, and the...

‘mustashe’

March 21, 2020

The purpose of the ‘mustashe’ R package is to save objects that result from some computation, then load the object from file the next time the computation is performed. In other words, the first time a chunk of code is evaluated, the output can be stashed for the next time the code chunk is run. ‘mustashe’ can be installed from CRAN or from GitHub.

install.packages("mustashe") #...

Setting up R with Visual Studio Code quickly and easily with the languageserversetup package

March 21, 2020

Introduction

Over the past years, R has been gaining popularity, bringing to life new tools to work with it. Thanks to the amazing work by contributors implementing the Language Server Protocol for R and writing Visual Studio Code Extensions for R, the most popular development environment amongst developers across the world now has very strong support for R as well. In...
