Don’t Make Arrogant Models

June 4, 2020

Arrogance is not a good quality for your models. It’s a rarely acknowledged fact that models data scientists produce are often not sufficiently robust or fault-tolerant to actually be put into production. Sure, you can trust your predictions when th...

Extrapolating with B splines and GAMs

June 3, 2020

An issue that often crops up when modelling with generalized additive models (GAMs), especially with time series or spatial data, is how to extrapolate beyond the range of the...

littler 0.3.10: Some more updates

The eleventh release of littler as a CRAN package is now available, following in the fourteen-ish year history as a package started by Jeff in 2006, and joined...

Optimal workflows for package vignettes

June 2, 2020

Yet another post with a focus on package documentation! This time, we’ll cover vignettes, a.k.a. “long-form package documentation”: both the basics of vignette building and infrastructure, and some tips for...

Reproduce economic indicators from ‘The Economist’

June 2, 2020

Economic data (% change on year ago) Gross domestic product Industrial production Consumer prices Unemployment rate, % latest quarter* ...

Extract or replace columns in a data frame using `$`

June 2, 2020

Columns in a data frame can be easily extracted and manipulated with the $ operator. Even new columns can be added by assigning a vector. Extract columns from a data...

Estimating Time-varying Vector Autoregressive (VAR) Models

June 2, 2020

Models for individual subjects are becoming increasingly popular in psychological research. One reason is that it is difficult to make inferences from between-person data to within-person processes. Another is...

Creating an hex map of France electricity consumption

June 2, 2020

The French Ministry for the Ecological and Inclusive Transition (for which I’m currently working) is in the process of opening up data related to energy consumption. Each year, we publish...

Learning R: Build a Password Generator

June 2, 2020

It is not easy to create secure passwords. The best way is to let a computer do it by randomly combining lower- and upper-case letters, digits and other printable...

Is Your Data Science Credible Enough?

June 1, 2020

Does Your Data Science Lack Credibility? In a recent post, we defined three key attributes of a concept we call Serious Data Science: Credibility, Agility and Durability. In this post,...

Why R? Webinar – Understanding Word Embeddings

June 1, 2020

June 4th (8:00pm UTC+2) will bring another fascinating Webinar at Why R? Foundation. We will have a presentation by Julia Silge about Understanding Word Embeddings. See you on the...

{sergeant} 0.9.0 Is On Its Way to CRAN Mirrors!

June 1, 2020

Tis been a long time coming, but a minor change to default S3 parameters in tibbles finally caused a push of {sergeant} — the R package that lets you use...

Linear and Logistic Regression in Practical Data Science with R 2nd Edition

June 1, 2020

One of the chapters that we are especially proud of in Practical Data Science with R is Chapter 7, “Linear and Logistic Regression.” We worked really hard to explain...

minimax, maximin or plain

May 31, 2020

A simple riddle from The Riddler on choosing between the maximum between two minima of two throws of an N-face dice, the minimum between two maxima of two throws...

T^4 #4: Introducing Byobu

The next video (following the announcement, and shells sessions one, two, and three) is up in the T^4 series of video lightning talks with tips, tricks, tools, and...

Effectively Deploying and Scaling Shiny Apps with ShinyProxy, Traefik and Docker Swarm

May 31, 2020

Table of Contents: Introduction; Docker Swarm vs standard Docker containers; Docker Swarm vs Kubernetes; Traefik vs Nginx; Prerequisites; Setting up Docker Swarm; Setting up domains for your app and system dashboards; Setting up Traefik stack; Setting up ShinyProxy...

Turkey vs. Germany: COVID-19

May 31, 2020

In Turkey, some parts of society always compare Turkey to Germany and think that we are better than Germany for a lot of issues. The same applies to COVID-19...

Riddler: Can You Roll The Perfect Bowl?

May 31, 2020

FiveThirtyEight’s Riddler Express: At the recent World Indoor Bowls Championships in Great Yarmouth, England, one of the rolls by Nick Brett went viral. Here it is in all its glory: 12/10 on the...

gratia 0.4.1 released

May 31, 2020

After a slight snafu related to the 1.0.0 release of dplyr, a new version of gratia is out and available on CRAN. This release brings a number of new...

RSqLParser – tool to parse your SQL queries.

May 31, 2020

A slow-performing query is a ticking bomb that can explode at any time, i.e. impose a huge performance overhead on your application, especially when there is load on...

Learning Tfidf with Political Theorists

May 30, 2020

Thanks to Almog Simchon for insightful comments on a first draft of this post. Introduction Learning R for the past nine months or so has enabled me to explore new topics...

Learning Shiny for Production

May 30, 2020

Hey Shiny devs of the world! I’m leading a training in July about building a Shiny application for production. It will consist of 10 half-day sessions, with everything happening remotely, meaning...

Superspreading and the Gini Coefficient

Abstract: We look at superspreading in infectious disease transmission from a statistical point of view. We characterise heterogeneity in the offspring distribution by the Gini coefficient instead of the usual...

Mimic Excel’s Conditional Formatting in R

May 30, 2020

The DT package is an interface between R and the JavaScript DataTables library (RStudio DT documentation). In Example 3 (at this page) they show how to heatmap-format a table....

drat 0.1.6: Rewritten macOS binary support

A new version of drat arrived on CRAN overnight, once again taking advantage of the fully automated process available for such packages with few reverse depends and no...

Don’t Feel Guilty About Selecting Variables

May 30, 2020

We have an exciting new article to share: Don’t Feel Guilty About Selecting Variables. If you are at all interested in the probabilistic justification of important data science techniques,...

Charting the CMV Awareness Gap

May 29, 2020

Sometimes it’s okay to use a secondary axis

Two Different Methods to Apply Some Corey Hoffstein Analysis to your TAA

May 29, 2020

So, first off: I just finished a Thinkful data science in Python bootcamp program that was supposed to take six …

Syntax Highlighting in Blogdown; a very specific solution

If you spend more than 5 seconds on this site you will be able to tell that it is not one of the snazziest ones around. This is mostly...
