## Probability of winning a best-of-7 series

April 22, 2019

The NBA playoffs are in full swing! A total of 16 teams are competing, with the winner of each best-of-7 series moving on to the next round. In each matchup, two teams play 7 basketball games …
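
The teaser above is cut off before any calculation, so the following is only a minimal sketch of a series win probability, not the post's own code. It assumes (this is not stated in the excerpt) that games are independent and that one team wins each game with a fixed probability `p`. The function name `series_win_probability` is made up for illustration.

```python
from math import comb


def series_win_probability(p, wins_needed=4):
    """Probability of winning a first-to-`wins_needed` series.

    Assumes independent games and a constant per-game win probability p
    (an illustrative assumption, not taken from the post).
    """
    q = 1.0 - p
    # The series ends on our final win; before that game we have lost
    # k games (k = 0 .. wins_needed - 1), arranged in any order.
    return sum(
        comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
        for k in range(wins_needed)
    )


print(series_win_probability(0.5))
print(round(series_win_probability(0.6), 4))
```

Under these assumptions, a team that wins 60% of individual games wins a best-of-7 series roughly 71% of the time.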

## Comparing Point-and-Click Front Ends for R

April 22, 2019

Now that I've completed seven detailed reviews of Graphical User Interfaces (GUIs) for R, let's try to compare them. It's easy enough to count their features and plot them,...

## Le Monde puzzle [#1094]

April 22, 2019

A rather blah number Le Monde mathematical puzzle: Find all integer multiples of 11111 with exactly one occurrence of each decimal digit. Which I solved by brute force, by...
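
The excerpt stops before describing the brute-force approach, so the block below is just one possible sketch, not the author's solution. It assumes that "exactly one occurrence of each decimal digit" means a 10-digit number without a leading zero. The helper name `pandigital_multiples` is hypothetical.

```python
def pandigital_multiples(base=11111):
    """Return all 10-digit multiples of `base` that use each digit 0-9 once.

    Assumes numbers are written without a leading zero, so candidates lie
    between 1023456789 and 9876543210 (the smallest and largest such numbers).
    """
    lo, hi = 1023456789, 9876543210
    first = -(-lo // base) * base  # smallest multiple of base that is >= lo
    # Every candidate has exactly 10 digits, so 10 distinct characters
    # means each decimal digit appears exactly once.
    return [n for n in range(first, hi + 1, base) if len(set(str(n))) == 10]


solutions = pandigital_multiples()
print(len(solutions), solutions[:3])
```

Stepping through multiples of 11111 directly keeps the search to well under a million candidates, which is why plain brute force is enough here.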

## Using R/exams for Written Exams in Finance Classes

April 22, 2019

Experiences with using R/exams for written exams in finance classes with a moderate number of students at Texas A&M International University (TAMIU). ...

## Practical Data Science with R Book Update (April 2019)

April 22, 2019

I thought I would give a personal update on our book: Practical Data Science with R 2nd edition; Zumel, Mount; Manning 2019. The second edition should be fully available...

April 22, 2019

Today I am happy to announce a new free course: Help Your Team Learn R! Over the last few years I’ve helped a number of data teams train their...

## India has 100k records on iNaturalist

April 21, 2019

Biodiversity citizen scientists use iNaturalist to post their observations with photographs. The observations are then curated there by crowd-sourcing the identifications and other trait-related aspects. The data...

## Reproducible Environments

April 21, 2019

Great data science work should be reproducible. The ability to repeat experiments is part of the foundation for all science, and reproducible work is also critical for business applications. Team collaboration,...

## survivalists [a Riddler’s riddle]

April 21, 2019

A neat question from The Riddler on a multi-probability survival rate: Nine processes are running in a loop with fixed survival rates .99, …, .91. What is the probability that the...

## Binning with Weights

April 21, 2019

After working on the MOB package, I received requests from multiple users asking whether I could write a binning function that takes the weighting scheme into consideration. It is a...

## Familiarisation with the Australian Election Study by @ellis2013nz

April 21, 2019

The Australian Election Study is an impressive long-term research project that has collected the attitudes and behaviours of a sample of individual voters after each Australian federal election...

## FizzBuzz in R and Python

April 21, 2019

In this post, we will solve a simple problem (called "FizzBuzz") that is asked by some employers in data scientist job interviews. The question seeks to ascertain the applicant's...
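
The excerpt does not spell out the rules, so this sketch assumes the usual statement of FizzBuzz: for each integer from 1 upward, output "Fizz" for multiples of 3, "Buzz" for multiples of 5, "FizzBuzz" for multiples of both, and the number itself otherwise. It is an illustrative Python version, not the code from the post, and the function name `fizzbuzz` is our own.

```python
def fizzbuzz(n):
    """Return the FizzBuzz label for a positive integer n (standard rules assumed)."""
    if n % 15 == 0:  # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


# Print the first 15 values; the classic exercise runs from 1 to 100.
for i in range(1, 16):
    print(fizzbuzz(i))
```

Checking divisibility by 15 first avoids the common mistake of returning "Fizz" for numbers that should be "FizzBuzz".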

## Process Mining (Part 2/3): More on bupaR package

April 20, 2019

Recap: In the last post, the disciplines of event logs and process mining were defined. The bupaR package was introduced as a tool for process mining in R. Objectives for...

## Before you take my DataCamp course please consider this info

April 20, 2019

Today, I am finally getting around to writing this very sad blog post: Before you take my DataCamp course please consider the following information about the sexual harassment scandal...

## Batch Deployment of WoE Transformations

April 20, 2019

After wrapping up the function batch_woe() today, which lets users apply WoE transformations to many independent variables simultaneously, I have completed the development of major...

## Styling DataTables

April 19, 2019

Most Shiny apps have tables as their primary component. Now let's say you want to prettify your app and style the tables. All you need is to understand how...

## Quick Example of Latent Profile Analysis in R

April 19, 2019

Latent Profile Analysis (LPA) tries to identify clusters of individuals (i.e., latent profiles) based on responses to a series of continuous variables (i.e., indicators). LPA assumes that there are...

## Control Charts Another Package

April 19, 2019

I got an email from Alex Zanidean, who runs the xmrr package: “You might enjoy my package xmrr for similar charts – but mine recalculate the bounds automatically” and if...

## Happy EasteR! Let’s find some eggs…

April 19, 2019

It's Easter Time! Let's find some eggs... Hi there! Yes, it's the most Easterful time of the year again. For some of us a sacred time, for others mainly an...

## ODSC East 2019 Talks to Expand and Apply R Skills

R programmers are not necessarily data scientists, but rather software engineers. We have an entirely new multitrack focus area that helps engineers learn AI skills – AI for Engineers....

## tint 0.1.2: Some cleanups

April 19, 2019

A new version 0.1.2 of the tint package is arriving at CRAN as I write this. It follows the recent 0.1.1 release which included two fabulous new vignettes...

## Animating the US Treasury yield curve rates by @ellis2013nz

April 19, 2019

My eye was caught by this tweet by Robin Wigglesworth of the Financial Times with an Alan Smith animation of the US Treasury yield curve from 2005 to 2009....

## Generating the Ultimate List of 41 Data Science Podcasts by Crowdsourcing Google Results

April 18, 2019

Confession time: years ago, I was skeptical of podcasts. I was a music-only listener on commutes. Can you imagine? But around 2016, I gave in and finally took the...

## Using ecmwfr to measure global warming

April 18, 2019

For my research I needed to download gridded weather data from ERA-Interim, which is a big dataset generated by the ECMWF. Getting long-term data through their website is...

April 18, 2019

Metadata are an essential part of a robust data science workflow; they record the meaning of each variable: its units, quality, allowed range, how we collect it,...

## Base Rate Fallacy – or why No One is justified to believe that Jesus rose

April 18, 2019

In this post we are talking about one of the most unintuitive results of statistics: the so-called false positive paradox, which is an example of the so-called...

## Applying gradient descent – primer / refresher

April 18, 2019

Every so often a problem arises where it’s appropriate to use gradient descent, and it’s fun (and / or easier)...

## Common Uncommon Notations that Confuse New R Coders

April 17, 2019

Here are a few of the more commonly used notations found in R code and documentation that confuse coders of any skill level who are new to R. Be...

## A Comparative Review of the JASP Statistical Software

April 17, 2019

JASP is a free and open source statistics package that targets beginners looking to point-and-click their way through analyses. This article is one of a series of reviews which...