Articles by That’s so Random

The Psychology of Flame Wars

June 26, 2019 | That’s so Random

I have been meaning to write this for a while, but with the dplyr vs data.table feud rising to new levels on Twitter the last couple of days, it all of a sudden seems more relevant. For those who don’t know what I am talking about, there are ... [Read more...]

padr is updated

June 12, 2019 | That’s so Random

Yesterday v0.5.0 of the padr package hit CRAN. You will find the main changes in the thicken function, which has gained two new arguments. First of all, following an idea by Adam Stone, you can now drop the original datetime variable from the data frame by using drop = ... [Read more...]

Predictability of Tennis Grand Slams

May 26, 2019 | That’s so Random

The European tennis season is in full swing, with Roland Garros starting today and Wimbledon taking place in a few weeks. For a sports buff like me, it is the essence of summer (together with the Tour de France). Time to dive into some tennis data. As a follower of ...
[Read more...]

Dealing with failed projects

November 22, 2018 | That’s so Random

Recently, I came up with Thoen’s law. It is an empirical one, based on several years of doing data science projects in different organisations. Here it is: The probability that you have worked on a data science project that failed approaches one very quickly as the number of projects ... [Read more...]

Why your S3 method isn’t working

June 15, 2018 | That’s so Random

Over the last few years I have noticed the following happening to a number of people. One of those people was actually yours truly a few years back. A person is aware of S3 methods in R through regular use of the print, plot and summary functions and decides to give it a go ... [Read more...]

A recipe for recipes

May 29, 2018 | That’s so Random

If you build statistical or machine learning models, the recipes package can be useful for data preparation. A recipe object is a container that holds all the steps that should be performed to go from the raw data set to the set that is fed into a model algorithm. Once ... [Read more...]

Make your own color palettes with paletti

December 22, 2017 | That’s so Random

Last week I blogged about the dutchmasters color palettes package, which was inspired by the wonderful ochRe package. As mentioned I shamelessly copied the package. I replaced the list with character vectors containing hex colors and did a find and replace to make it dutchmasters instead of ochRe. This was ...
[Read more...]

Color palettes derived from the Dutch masters

December 13, 2017 | That’s so Random

Among tulip fields, canals and sampling cheese, the museums of the Netherlands are one of its biggest tourist attractions. And for very good reasons! During the seventeenth century, known as the Dutch Golden Age, there was an abundance of talented painters. If you ever have the chance to visit the ...
[Read more...]

padr version 0.4.0 now on CRAN

November 17, 2017 | That’s so Random

I am happy to share that the latest version of padr just hit CRAN. This new version comprises bug fixes, performance improvements and new functions for formatting datetime variables. But above all, it introduces the custom paradigm that enables you to do asymmetric analysis.
[Read more...]

A ggplot-based Marimekko/Mosaic plot

November 1, 2017 | That’s so Random

One of my first baby steps into the open source world was answering this SO question over four years ago. Recently I revisited the post and saw that Z.Lin did a very nice and more modern implementation, using dplyr and faceting in ggplot2. I decided to merge ...
[Read more...]

Tidy evaluation, most common actions

August 25, 2017 | That’s so Random

Tidy evaluation is a bit challenging to get your head around. Even after reading programming with dplyr several times, I still struggle when creating functions from time to time. I made a small summary of the most common actions I perform, so I don’t have to dig in the ... [Read more...]

Quickly Check your id Variables

July 20, 2017 | That’s so Random

Virtually every dataset has them: id variables that link a record to a subject and/or time point. Often one column, or a combination of columns, forms the unique id of a record. For instance, the combination of patient_id and visit_id, or ip_adress and visit_time. The ... [Read more...]

Check Data Quality with padr

June 26, 2017 | That’s so Random

The padr package was designed to prepare datetime data for analysis. That is, to take raw, timestamped data, and quickly convert it into a tidy format that can be analyzed with all the tidyverse tools. Recently, a colleague and I discovered a second use for the package that I had ...
[Read more...]

Here is the new padr

May 16, 2017 | That’s so Random

I am very happy to announce v0.3.0 of the padr package, which was introduced in January. As requested by many, you are now able to use intervals whose unit is different from 1. In earlier versions the eight interval values only allowed for a single unit (e.g. year, ...
[Read more...]
