December 2018

Predicting Churn Using Tree Models

December 31, 2018 | 0 Comments

Today I want to predict churn using data from a hypothetical telecom company. Although it isn’t real-life data, it is based on real-life data. The data are spread across 19 columns: 14 continuous, 4 categorical, and the outcome variable for prediction, “churn”. The dataset is small, with 3333 rows for ... [Read more...]

Your and my 2019 R goals

December 31, 2018 | 0 Comments

Here we go again, using a Twitter trend as blog fodder! Colin Fay launched an inspiring movement by sharing his R goals of 2019. “My #RStats goals for 2019: 1️⃣ Becoming entirely fluent with {data.table} 2️⃣ Getting at ease with {Rcpp} What are yours? #rdatatable #rcpp” (Colin Fay 🤘, @_ColinFay, December 29, 2018) It’s been ...
[Read more...]

Introducing RcppDynProg

December 31, 2018 | 0 Comments

RcppDynProg is a new Rcpp-based R package that implements simple but powerful table-based dynamic programming. This package can be used to optimally solve the minimum cost partition into intervals problem (described below) and is useful in building piecewise estimates of functions (shown in this note). The abstract problem: The ...
[Read more...]

Leaf Plant Classification: Statistical Learning Model – Part 2

December 30, 2018 | 0 Comments

Categories: Advanced Modeling. Tags: Linear Regression, Principal Component Analysis, R Programming. In this post, I am going to build a statistical learning model based upon the plant leaf datasets introduced in part one of this tutorial. We have available three datasets, each one providing sixteen samples each of one-hundred plant species. ...
[Read more...]

End of 2018 Thoughts

December 30, 2018 | 0 Comments

Here we are, ending yet another year and starting something new. I wanted to take a minute to dwell on 2018, what happened in my R world this year and the reflections I take from that. Public speaking: This year started for me with a bang - as a result of my ... [Read more...]

Leaf Plant Classification: An Exploratory Analysis – Part 1

December 29, 2018 | 0 Comments

Categories: Getting Data. Tags: Data Management, Data Visualisation, Exploratory Analysis, R Programming. In this post, I am going to run an exploratory analysis of the plant leaf dataset made available by the UCI Machine Learning repository at this link. The dataset is expected to comprise sixteen samples each of one-hundred plant ...
[Read more...]

Part 5: Code corrections to optimism corrected bootstrapping series

December 29, 2018 | 0 Comments

The truth is out there, R readers, but often it is not what we have been led to believe. The previous post examined the strong positive results bias in optimism corrected bootstrapping (a method of assessing a machine learning model’s predictive power) with increasing p (completely random features). There ...
[Read more...]

Tidymodels

December 28, 2018 | 0 Comments

Introduction Packages CRAN availability of tidymodels packages: Unified Modelling Syntax Statistical Tests and Model Selection Resampling, Feature Engineering and Performance Metrics Modeling Data Response Variable lstat Correlations lstat vs categorical variables Preprocessing with recipe Summary Recipe Resampling with rsample Modelling with caret Wrapper Apply Wrapper Assess Performance with yardstick Parameters ...
[Read more...]

Part 4: Why does bias occur in optimism corrected bootstrapping?

December 28, 2018 | 0 Comments

In the previous parts of the series we demonstrated a positive results bias in optimism corrected bootstrapping by simply adding random features to our labels. This problem is due to an ‘information leak’ in the algorithm, meaning the training and test datasets are not kept separate when estimating the optimism. ...
[Read more...]

Using emojis as scatterplot points

December 27, 2018 | 0 Comments

Recently I wanted to learn how to use emojis as points in a scatterplot. It seems like the emojifont package is a popular way to do it. However, I couldn’t seem to get it to work on my machine …
[Read more...]

My R Take on Advent of Code – Day 3

December 27, 2018 | 0 Comments

Ho, ho, ho, Happy Chris.. New Year? Between eating the sea of fish (as the Polish tradition requires), assembling doll houses and designing a new kitchen, I finally managed to publish the third post on My R take on Advent of Code. To keep things short and sweet, here’s ... [Read more...]

My #Best9of2018 tweets

December 27, 2018 | 0 Comments

As 2018 nears its end, it’s time for me to look back on my R/Twitter year with the same simple method as last year: let me identify and webshoot my 9 best tweets of 2018! Downloading and opening my Twitter data: Like in 2017, I tweeted too much and therefore was unable ...
[Read more...]

French Mortality Poster

December 27, 2018 | 0 Comments

Based on the heatmaps I drew earlier this month, I made a poster of two centuries of data on mortality rates in France for males and females. It turned out reasonably well, I think. I will probably get it blown up to a nice large size and put it up ... [Read more...]
