Lattice exercises – part 1

April 30, 2016
In the exercises below we will use the lattice package. First, install the package with install.packages("lattice"), then load it with library(lattice). The lattice package lets us create univariate, bivariate and trivariate plots. This set of exercises covers univariate and bivariate plots. We will use a
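As a rough illustration of the univariate and bivariate plots mentioned above, here is a minimal sketch. The excerpt does not say which data set the exercises use, so the built-in mtcars data and the particular plot functions chosen here are assumptions made purely for illustration.

```r
# Minimal sketch: univariate and bivariate lattice plots.
# Assumption: mtcars stands in for whatever data the exercises actually use.
library(lattice)

# Univariate: formulas with nothing on the left-hand side
p_hist    <- histogram(~ mpg, data = mtcars, xlab = "Miles per gallon")
p_density <- densityplot(~ mpg, data = mtcars, xlab = "Miles per gallon")

# Bivariate: y ~ x
p_scatter <- xyplot(mpg ~ wt, data = mtcars,
                    xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
p_box     <- bwplot(factor(cyl) ~ mpg, data = mtcars,
                    xlab = "Miles per gallon", ylab = "Cylinders")

# Lattice functions return "trellis" objects; inside a script or function
# they must be printed explicitly to be drawn.
print(p_hist)
print(p_scatter)
```

At the interactive console the plots are drawn automatically, so the explicit print() calls only matter in scripts, loops and functions.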

First Thoughts on Automatically Generating Accessible Text Descriptions of ggplot Charts in R

April 29, 2016
In a course team accessibility briefing last week, Richard Walker briefly mentioned a tool for automatically generating text descriptions of Statistics Canada charts to support accessibility. On further probing, the tool, created by Leo Ferres, turned out to be called iGraph-Lite: … an extensible system that generates natural language descriptions of statistical graphs, particularly those

gap frequencies [& e]

April 28, 2016
A riddle from The Riddler where brute-force simulation does not pay: For a given integer N, pick at random without replacement integers between 1 and N by prohibiting consecutive integers until all possible entries are exhausted. What is the frequency of selected integers as N grows to infinity? A simple implementation of the random experiment
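The random experiment described above could be sketched as follows. This is one reading of the riddle, not the post's own code: integers are drawn one at a time, uniformly among those still allowed, and each draw also rules out its two neighbours. The function name simulate_gaps and the choice N = 1000 are hypothetical.

```r
# Sketch of the experiment under the stated reading of the riddle.
# simulate_gaps() is a hypothetical helper, not taken from the original post.
simulate_gaps <- function(N) {
  available <- rep(TRUE, N)   # TRUE while an integer may still be picked
  selected  <- integer(0)
  while (any(available)) {
    candidates <- which(available)
    # Index via sample.int(): sample(x, 1) misbehaves when length(x) == 1
    pick <- candidates[sample.int(length(candidates), 1)]
    selected <- c(selected, pick)
    # The pick and its consecutive neighbours are no longer allowed
    blocked <- c(pick - 1, pick, pick + 1)
    blocked <- blocked[blocked >= 1 & blocked <= N]
    available[blocked] <- FALSE
  }
  sort(selected)
}

set.seed(1)
N <- 1000
picked <- simulate_gaps(N)
frequency <- length(picked) / N   # share of 1..N that ended up selected
print(frequency)
```

Whatever the random draws, the final selection contains no two consecutive integers and every unselected integer sits next to a selected one, so the frequency always falls between roughly 1/3 and 1/2.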

On Nested Models

April 26, 2016
We have recently been working on, and presenting on, nested modeling issues. These are situations where the output of one trained machine learning model is part of the input of a later model or procedure. I am now of the opinion that correct treatment of nested models is one of the biggest opportunities for improvement …
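To make the idea of a nested model concrete, here is a small sketch in which the prediction of a first model becomes an input variable of a second one. The mtcars data, the variable choices and the half-and-half data split are all assumptions made for illustration; the excerpt does not say which treatment the post recommends.

```r
# Sketch of a nested model: stage 1 output feeds stage 2.
# Assumptions: mtcars data, linear models, and a simple 50/50 split so that
# the second model is fit on rows the first model never saw.
set.seed(42)
idx <- sample(nrow(mtcars), nrow(mtcars) / 2)
stage1_data <- mtcars[idx, ]
stage2_data <- mtcars[-idx, ]

# Stage 1: model horsepower from displacement and cylinder count
model1 <- lm(hp ~ disp + cyl, data = stage1_data)

# Stage 2: use the stage 1 prediction as an input variable
stage2_data$hp_hat <- predict(model1, newdata = stage2_data)
model2 <- lm(mpg ~ hp_hat + wt, data = stage2_data)

print(coef(model2))
```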

Learning R for Data Visualization [Video]

April 25, 2016
Last year Packt asked me to develop a video course to teach various techniques of data visualization in R. Since I love the idea of video courses and tutorials, and I also enjoy plotting data, I readily agreed. The result is this course, published last ...

Create Amazing Looking Backtests With This One Wrong–I Mean Weird–Trick! (And Some Troubling Logical Invest Results)

April 22, 2016
This post will outline an easy-to-make mistake in writing vectorized backtests–namely in using a signal obtained at the end of …

yorkr crashes the IPL party! – Part 4

April 22, 2016
Introduction: “I’ve missed more than 9000 shots in my career. I’ve lost almost 300 games. 26 times, I’ve been trusted to take the game winning shot and missed. I’ve failed over and over and over again in my life. And that is why I succeed.” Michael Jordan. “Success is where preparation and opportunity meet.” Bobby

Installing SQL Server ODBC drivers on Ubuntu (in Travis-CI)

April 20, 2016
Did you know you can now get SQL Server ODBC drivers for Ubuntu? Yes, no, maybe? It’s ok even if you haven’t, since it’s pretty new! Anyway, this presents me with an ideal opportunity to standardise my SQL Server ODBC connections across the operating systems I use R on, i.e. Windows and Ubuntu. My first

Are R^2s Useful In Finance? Hypothesis-Driven Development In Reverse

April 18, 2016
This post will shed light on the values of R^2s behind two rather simplistic strategies: the simple 10 month …

Election analysis contest entry part 4 – drivers of preference for Green over Labour party

April 15, 2016
Motivation: This post is the fourth in a series that makes up my entry in Ari Lamstein’s R Election Analysis Contest. Earlier posts introduced the nzelect R package, basic usage, how it was built, and an exploratory Shiny web application. Today I follow up on discussion in the StatsChat blog. A post there showed a screen...