Blog Archives

R Work Areas. Standardize and Automate.

January 10, 2018

Before beginning work on a new data science project, I like to do the following: 1. Get my work area ready by creating an R Project for use with the RStudio IDE. 2. Organize my work area by creating a series of directories to store my project inputs and outputs. I create ‘data’ (raw data), …

Helpful Data Science Reads

January 2, 2018

Here are some of the books that I found interesting and useful in 2017. Scrum: The Art of Doing Twice the Work in Half the Time, by Jeff Sutherland. Jeff Sutherland, one of the creators of the scrum methodology of project management, lays down the rationale for adopting scrum over more traditional project management frameworks. …

NZ Real GDP htmlwidget

August 5, 2016

Thought I would try my hand at generating an interactive JavaScript line graph using R. Thankfully the dygraphs package makes this very easy! The code below generates an interactive plot of New Zealand’s real GDP through time. I have added some annotations displaying some of the major financial crises. It is as if the economy fell

Good Parameterisation in R

July 10, 2016

Imagine you work in a large factory that produces complicated widgets. It is your job to control production line settings which must be reset each day so as to ensure the smooth operation of the factory. However, to change the settings you have to walk around turning dials and pressing buttons at various different locations

Pretty Data Class Conversion

April 8, 2016

Load data – check structure – convert – analyse. Data class conversion is essential to gaining the right result… especially if you have left stringsAsFactors = TRUE. The worst thing you can do is feed factor data into a function when you expected it to be characters. If system memory is not a concern, I

Demystifying the GLM (Part 1)

February 11, 2016

Upon being thrown a prickly binary classification problem, most data practitioners will have dug deep into their statistical toolbox and pulled out the trusty logistic regression model. Essentially, logistic regression can help us predict a binary (yes/no) response with consideration given to other, hopefully related, variables. For example, one might want to predict whether

NZ’s Shifting Makeup

December 17, 2015

New Zealand is culturally diverse. Even at a regional level, there are big differences in ethnic composition… and with an increasingly inter-connected world, ethnic composition is expected to change substantially in the future, particularly in Auckland. Statistics New Zealand has provided us with sub-national ethnic population projections, by age and sex, from 2013 to 2038

A Matter of Style?

December 3, 2015

Up until a few weeks ago I would style my code like this: I thought that was the only way… until I witnessed a DBA friend of mine coding. He would write the same function like this: In my opinion, the second style makes the code easier to read. I suspect it is something to

Trying to Win with R

November 20, 2015

A common competition run by vendors of fishing equipment is a ‘guess the weight and win’, where an image of someone holding a fish is posted and it is up to you to guess its weight, with the closest guess winning a prize. The ‘law of large numbers’ implies that the average of the guesses

Working with Data Frames in Python and R

November 19, 2015

Originally posted on Data Hipsters: Data frame objects facilitate most data analysis exercises in both R and Python (perhaps with the exception of time series analysis, where the focus is on R time series and Pandas series objects). Data frames are a tidy and meaningful way to store data. This post will display exactly the…
