Tutorials freely available of course I taught: including ggplot2, dplyr and shiny

September 5, 2015

I was asked to write an R course for a group of innovative companies in the North of the Netherlands. The group of 12 people was a mix of engineers and programmers, and the course aimed at giving them a…

Bootstrap Evaluation of Clusters

September 4, 2015

Illustration from Project Gutenberg. The goal of cluster analysis is to group the observations in the data into clusters such that every datum in a cluster is more similar...

Revolution R Open 3.2.2 now available

September 4, 2015

Revolution R Open, the enhanced open source R distribution from Revolution Analytics and Microsoft, is now available for download. This update brings multi-threaded performance to the latest update to...

dplyr 0.4.3

September 4, 2015

dplyr 0.4.3 includes over 30 minor improvements and bug fixes, which are described in detail in the release notes. Here I wanted to draw your attention to five small, but...

Linear models with weighted observations

September 4, 2015

In data analysis it is sometimes necessary to use weights. Contexts that come to mind include: Analysis of data from complex surveys, e.g. stratified samples. Sample...

Accept payments in shiny app

September 4, 2015

Have you ever thought about accepting payments in your shiny app? Probably not, but now you can start ;) Shiny apps are usually single-task, not very heavy websites. It may...

RcppArmadillo 0.5.500.2.0

September 3, 2015

Once again time for the monthly upstream Armadillo update -- version 5.500.2 was released earlier today by Conrad. And a new and matching...

ABC model choice via random forests [and no fire]

September 3, 2015

While my arXiv newspage today had a puzzling entry about modelling UFO sightings in France, it also broadcast our revision of Reliable ABC model choice via random forests, version...

On NCDF Climate Datasets

September 3, 2015

In mid-November, a nice workshop on big data and environment will be organized in Argentina. We will talk a lot about climate models, and I wanted to play a little...

Introduction to Hypothesis Driven Development — Overview of a Simple Strategy and Indicator Hypotheses

September 3, 2015

This post will begin to apply a hypothesis-driven development framework (that is, the framework written by Brian Peterson on how …

How do you know if your model is going to work? Part 1: The Problem

September 3, 2015

by John Mount and Nina Zumel of Win-Vector LLC. "Essentially, all models are wrong, but some are useful." George Box Here's a caricature of a...

Free R Help

September 3, 2015

Today I am giving away 10 sessions of free, online, one-on-one R help. My hope is to get a better understanding of how my readers use R, and the...

xkcd survey and the power to shape the internet

September 2, 2015

The xkcd survey: If you’ve never heard of xkcd, it’s “a webcomic of romance, sarcasm, math, and language” created by Randall Munroe. Also, if you’ve never heard of...

Logistic Regression in R – Part Two

September 2, 2015

My previous post covered the basics of logistic regression. We must now examine the model to understand how well it fits the data and generalizes to other observations. The...

reaching transcendence for Gaussian mixtures

September 2, 2015

“…likelihood inference is in a fundamental way more complicated than the classical method of moments.” Carlos Amendola, Mathias Drton, and Bernd Sturmfels arXived a paper this Friday on “maximum...

How do you know if your model is going to work? Part 1: The problem

September 2, 2015

Authors: John Mount and Nina Zumel. “Essentially, all models are wrong, but some are useful.” George Box Here’s a caricature of a data science project:...

Top 10 Reasons why you should attend the EARL R London Conference

September 2, 2015

On September 14th-16th Mango Solutions are running the EARL (Effective Applications of the R Language) Conference for all users, enthusiasts and beginners of the R programming language. It...

Using the googlesheets package to work with Google Sheets

September 2, 2015

by Andrie de Vries. Just over a year ago I cobbled together some code to work with the (then) new version of Google Sheets. You can still find...

Correction For Spatial And Temporal Auto-Correlation In Panel Data: Using R To Estimate Spatial HAC Errors Per Conley

September 2, 2015

Darin Christensen and Thiemo Fetzer. tl;dr: Fast computation of standard errors that allows for serial and spatial auto-correlation. Economists and political scientists often employ panel data that track units...

Mathematical annotations on R plots

September 2, 2015

I’ve always struggled with using plotmath via the expression function in R for adding mathematical notation to axes or legends. For some reason, the most obvious way to write...

Unit Converter

September 1, 2015

Introduction: Dan continues to crank out book chapter-length posts, which probably means that I should jump in before getting further behind… so here we go. In the next few posts, I’d...

Logistic Regression in R – Part One

September 1, 2015

Please note that an earlier version of this post had to be retracted because it contained some content which was generated at work. I have since chosen to rewrite...

Yahoo Finance (CSI) Data Quirks. Or Why is the ROC not Stable?

September 1, 2015

Rotational strategies on ETFs have been a common occurrence on this blog, and I have been using something similar for real life trading for about two years now. Readers...

Looking after Datasets

September 1, 2015

by Antony Unwin, University of Augsburg, Germany. David Moore's definition of data: numbers that have been given a context. Here is some context for the finch dataset: Fig 1:...

Learning Italian with rvest and Duolingo

September 1, 2015

By Aimee Gott, R Consultant, Mango Solutions. Over the last month I have found multiple reasons for needing to scrape web pages for information. This started out with...

Reasons to Learn R

September 1, 2015

A new blog post over at Pluralsight describes reasons R has been generating a great deal of interest recently: http://blog.pluralsight.com/r-programming-language.

Bayesian regression models using Stan in R

September 1, 2015

It seems the summer is coming to an end in London, so I shall take a final look at my ice cream data that I have been playing around with...

About to teach Statistical Graphics and Visualization course at CMU

August 31, 2015

I’m pretty excited for tomorrow: I’ll begin teaching the Fall 2015 offering of 36-721, Statistical Graphics and Visualization. This is a half-semester course designed primarily for students in our...

likelihood-free inference in high-dimensional models

August 31, 2015

“…for a general linear model (GLM), a single linear function is a sufficient statistic for each associated parameter…” The recently arXived paper “Likelihood-free inference in high-dimensional models”, by Kousathanas...
