ABC model choice via random forests [and no fire]

September 3, 2015

While my arXiv newspage today had a puzzling entry about modelling UFO sightings in France, it also broadcast our revision of Reliable ABC model choice via random forests, the version that we resubmitted today to Bioinformatics after a quite thorough upgrade, the most dramatic change being the realisation that we could also approximate the posterior probability of

Read more »

Introduction to Hypothesis Driven Development — Overview of a Simple Strategy and Indicator Hypotheses

September 3, 2015

This post will begin to apply a hypothesis-driven development framework (that is, the framework written by Brian Peterson on how …

Read more »

How do you know if your model is going to work? Part 1: The Problem

September 3, 2015

by John Mount and Nina Zumel of Win-Vector LLC. "Essentially, all models are wrong, but some are useful." (George Box) Here's a caricature of a...

Read more »

Free R Help

September 3, 2015

Today I am giving away 10 sessions of free, online, one-on-one R help. My hope is to get a better understanding of how my readers use R, and the...

Read more »

Logistic Regression in R – Part Two

September 2, 2015

My previous post covered the basics of logistic regression. We must now examine the model to understand how well it fits the data and generalizes to other observations. The...

Read more »

reaching transcendence for Gaussian mixtures

September 2, 2015

“…likelihood inference is in a fundamental way more complicated than the classical method of moments.” Carlos Amendola, Mathias Drton, and Bernd Sturmfels arXived a paper this Friday on “maximum...

Read more »

How do you know if your model is going to work? Part 1: The problem

September 2, 2015

Authors: John Mount and Nina Zumel. “Essentially, all models are wrong, but some are useful.” (George Box) Here’s a caricature of a data science project:...

Read more »

Top 10 Reasons why you should attend the EARL R London Conference

September 2, 2015

On September 14th-16th Mango Solutions are running the EARL (Effective Applications of the R Language) Conference for all users, enthusiasts and beginners of the R programming language. It...

Read more »

Using the googlesheets package to work with Google Sheets

September 2, 2015

by Andrie de Vries. Just over a year ago I cobbled together some code to work with the (then) new version of Google Sheets. You can still find...

Read more »

Correction For Spatial And Temporal Auto-Correlation In Panel Data: Using R To Estimate Spatial HAC Errors Per Conley

September 2, 2015

by Darin Christensen and Thiemo Fetzer. tl;dr: Fast computation of standard errors that allows for serial and spatial auto-correlation. Economists and political scientists often employ panel data that track units...

Read more »

Mathematical annotations on R plots

September 2, 2015

I’ve always struggled with using plotmath via the expression function in R for adding mathematical notation to axes or legends. For some reason, the most obvious way to write...

Read more »

Unit Converter

September 1, 2015

Introduction: Dan continues to crank out book chapter-length posts, which probably means that I should jump in before getting further behind… so here we go. In the next few posts, I’d...

Read more »

Logistic Regression in R – Part One

September 1, 2015

Please note that an earlier version of this post had to be retracted because it contained some content which was generated at work. I have since chosen to rewrite...

Read more »

Yahoo Finance (CSI) Data Quirks. Or Why is the ROC not Stable?

September 1, 2015

Rotational strategies on ETFs have been a common occurrence on this blog, and I have been using something similar for real life trading for about two years now. Readers...

Read more »

Looking after Datasets

September 1, 2015

by Antony Unwin, University of Augsburg, Germany. David Moore's definition of data: numbers that have been given a context. Here is some context for the finch dataset: Fig 1:...

Read more »

Learning Italian with rvest and Duolingo

September 1, 2015

By Aimee Gott, R Consultant, Mango Solutions. Over the last month I have found multiple reasons for needing to scrape web pages for information. This started out with...

Read more »

Reasons to Learn R

September 1, 2015

A new blog post over at Pluralsight describes why R has been generating a great deal of interest recently: http://blog.pluralsight.com/r-programming-language.

Read more »

Bayesian regression models using Stan in R

September 1, 2015

It seems the summer is coming to an end in London, so I shall take a final look at my ice cream data that I have been playing around with...

Read more »

About to teach Statistical Graphics and Visualization course at CMU

August 31, 2015

I’m pretty excited for tomorrow: I’ll begin teaching the Fall 2015 offering of 36-721, Statistical Graphics and Visualization. This is a half-semester course designed primarily for students in our...

Read more »

likelihood-free inference in high-dimensional models

August 31, 2015

“…for a general linear model (GLM), a single linear function is a sufficient statistic for each associated parameter…” The recently arXived paper “Likelihood-free inference in high-dimensional models”, by Kousathanas...

Read more »

Revolution R Enterprise Now Available in the Cloud on Azure Marketplace

August 31, 2015

by Richard Kittler, Revolution R Enterprise PM, Microsoft Advanced Analytics. Revolution is excited to announce the availability of its latest release of Revolution R Enterprise 7.4.1 (RRE) as a...

Read more »

Two little annoying stats detail

August 31, 2015

A very brief post at the end of the field season on two little “details” that annoy me in the papers and analyses that I (sometimes) see being done around me....

Read more »

16 new R jobs (from R-users.com ; 2015-08-31)

August 31, 2015

This is the bimonthly post (for 2015-08-31) for new R Jobs from R-users.com. Employers: you may visit this link to post a new R job to the R community (it’s free and quick). Job seekers: please follow the links below...

Read more »

embeding a subplot in ggplot via subview

August 30, 2015

I implemented a function, subview, in ggtree that makes it easy to embed a subplot in ggplot. An example is shown below:

Read more »

Visualizing Twitter history with streamgraphs in R

August 30, 2015

I was exploring ways to visualize my Twitter history, and ended up creating this interactive streamgraph of my 20 most used hashtags on Twitter: The graph shows how my...

Read more »

GEOSTAT 2015: a write-up

August 30, 2015

The week before last I attended the GEOSTAT summer school in Lancaster. GEOSTAT is an annual week-long meeting devoted to ‘geostatistics’ (or ‘spatial statistics’ - we’ll come on...

Read more »

RcppGSL 0.3.0

August 30, 2015

A new version of RcppGSL just arrived on CRAN. The RcppGSL package provides an interface from R to the GNU GSL using our

Read more »

“A 99% TVaR is generally a 99.6% VaR”

August 29, 2015

Almost 6 years ago, I posted a brief comment on a sentence I found surprising at the time, discovered in a report claiming that the expected shortfall at the 99% level corresponds...

Read more »

Building Wordclouds in R

August 28, 2015

In this article, I will show you how to use text data to build word clouds in R. We will use a dataset containing around 200k Jeopardy questions. The...

Read more »