## Interview with a Data Scientist (Hadley Wickham)

August 2, 2015
Originally posted on Models are illuminating and wrong: I recently interviewed Hadley Wickham, the creator of ggplot2 and a famous R Stats person. He works for RStudio and his job is to work on Open Source software aimed at Data Geeks. Hadley is famous for his contributions to Data Science tooling and inspires a lot…

## Two New R Packages – qrencoder & passwordrandom

August 2, 2015
Believe it or not, there are two questions on @StackOverflowR about how to make QR codes in R. I personally think QR codes are kinda hokey, but...

## Playing with leafletR

August 2, 2015
Coming back from a bicycle ride in Tarn, I wanted to have a look at the trip. This gave me the opportunity to learn how to use the R...

## Seattle’s Fremont Bridge Bicyclists Again in the News

August 2, 2015
Back in 2013, David Smith did an analysis of bicycle trips across Seattle’s Fremont Bridge. More recently, Jake Vanderplas (a contributor to Python’s very popular scikit-learn package) wrote a nice...

## [ggtree] annotate phylogenetic tree with local images

August 1, 2015
In ggtree, we provide a function, annotation_image, for annotating a phylogenetic tree with images. To demonstrate the usage, I created a tree view from a random tree as shown below:

## Streamgraphs in R

July 31, 2015
It's not easy to visualize a quantity that varies over time and which is composed of more than two subsegments. Take, for example, this stacked bar chart of religious...

## Rendering LaTeX Math Equations in GitHub Markdown

July 31, 2015
The Problem: GitHub README.md won't render LaTeX. I have many times wondered about getting LaTeX math to render in a README file on GitHub. Apparently, many others ( 1,...

## Building a user interface for spatstat

July 30, 2015
RStudio developed shiny, an R package that, quoting from their website, “makes it super simple for R users...

## 15 Questions All R Users Have About Plots

July 30, 2015
R allows you to create different plot types, ranging from the basic graph types like density plots, dot plots, bar charts, line charts, pie charts, boxplots and scatter plots,...

## MRAN’s Packages Spotlight

July 30, 2015
by Joseph Rickert. New R packages just keep coming. The following plot, constructed with information from the monthly files on Dirk Eddelbuettel's CRANberries site, shows a plot of the...

## The little mixed model that could, but shouldn’t be used to score surgical performance

July 30, 2015
The Surgeon Scorecard: Two weeks ago, the world of medical journalism was rocked by the public release of ProPublica’s Surgeon Scorecard. In this project ProPublica “calculated death and complication...

## Empirical bias analysis of random effects predictions in linear and logistic mixed model regression

July 30, 2015
In the first technical post in this series, I conducted a numerical investigation of the biasedness of random effect predictions in generalized linear mixed models (GLMM), such as the ones...

## #rstats Make arrays into vectors before running table

July 29, 2015
Setup of Problem: While working with nifti objects from the oro.nifti package, I tried to table the values of the image. The table took a long time to compute. I...

## R Oddities: Strings in DataFrames

July 29, 2015
Have you ever read a file into R and then encountered strange problems filtering and sorting because the strings were converted to factors? For...

## But I Don’t Want to Be a Statistician!

July 29, 2015
"For a long time I have thought I was a statistician.... But as I have watched mathematical statistics evolve, I have had cause to wonder and to doubt.... All...

## Mapping the past and the future with Leaflet

July 29, 2015
I have been working on mapping things for a while and I must say that I really like the Leaflet package from RStudio. It makes it very easy and...

## Player Value Gap Assessment

July 29, 2015
Looking at fantasy football projections, we have a group of experts providing their views on how a player will do during the football season. We have collected projections from...

## Predict Social Network Influence with R and H2O Ensemble Learning

July 29, 2015
What is H2O? H2O is an awesome machine learning framework. It is really great for data scientists and business analysts “who need scalable and fast machine learning”. H2O is...

## Hockey Elbow and Other Response Time Injuries

July 29, 2015
You've heard of tennis elbow. Well, there's a non-sports, performance injury that I like to call hockey elbow. An example of such an "injury" is shown in Figure...

## The most popular programming languages on StackOverflow

July 29, 2015
by Andrie de Vries. Last week, IEEE Spectrum said R rose to #6 in Top Programming languages. They use a weighted methodology of 12 factors to compute their score....

## Introducing the nominatim geocoding package

July 29, 2015
In the never-ending battle for truth, justice and publishing more R packages than Oliver, I whipped out an R package for the OpenStreetMap Nominatim API. It actually hits the...

## Computing AIC on a Validation Sample

July 29, 2015
This afternoon, we’ve seen in the training on data science that it was possible to use AIC criteria for model selection. > library(splines) > AIC(glm(dist ~ speed, data=train_cars, family=poisson(link="log")))...

## Mongolite 0.5: authentication and iterators

July 28, 2015
A new version of the mongolite package has appeared on CRAN. Mongolite builds on jsonlite to provide a simple, high-performance...

## I loved this %>% crosstable

July 28, 2015
This is a public thank you for @heatherturner's contribution. Now the SciencesPo's crosstable can work in a chain (%>%) fashion; useful for using along with other packages that have...

## Pluto: To Catch an Icy King

July 28, 2015
Sly as a fox, it is. Mysterious and diminutive, it has eluded us for decades. Despite what we've learned about Pluto, constant debate continues to rage over its classification....

## Goals for the New R Consortium

July 28, 2015
by Bob Muenchen. The recently-created R Consortium consists of companies that are deeply involved in R such as RStudio, Microsoft/Revolution Analytics, Tibco, and others. The Consortium’s goals include advancing...

## R tutorial on the Apply family of functions

July 28, 2015
Introduction: In our previous tutorial, Loops in R: Usage and Alternatives, we discussed one of the most important constructs in programming: the loop. Eventually we deprecated the usage of loops in...

## Modelling Occurrence of Events, with some Exposure

July 28, 2015
This afternoon, an interesting point was raised, and I wanted to get back on it (since I did publish a post on that same topic a long time ago)....