## dplyr: A gamechanger for data manipulation in R

August 19, 2014

I demonstrate how to use dplyr for data manipulation in R (R code and data on GitHub). I had heard of the package before and finally gave it a try after attending Hadley Wickham's presentation at useR! in LA a couple of months ago. dplyr will change y...

## GBMs are awesome: Part I

August 19, 2014

GBMs have become my favorite type of model over the last two years. In this tutorial, I demonstrate how to use a GBM for binary classification in R (predicting...

August 19, 2014

After two pre-releases in the last few days, Conrad finalised a new Armadillo version 4.400 today. I had kept up with the pre-releases, tested twice against all eighty (!!) CRAN dependents...

## I like you and you like me…but what does it all mean. (Part 1)

August 19, 2014

Tinder is a popular matchmaking application that allows users to connect with others with whom they share a physical attraction. New members build their profile by importing their age, gender, geographic information,...

## Recent Articles

August 19, 2014

I have uploaded a few papers I have written and presented at some national conferences over the past several years.  Currently, all the articles relate to election research.

## Integrating R with production systems using an HTTP API

August 19, 2014

By Nick Elprin, Co-Founder of Domino Data Lab. We built a platform that lets analysts deploy R code to an HTTP server with one click, and we describe it...

## Hijacking R Functions: Changing Default Arguments

August 19, 2014

I am working on a package to collect common regular expressions into a canned collection that users can easily use without having to know regexes. The package, qdapRegex, has...

## Visualize pre-post comparison of intervention #rstats

August 19, 2014

My sjPlot-package was just updated on CRAN, introducing a new function called sjp.emm.int to plot estimated marginal means (least-squares means) of linear models with interaction terms. Or: plotting adjusted...

## Introducing Rfiglet: ASCII logos from the comfort of R

August 19, 2014

The Rfiglet Package: For those who don't know what figlet is, it's a command line utility for creating ASCII logos.  Rfiglet, therefore, is a set of R bindings for...

## Transform point shapefile to SpatStat object

August 19, 2014

Today I wanted to do some point pattern analysis in R using the fantastic package spatstat. The problem was that I only had a point shapefile, so I googled a...

August 19, 2014

Earlier this week we released googleVis 0.5.5 on CRAN. The package provides an interface between R and Google Charts, allowing you to create interactive web charts from R. This...

## analyze the programme for the international assessment of adult competencies (piaac) with r

August 19, 2014

heaven knows we've all been there: you're in a heated argument with some patriotic zealot who thinks (insert country here) has the best labor force on earth.  you know...

## Data Cleaning is a critical part of the Data Science process

August 18, 2014

A New York Times article yesterday discovers the 80-20 rule: that 80% of a typical data science project is sourcing, cleaning, and preparing the data, while the remaining 20%...

## A Conversation with Tal Galili at useR! 2014

August 18, 2014

“One can acquire everything in solitude except character.” ― Stendhal. The Interview: Tal Galili is,...

## Announcing the DSLA Podcast!

August 18, 2014

You’ve asked and we’ve listened. The audio content from our DataScience.LA interviews will now be...

## What are the Odds of an Independent Scotland?

August 18, 2014

“For things to remain the same, everything must change.” (Gattopardo by Giuseppe Tomasi di Lampedusa) In less than a month, Scots will decide if they want Scotland tied or...

## Example 2014.10: Panel by a continuous variable

August 18, 2014

In Example 8.40, side-by-side histograms, we showed how to generate histograms for some continuous variable, for each level of a categorical variable in a data set. An anonymous...

## A Hammer Trading System — Demonstrating Custom Indicator-Based Limit Orders in Quantstrat

August 18, 2014

So several weeks ago, I decided to listen in on a webinar (and I myself will be giving one on using quantstrat …

## Goodbye static graphs, hello shiny, ggvis, rmarkdown (vs JS solutions)

August 18, 2014

One of the very exciting and promising developments from RStudio is the rmarkdown/shiny/ggvis combination of tools. We’re on the verge of static graphs and presentations being as old-fashioned as...

## GEFCom 2014 energy forecasting competition is underway

August 17, 2014

GEFCom 2014 is the most advanced energy forecasting competition ever organized, both in terms of the data involved, and in terms of the way the forecasts will be evaluated....

## Hayward/San Leandro Housing Prices

I’ve done a previous post about the salaries of data scientists, but now I’m going to look at one of the negative sides of all the high salaries generated by the...

## Changes to FSA — Estimating Abundance

August 17, 2014

I mentioned previously that I have been updating the Mark-Recapture vignettes.  That has morphed into a document that is an update of the Mark-Recapture Closed and Open vignettes and...

## Quicksort speed, just in time compiling and vectorizing

August 17, 2014

I was reading the Julia documentation the other day. They do speed comparisons to other languages. Obviously R does not come out very well. The R code for quicksort...

## A Look at Random Seeds in R… Or: “85, why can’t you be more like 548?”

August 17, 2014

Have you ever wondered whether the set.seed() function in R has any quirkiness? This analysis was inspired by a Stack Overflow posting by Wolfgang and I incorporate...

## A Matrix Powers Package, and Some General Edifying Material on R

August 16, 2014

Here I will introduce matpow, a package to flexibly and conveniently compute matrix powers.  But even if you are not interested in matrices, I think many of you will...

## Search for CRAN, GitHub and BioConductor packages at Rdocumentation.org

August 15, 2014

If you're looking for just the right package to solve your R problem, you could always browse through the list of available packages on CRAN. But with almost 6000...

## Reasonable Inheritance of Cluster Identities in Repetitive Clustering

August 15, 2014

… or Inferring Identity from Observations. Let’s assume the following application: A conservation organisation starts a project to geographically catalogue the remaining representatives of an endangered plant species. For...