EARL2015 Conference, Boston – Full Agenda Announced

July 16, 2015

By Liz Matthews, Marketing and Events UK. We are delighted to announce the full agenda for the EARL2015 Conference, which takes place at the Microsoft NERD Center in Boston in November. The conference will feature: 4 pre-conference workshops, 4 Keynote …

R, Extreme Value Statistics and Missing Data

July 16, 2015

by Joseph Rickert. June was a hot month for extreme statistics and R. Not only did we close out the month with useR! 2015, but two small conferences in the middle of the month brought experts together from all over the world to discuss two very difficult areas of statistics that generate quite a bit of R code. The...

New version of assertive and answers to tutorial exercises

July 16, 2015

I gave a tutorial at useR on testing R code, which turned out to be a great way of getting feedback on my code! Based on the suggestions by attendees, I’ve made a big update to the package, which is now on CRAN. Full details of the new features can be accessed in the ?changes

Getting that X with the Glog function and Lambert’s W

July 16, 2015

Facing a simple, yet frustrating formula like this and the task to solve it for x left me googling around for hours until I found salvation in Wolfram Alpha, Wikipedia, and a nice blog post with R-syntax to solve a similar equation. Using the results from Wolfram Alpha I was able to find the solution with the ‘gsl’

Murphy diagrams in R

July 16, 2015

At the recent International Symposium on Forecasting, held in Riverside, California, Tilmann Gneiting gave a great talk on “Evaluating forecasts: why proper scoring rules and consistent scoring functions matter”. It will be the subject of an IJF invited paper in due course. One of the things he talked about was the “Murphy diagram” for comparing forecasts,

RcppEigen 0.3.2.5.0

July 15, 2015

A new release of RcppEigen arrived on CRAN and in Debian yesterday. It synchronizes the Eigen code with the 3.2.5 upstream release. Once again, Yixuan Qiu did most of the heavy lifting in one very nice pull request, and I added some minor updates to ...

Working with Sessionized Data 2: Variable Selection

July 15, 2015

In our previous post in this series, we introduced sessionization, or converting log data into a form that’s suitable for analysis. We looked at basic considerations, like dealing with time, choosing an appropriate dataset for training models, and choosing appropriate (and achievable) business goals. In that previous example, we sessionized the data by considering all …

Article Spotlight: Persistent data storage in Shiny apps

July 15, 2015

The articles section on shiny.rstudio.com has lots of great advice for Shiny developers. A recent article by Dean Attali demonstrates how to save data from a Shiny app to persistent storage structures, like local files, servers, databases, and more. When you do this, your data remains after the app has closed, which opens new doors

Seeing Data as the Product of Underlying Structural Forms

July 15, 2015

Matrix factorization follows from the realization that nothing forces us to accept the data as given. We start with objects placed in rows and record observations on those objects arrayed along the top in columns. Neither the objects nor the measuremen...

St Swithun’s Day simulator

July 15, 2015

I got a bit bored (sorry Mike), and wrote this. It didn’t take long (I tell you that not so much to cover my backside as to celebrate the majesty of R). First, I estimated probabilities of a day being …

Leave the Pima Indians alone!

July 14, 2015

“…our findings shall lead us to be critical of certain current practices. Specifically, most papers seem content with comparing some new algorithm with Gibbs sampling, on a few small datasets, such as the well-known Pima Indians diabetes dataset (8 covariates). But we shall see that, for such datasets, approaches that are even more basic than

Easy Bayesian Bootstrap in R

July 14, 2015

A while back I wrote about how the classical non-parametric bootstrap can be seen as a special case of the Bayesian bootstrap. Well, one difference between the two methods is that, while it is straightforward to roll a classical bootstrap in R, there is no easy way to do a Bayesian bootstrap. This post, in an attempt to...
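
To make the idea in the excerpt above concrete, here is a minimal base R sketch of a Bayesian bootstrap for a sample mean. It is not the function from the post: the data vector, the number of draws and the name bb_means are all made up for illustration. The only technique assumed is the standard one of drawing Dirichlet(1, ..., 1) weights over the observations, here obtained by normalising exponential draws.

```r
set.seed(42)

# Made-up sample, used only for illustration
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8)
n_draws <- 4000

# Each draw: Dirichlet(1, ..., 1) weights over the observations,
# then the weighted mean of the data under those weights.
bb_means <- replicate(n_draws, {
  w <- rexp(length(x))   # Exp(1) draws ...
  w <- w / sum(w)        # ... normalised, they are Dirichlet(1, ..., 1)
  sum(w * x)             # weighted mean for this draw
})

# Posterior summary of the mean
quantile(bb_means, c(0.025, 0.5, 0.975))
```

Where the classical bootstrap resamples whole observations, this version reweights them continuously, which is why every draw stays between the smallest and largest observed value.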

The 2015 Big Data Summit, 9-10 August 2015, collocated with ACM KDD 2015, Sydney

July 14, 2015

URL: http://2015.bigdatasummit.co/ We take this opportunity to invite you to participate in the 2015 Big Data Summit: • Co-located with ACM KDD2015 • Plenary sessions …

MazamaSpatialUtils — Ebola Map Example

July 14, 2015

This entry is part 16 of 16 in the series Using R. The MazamaSpatialUtils package on CRAN has just been updated with additional shape file conversion scripts and location buffering so that points located just outside of polygons (i.e. coastal …

Chronicles from useR! – “the funny side”

July 14, 2015

Dear R users, ten days after the conference here you ar

A Simple Intro to Bayesian Change Point Analysis

July 14, 2015

The purpose of this post is to demonstrate change point analysis by stepping through an example of change point analysis in R presented in Rizzo’s excellent, comprehensive, and very mathy book, Statistical

R 101 – Aggregate By Quarter

July 14, 2015

We were asked a question on how to (in R) aggregate quarterly data from what I believe was a daily time series. This is a pretty common task and there are many ways to do this in R, but we’ll focus on one method using the zoo and dplyr packages. Let’s get those imports out of the way: library(dplyr), library(zoo), library(ggplot2). Now, we need...
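
As a rough sketch of the kind of aggregation the excerpt describes (not the post's own code), the snippet below rolls a daily series up to quarterly totals with zoo and dplyr. The data frame daily, its columns date and value, and the result name quarterly are hypothetical; the data are made up so that each quarter's total is just its number of days.

```r
library(dplyr)
library(zoo)

# Hypothetical daily series: one observation per day of 2014, value fixed at 1
daily <- data.frame(
  date  = seq(as.Date("2014-01-01"), as.Date("2014-12-31"), by = "day"),
  value = 1
)

quarterly <- daily %>%
  # zoo::as.yearqtr maps each date to its quarter; format() turns it into
  # a plain label such as "2014 Q1" that is safe to group on
  mutate(quarter = format(as.yearqtr(date))) %>%
  group_by(quarter) %>%
  summarise(total = sum(value))

print(quarterly)
```

Swapping sum() for mean() or another summary gives quarterly averages instead of totals.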

5 Steps to Create an R Package Email Course

July 14, 2015

by Ari Lamstein, Software Engineer and Data Analyst. Creating an email course for my R packages has significantly increased the number of people who use the packages. It has also reduced the learning curve for the packages and brought me into greater contact with my users. In this post I will share the 5 steps I took to create...

Spark 1.4 for RStudio

July 14, 2015

Today’s guest post is written by Vincent Warmerdam of GoDataDriven and is reposted with Vincent’s permission from blog.godatadriven.com. You can learn more about how to use SparkR with RStudio at the 2015 EARL Conference in Boston, November 2-4, where Vincent will be speaking live. This document contains a tutorial on how to provision a spark

Overview of R Workshops at London EARL 2015

July 14, 2015

EARL (Effective Applications of the R Language) is a Conference organised by Mango Solutions for users and developers of the open source R programming language. The primary focus of the Conference will be the commercial usage of R across a range …

ChainLadder 0.2.1 released

July 14, 2015

Over the weekend we released version 0.2.1 of the ChainLadder package for claims reserving on CRAN. New features: a new function PaidIncurredChain by Fabio Concina, based on the 2010 Merz & Wüthrich paper “Paid-incurred chain claims reserving method”. The functions plot.MackChainLadder and plot.BootChainLadder gained a new argument, 'which', allowing users to specify which sub-plot to display. Thanks to Christophe...

What (Really) is a Data Scientist?

July 13, 2015

What is a data scientist? What makes for a good (or great!) data scientist? It’s been challenging enough to determine what a data scientist really is (several people have proposed ways to

Using R to analyze pro motorcycle racing

July 13, 2015

EMC recently ran a competition to find out why John McGuinness, the legendary motorcycle racer known as the "Morecambe Missile", outperforms the average motorcycle racer. To answer this question, EMC instrumented his bike and his suit with a number of real-time sensors. (Data collected included gear and RPM for the bike, and heart rate and acceleration for the...

“Don’t invert that matrix” – why and how

July 13, 2015

The first time I read John Cook’s advice “Don’t invert that matrix,” I wasn’t sure how to follow it. I was familiar with manipulating matrices analytically (with pencil and paper) for statistical derivations, but not with implementation details in software. …
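
One common reading of that advice, sketched here in base R and not taken from the post itself, is to solve the linear system A x = b directly instead of forming the inverse of A and multiplying. The matrix A, the vector b and the result names are made up for illustration; A is constructed to be symmetric positive definite so that a Cholesky-based solve is also valid.

```r
set.seed(1)

# Made-up, well-conditioned symmetric positive definite system
A <- crossprod(matrix(rnorm(25), 5, 5)) + diag(5)
b <- rnorm(5)

# Explicit inverse followed by a multiplication (what the advice warns against)
x_inverse <- solve(A) %*% b

# Solve A x = b directly, without ever forming the inverse
x_solve <- solve(A, b)

# For symmetric positive definite A: Cholesky factor, then two triangular solves
R <- chol(A)                                    # A = t(R) %*% R, R upper triangular
x_chol <- backsolve(R, forwardsolve(t(R), b))   # solve t(R) y = b, then R x = y

all.equal(as.vector(x_inverse), x_solve)
all.equal(x_solve, as.vector(x_chol))
```

All three give the same answer on a small, well-conditioned example like this one; the direct solves generally do less work and tend to be more accurate when A is poorly conditioned.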

RStudio and GitHub

July 13, 2015

Version control has become essential for me to keep track of projects, as well as for collaborating. It allows backup of scripts and easy collaboration on complex projects. RStudio works really well with Git, an open source distributed version control system, and GitHub, a web-based Git repository hosting service. I always forget how to

Scatter Plots with Marginal Densities – An Example for Doing Exploratory Data Analysis with Tableau and R

July 13, 2015

Introduction: One of the first stages in most data analysis projects is about exploring the data at hand. During this stage the analyst tries to get familiar with his dataset by looking at summary statistics, feature distributions and relationships be...

Examining the Accuracy of Fantasy Football Projections with an Interactive Scatterplot in R

July 13, 2015

In prior posts, we presented the accuracy of different analysts in projecting football players’ performance, finding that the average was more accurate than any individual analyst. In this post, we present...

RSiteCatalyst Version 1.4.4 Release Notes

July 13, 2015

It’s been about six months since the last RSiteCatalyst update, and this update is really just a single bug fix, but a big bug fix at that! Sparse Data = Opaque Error Messages Numerous people have reported receiving an error message from RSiteCatalyst similar to the following: 'names' attribute must be the same length

RcppGSL 0.2.5

July 13, 2015

A new version of RcppGSL arrived on CRAN a couple of days ago. This package provides an interface from R to the GNU GSL using our Rcpp package. In the course of preparation for the higher-performance R via C++ course I gave in Zuerich last month, I o...
