Blog Archives

Partying R Style with Sqor Sports, R on Azure, and data.table

November 20, 2014

by Joseph Rickert
We usually have a pretty good time at the monthly Bay Area useR Group (BARUG) meetings, but this month's meeting was a bit more of a party than usual. The very well-connected PR team at Sqor Sports, our host company for the evening, secured San Francisco's très trendy 111 Minna Gallery for the venue. There...

Introduction to Revolution R Enterprise For Big Data Analytics on datacamp.com

November 18, 2014

by Jeremy Reynolds, Senior R Trainer, Revolution Analytics
Last week, Revolution Analytics released its first massive open online course through a partnership with datacamp.com: Introduction to Revolution R Enterprise for Big Data Analytics. You can sign up for the free course here. This course provides a look at some of the tools provided by the RevoScaleR package that ships...

A look at the igraph package

November 13, 2014

by Joseph Rickert
The igraph package has become a fundamental tool for the study of graphs and their properties, the manipulation and visualization of graphs, and the statistical analysis of networks. To get an idea of just how firmly igraph has become embedded in the R package ecosystem, consider that igraph currently lists 72 reverse depends, 59 reverse imports...

3D Plots with ggplot2 and Plotly

November 11, 2014

by Matt Sundquist, co-founder, Plotly
Plotly is a platform for data analysis, graphing, and collaboration. You can use ggplot2, Plotly's R API, and Plotly's web app to make and share interactive plots. Now, you can also make 3D plots. Immediately below are a few examples of 3D plots. In this post we will show how to make...

Looking into a very messy data set

November 6, 2014

by Joseph Rickert
I recently had the opportunity to look at the data used for the 2009 KDD Cup competition. Two sets of files are still available from this competition. The "large" set is a series of five .csv files that, when concatenated, form a data set with 50,000 rows and 15,000 columns. The "small"...

A Look at the World Values Survey

November 4, 2014

by Peggy Fan, Ph.D. Candidate at Stanford's Graduate School of Education
Part of my dissertation at Stanford Graduate School of Education, International Comparative Education program, looks at the World Values Survey (WVS), a cross-national social survey that started in 1981. Since then there have been six waves, and the surveys include questions that capture the demographic, behaviors, personal...

Some R Highlights from the Bay Area Data Science Camp and Unconference

October 30, 2014

by Joseph Rickert
The San Francisco Bay Area Chapter of the Association for Computing Machinery (ACM) has been holding an annual Data Mining Camp and "unconference" since 2009. This year, to reflect the times, the group held a Data Science Camp and unconference, and we at Revolution Analytics were, once again, very happy to be a sponsor for the...

Type III tests and R

October 28, 2014

by Terry M. Therneau, Ph.D., Faculty, Mayo Clinic
About a year ago there was a query on the R help list about how to do "type 3" tests for a Cox model, which someone wanted because SAS does it. The SAS addition looked suspicious to me, but as the author of the survival package I thought I should understand...

A first look at Distributed R

October 23, 2014

by Joseph Rickert
One of the most interesting R-related presentations at last week’s Strata Hadoop World Conference in New York City was the session on Distributed R by Sunil Venkayala and Indrajit Roy, both of HP Labs. In short, Distributed R is an open source project with the end goal of running R code in parallel on data...

The Generalized Lambda Distribution and GLDEX Package for Fitting Financial Return Data – Part 2

October 14, 2014

Part 2 of a series by Daniel Hanson, with contributions by Steve Su (author of the GLDEX package)
Recap of Part 1
In our previous article, we introduced the four-parameter Generalized Lambda Distribution (GLD) and looked at fitting a 20-year set of returns from the Wilshire 5000 Index, comparing the results of two methods, namely the Method of Moments,...
