Blog Archives

R User Groups Continue to Grow

April 1, 2013

by Joseph Rickert

R user groups seem to be sprouting all over. Since last September we have noticed ten new groups worldwide:

Auckland, New Zealand: Auckland-R-Users-Group (AKLRUG) had 33 people attend their March 8th meeting

Chang Mai, Thailand: Chang Mai is the first R user group in Thailand

Durban, South Africa: The Durban R User Group is looking forward...


Lots of data != "Big Data"

March 28, 2013

by Joseph Rickert

When talking with data scientists and analysts who work with large-scale data analytics platforms such as Hadoop about the best way to do some sophisticated modeling task, it is not uncommon for someone to say, "We have all of the data. Why not just use it all?" This sort of comment often...


R’s Garden of Probability Distributions

March 21, 2013

by Joseph Rickert

If you type ?Distributions at the R console you get a list of the 21 probability distributions included in the stats package that ships with base R. The same list appears in the Introduction to R Manual on CRAN and in most of the many fine introductory books available for the R language. These are indeed...


Data Science Education gets personal

March 14, 2013

by Joseph B. Rickert

It is difficult to imagine that there is anyone on the planet with an internet connection and a desire to learn something new who has not at least looked into taking a massive open online course (MOOC). Last fall, in an 11/4/12 article, the New York Times declared the Year of the MOOC and quoted...


A Review of the R Graphics Cookbook

February 11, 2013

A common criticism of R, especially from data scientists who are new to R but proficient in multiple programming languages, is that R is “quirky” and annoying because there is almost always more than one way to do simple things. I usually counter that they are trying to say that R is “flexible” and “rich”, but by the time...


Benchmarking bigglm

November 13, 2012

By Joseph Rickert

In a recent blog post, David Smith reported on a talk that Steve Yun and I gave at STRATA in NYC about building and benchmarking Poisson GLM models on various platforms. The results presented showed that the rxGlm function from Revolution Analytics’ RevoScaleR package running on a five-node cluster outperformed a MapReduce/Hadoop implementation...


Simulating the Birthday Problem with data derived probabilities

June 6, 2012

You've probably heard of the Birthday Paradox: it only takes a small gathering of people before it's quite likely that two of them share the same birthday. You can solve the problem analytically or with simulation, but usually in either case simplifying assumptions are made (no-one born on February 29, for example). Joe Rickert uses Revolution R Enterprise 6...


Simple tools for building a recommendation engine

April 19, 2012

By Joseph Rickert

Revolution’s resident economist, Saar Golde, is very fond of saying that “90% of what you might want from a recommendation engine can be achieved with simple techniques”. To illustrate this point (without doing a lot of work), we downloaded the million-row movie dataset from www.grouplens.org with the idea of just taking the first obvious exploratory step:...


Coefplot: New Package for Plotting Model Coefficients

January 3, 2012

By Joseph Rickert

Even to the practiced eye, looking at coefficients in R model summaries can be tedious. And capturing information about the significance of coefficients from scores or maybe even hundreds of models, in a way that makes writing the final report a bit easier, is a time-consuming and thankless task. Of course, once you know what...


Review of ‘R in Action’ by Robert I. Kabacoff

December 20, 2011

By Joseph Rickert

Yesterday, the cosmic randomizer placed me next to a newly minted lawyer in a crowded Los Gatos coffee shop. In three minutes of conversation I learned that the fellow was interested in corporate law, was about to take a job that would give him a seat in the great VC/start-up game and that he had...
