Blog Archives

Diving into H2O

April 17, 2014

by Joseph Rickert
One of the remarkable features of the R language is its adaptability. Motivated by R’s popularity and helped by its expressive power and transparency, developers working on other platforms display what looks like inexhaustible creativity in providing seamless interfaces to software that complements R’s strengths. The H2O R package that connects to 0xdata’s H2O software (Apache...


Quantitative Finance Applications in R – 5: an Introduction to Monte Carlo Simulation

April 15, 2014

by Daniel Hanson
Last time, we looked at the four-parameter Generalized Lambda Distribution as a method of incorporating skew and kurtosis into an estimated distribution of market returns and capturing the typical fat tails that the normal distribution cannot. That said, the normal distribution can be useful in constructing Monte Carlo simulations, and it is still commonly...


BARUG talks highlight R’s diverse applications

April 10, 2014

by Joseph Rickert
The seven lightning talks presented to the Bay Area useR Group on Tuesday night were not only interesting (in some cases downright entertaining) in their own right; they also illustrated the diversity of R applications and the extent to which R has become embedded in the corporate world. Two presentations with a whimsical touch...


Norm Matloff: Mad(Data)Scientist

April 8, 2014

by Joseph Rickert
Norman Matloff, professor of computer science at UC Davis and founding member of the UCD Dept. of Statistics, has begun posting as Mad(Data)Scientist. (You may know Norm from his book, The Art of R Programming: NSP, 2011.) In his second post (out today) on the new R package, freqparcoord, that he wrote with Yinkang Xie, Norm...


Ensemble Packages in R

April 8, 2014

by Mike Bowles
Mike Bowles is a machine learning expert and serial entrepreneur. This is the second post in what is envisioned as a four-part series that began with Mike's Thumbnail History of Ensemble Models. One of the main reasons for using R is the vast array of high-quality statistical algorithms it makes available. Ensemble methods provide a...


Some R Resources for GLMs

April 3, 2014

by Joseph Rickert
Generalized Linear Models have become part of the fabric of modern statistics, and logistic regression, at least, is a “go-to” tool for data scientists building classification applications. The ready availability of good GLM software and the interpretability of its results make logistic regression a good baseline classifier. Moreover, Paul Komarek argues that, with a...


A look at R vectorization through the Collatz Conjecture

April 1, 2014

by Seth Mottaghinejad, Analytic Consultant for Revolution Analytics
You may have heard before that R is a vectorized language, but what do we mean by that? One way to read that is to say that many functions in R can operate efficiently on vectors (in addition to singletons). Here are some examples: > log(1) # input and output are...


R User Group Activity for Q1 2014

March 27, 2014

by Joseph Rickert
Worldwide R user group activity for the first quarter of 2014 appears to be way up compared to previous years, as the following plot shows. The plot was built by counting the meetings on Revolution Analytics' R Community Calendar. R users continue to value live, in-person events and face-to-face meetings with their peers. Moreover,...


A Thumbnail History of Ensemble Methods

March 25, 2014

by Mike Bowles
Ensemble methods are the backbone of machine learning techniques. However, they can be a daunting subject for someone approaching them for the first time, so we asked Mike Bowles, machine learning expert and serial entrepreneur, to provide some context. Ensemble methods are among the most powerful and easiest-to-use predictive analytics algorithms and R...


Data Sets for Data Science

March 20, 2014

by Joseph Rickert
Recently, I had the opportunity to be a member of a job panel for Mathematics, Economics and Statistics students at my alma mater, CSUEB (California State University East Bay). In the context of preparing for a career in data science, a student at the event asked: “Where can I find good data sets?” This triggered a...
