Blog Archives

Targeted Learning R Packages for Causal Inference and Machine Learning

March 31, 2015

by Sherri Rose, Assistant Professor of Health Care Policy, Harvard Medical School. Targeted learning methods build machine-learning-based estimators of parameters defined as features of the probability distribution of the data, while also providing influence-curve or bootstrap-based confidence intervals. The theory offers a general template for creating targeted maximum likelihood estimators for a data structure, nonparametric or semiparametric statistical model,...

Review of "Hands-On Programming with R"

March 26, 2015

by Joseph Rickert. There have been well over a hundred books on R published within the last ten years. Most of these texts, with titles like “Introductory Statistics with R” or “Time Series with R”, offer the reader a way to jump right in and perform some concrete statistical analysis using R’s myriad built-in functions and extensive visualization features....

R Package 'smbinning': Optimal Binning for Scoring Modeling

March 24, 2015

by Herman Jopia. What is Binning? Binning is the term used in scoring modeling for what is also known in Machine Learning as Discretization, the process of transforming a continuous characteristic into a finite number of intervals (the bins), which allows for a better understanding of its distribution and its relationship with a binary variable. The bins generated by...

A first look at rxBTrees

March 19, 2015

by Joseph Rickert. The gradient boosting machine, as developed by Friedman, Hastie, Tibshirani and others, has become an extremely successful algorithm for dealing with both classification and regression problems and is now an essential feature of any machine learning toolbox. R’s gbm() function (gbm package) is a particularly well-crafted implementation of the gradient boosting machine that served as...

Some thoughts on Vim

March 17, 2015

by Gary R. Moser, Director of Institutional Research and Planning, The California Maritime Academy. I recently contacted Joseph Rickert about inviting Vim guru Drew Neil (web: vimcasts.org, book: "Practical Vim: Edit Text at the Speed of Thought") to speak at the Bay Area R User Group. Due to Drew's living in Great Britain, that might not be easily...

A Monte Carlo Simulation for Pi Day

March 12, 2015

by Joseph Rickert. What will you be doing at 26 minutes and 53 seconds past 9 this coming Saturday morning? I will probably be running simulations. I have become obsessed with an astounding result from number theory and have been trying to devise Monte Carlo simulations to get at it. The result, well known to number theorists, says: choose...

R User Group Activity

March 5, 2015

by Joseph Rickert. R user group activity is still on the rise. The following plot of the number of R user group meetings listed on Revolution Analytics' Community Calendar over the most recent 114 weeks shows a slight upward trend along with a couple of annual cycles. Predictably, meetings trail off in the summer months and again late...

Plotly Graphs with Domino’s New R Notebook

March 3, 2015

by Matt Sundquist, co-founder of Plotly. Domino's new R Notebook and Plotly's R API let you code, make interactive R and ggplot2 graphs, and collaborate entirely online. Here is the Notebook in action: Published R Notebook. To execute this Notebook, or to build your own, head to Domino's Plotly Project. The GIF below shows how to get started: choose...

Collaborative Computing with distcomp

February 26, 2015

by Joseph Rickert. Distcomp, a new R package available on GitHub from a group of Stanford researchers, has the potential to significantly advance the practice of collaborative computing with large data sets distributed over separate sites that may be unwilling to explicitly share data. The fundamental idea is to be able to rapidly set up a web service based...

Some R Conferences in 2015

February 19, 2015

by Joseph Rickert. For the past few years, the Strata + Hadoop World Conference in San Jose has kicked off my personal conference season. With its focus on Data Science, Strata always seems to present some interesting R-related talks, and I am looking forward to the various events over the next couple of days. But, Strata and other...
