Blog Archives

Some Impressions from R Finance 2015

June 4, 2015

by Joseph Rickert The R/Finance 2015 Conference wrapped up last Saturday at UIC. It has been seven years already, but R/Finance still has the magic: mostly very high-quality presentations and the opportunity to interact and talk shop with some of the most accomplished R developers, financial modelers and even a few industry legends such as Emanuel Derman...

Using Azure as an R datasource: Part 2 – Pulling data from MySQL/MariaDB

June 2, 2015

by Gregory Vandenbrouck Software Engineer, Microsoft This post is the second in a series that covers pulling data from various Windows Azure hosted storage solutions (such as MySQL, or Microsoft SQL Server) to an R client on Windows or Linux. Last time we covered pulling data from SQL Azure to an R client on Windows. This time we’ll be...

RevoScaleR’s Naive Bayes Classifier rxNaiveBayes()

May 28, 2015

by Joseph Rickert Because of its simplicity and good performance over a wide spectrum of classification problems, the Naïve Bayes classifier ought to be on everyone's short list of machine learning algorithms. Now, with version 7.4, we have a high-performance Naïve Bayes classifier in Revolution R Enterprise too. Like all Parallel External Memory Algorithms (PEMAs) in the RevoScaleR...

Situational Baseball: Analyzing Runs Potential Statistics

May 26, 2015

by Mark Malter A few weeks ago, I wrote about my Baseball Stats R Shiny application, where I demonstrated how to calculate runs expectancies based on the 24 possible bases/outs states for any plate appearance. In this article, I’ll explain how I expanded on that to calculate the probability of winning the game, based on the current score/inning/bases/outs state....

First Day Highlights from the Extremely Large Databases Conference

May 21, 2015

by Joseph Rickert The 8th XLDB (Extremely Large Databases) Conference opened at Stanford on Tuesday with an outstanding program. This conference has been providing leadership in the "Big Data" world since its first workshop, which was held in 2007. For example, the summary report for that year notes: "Both communities (industry and science) are moving towards parallel ... architectures...

Fast parallel computing with Intel Phi coprocessors

May 19, 2015

by Andrew Ekstrom Recovering physicist, applied mathematician and graduate student in applied Stats and systems engineering We know that R is a great system for performing statistical analysis. The price is quite nice too ;-) . As a graduate student, I need a cheap replacement for Matlab and/or Maple. Well, R can do that too. I’m running a large...

A first look at htmlwidgets

May 14, 2015

by Joseph Rickert A strong case can be made that base R graphics, supplemented with either the lattice library or ggplot2 for plotting by subgroups, provides everything a statistician might need both for exploratory data analysis and for developing clear, crisp graphics for communicating results. However, it is abundantly clear that web based graphics, driven to a large extent by...

Using Azure as an R data source, Part 1

May 12, 2015

by Gregory Vandenbrouck Software Engineer at Microsoft This post is the first in a series that covers pulling data from various Windows Azure hosted storage solutions (such as MySQL, or Microsoft SQL Server) to an R client on Windows or Linux. We’ll start with a relatively simple case of pulling data from SQL Azure to an R client on...

Digging up embedded plots

May 7, 2015

by Joseph Rickert The following multi-panel graph, which graces the cover of the most recent issue of the Journal of Computational and Graphical Statistics (JCGS) (Vol 24, Num 1, March 2015), is from the paper by Grolemund and Wickham entitled Visualizing Complex Data With Embedded Plots. The four plots are noteworthy for a couple of reasons: They present superb...

Data Science in HR

May 5, 2015

by Joseph Rickert Last year, in a post on interesting R topics presented at the JSM, I described how data scientists in Google's human resources department were using R and predictive analytics to better understand the characteristics of its workforce. Google may very well have done the pioneering work, but predictive analytics for HR applications is going mainstream. In...