Blog Archives

Running R jobs quickly on many machines

January 22, 2016

As we demonstrated in “A gentle introduction to parallel computing in R”, one of the great things about R is how easy it is to take advantage of parallel processing capabilities to speed up calculation. In this note we will show how to move from running jobs on multiple CPUs/cores to running jobs on multiple machines (for … Continue reading...


Win-Vector data science mailing list (and a give-away!)

January 20, 2016

Win-Vector LLC is starting a data science mailing list that we would like you to sign up for. It is going to be a (deliberately infrequent) set of updates including Win-Vector LLC notices, upcoming speaking events, and data science products. To kick this off we will be awarding 5 free permanent subscriptions to our video … Continue reading...


Prepping Data for Analysis using R

January 20, 2016

Nina and I are proud to share our lecture “Prepping Data for Analysis using R” from ODSC West 2015 (Nina Zumel and John Mount, ODSC West 2015). It is about 90 minutes, and covers a lot of the theory behind the vtreat data preparation library. We also have a GitHub repository including all the lecture … Continue reading...


A gentle introduction to parallel computing in R

January 18, 2016

Let’s talk about the use and benefits of parallel computation in R. (Image: IBM’s Blue Gene/P massively parallel supercomputer, from Wikipedia.) “Parallel computing is a type of computation in which many calculations are carried out simultaneously.” (Wikipedia, quoting Gottlieb, Allan; Almasi, George S. (1989). Highly parallel computing.) The reason we care is: by making the computer work … Continue reading...


Nina Zumel and John Mount part of R Day at Strata + Hadoop World in San Jose 2016

January 17, 2016

Nina Zumel and I are honored to have been invited to be part of Strata + Hadoop World in San Jose 2016 R Day organized by RStudio and O’Reilly. We have written a lot on the topic of model validation in R and we are very excited to distill it down to an exciting tutorial. … Continue reading...


Using Excel versus using R

January 15, 2016

Here is a video I made showing how R should not be considered “scarier” than Excel to analysts. One of the takeaway points: it is easier to email R procedures than Excel procedures. Win-Vector’s John Mount shows a simple analysis both in Excel and in R.


Some programming language theory in R

January 1, 2016

Let’s take a break from statistics and data science to think a bit about programming language theory, and how the theory relates to the programming language used in the R analysis platform (the language is technically called “S”, but we are going to just call the whole analysis system “R”). Our reasoning is: if you … Continue reading...


An R function return and assignment puzzle

December 29, 2015

Here is an R programming puzzle. What does the following code snippet actually do? And even harder: what does it mean? (See here for some material on the difference between what code does and what code means.)

    f <- function() { x <- 5 }
    f()

In R version 3.2.3 (2015-12-10) -- "Wooden Christmas-Tree" the … Continue reading...


Practical Data Science with R examples

December 11, 2015

One of the big points of Practical Data Science with R is to supply a large number of fully worked examples. Our intent has always been for readers to read the book and, if they wanted to follow up on a data set or technique, to find the matching worked examples in the project directory … Continue reading...


Sequential Analysis

December 11, 2015

We here at Win-Vector LLC have been working through an ad-hoc series about A/B testing, combining elements of both operations research and statistical points of view: “A dynamic programming solution to A/B test design”; “Why does designing a simple A/B test seem so complicated?”; “A clear picture of power and significance in A/B tests”; Bandit Formulations … Continue reading...

