Blog Archives

Databases in containers

February 8, 2016

A great number of readers reacted very positively to Nina Zumel’s article Using PostgreSQL in R: A quick how-to. Part of the reason is she described an incredibly powerful data science pattern: using a formerly expensive permanent system infrastructure as a simple transient tool. In her case the tools were the data manipulation grammars SQL … Continue reading...

Read more »

Free video course: applied Bayesian A/B testing in R

February 4, 2016

As a “thank you” to our blog, mailing list, and Twitter followers (@WinVectorLLC), we at Win-Vector LLC have decided to re-release our formerly fee-based A/B testing video course as a free (advertisement supported) video course here on YouTube. The course emphasizes how to design A/B tests using prior “guestimates” of effect sizes (often you have … Continue reading...

Read more »

Shiny Developer Conference

January 31, 2016

Really enjoying RStudio’s Shiny Developer Conference | Stanford University | January 2016. Winston Chang just demonstrated profvis, really slick. You can profile code just by wrapping it in a profvis({}) block and the results are exported as interactive HTML widgets. For example, running the R code below: if(!('profvis' %in% rownames(installed.packages()))) { devtools::install_github('rstudio/profvis') } library('profvis') nrow … Continue reading...

Read more »

Running R jobs quickly on many machines

January 22, 2016

As we demonstrated in “A gentle introduction to parallel computing in R”, one of the great things about R is how easy it is to take advantage of parallel processing capabilities to speed up calculation. In this note we will show how to move from running jobs on multiple CPUs/cores to running jobs on multiple machines (for … Continue reading...

Read more »

Win-Vector data science mailing list (and a give-away!)

January 20, 2016

Win-Vector LLC is starting a data science mailing list that we would like you to sign up for. It is going to be a (deliberately infrequent) set of updates including Win-Vector LLC notices, upcoming speaking events, and data science products. To kick this off we will be awarding 5 free permanent subscriptions to our video … Continue reading...

Read more »

Prepping Data for Analysis using R

January 20, 2016

Nina and I are proud to share our lecture “Prepping Data for Analysis using R” from ODSC West 2015 (Nina Zumel and John Mount, ODSC West 2015). It is about 90 minutes, and covers a lot of the theory behind the vtreat data preparation library. We also have a GitHub repository including all the lecture … Continue reading...

Read more »

A gentle introduction to parallel computing in R

January 18, 2016

Let’s talk about the use and benefits of parallel computation in R. (Image: IBM’s Blue Gene/P massively parallel supercomputer, from Wikipedia.) “Parallel computing is a type of computation in which many calculations are carried out simultaneously.” (Wikipedia, quoting Gottlieb, Allan; Almasi, George S. (1989). Highly parallel computing.) The reason we care is: by making the computer work … Continue reading...

Read more »

Nina Zumel and John Mount part of R Day at Strata + Hadoop World in San Jose 2016

January 17, 2016

Nina Zumel and I are honored to have been invited to be part of Strata + Hadoop World in San Jose 2016 R Day organized by RStudio and O’Reilly. We have written a lot on the topic of model validation in R and we are very excited to distill it down to an exciting tutorial. … Continue reading...

Read more »

Using Excel versus using R

January 15, 2016

Here is a video I made showing how R should not be considered “scarier” than Excel to analysts. One of the takeaway points: it is easier to email R procedures than Excel procedures. Win-Vector’s John Mount shows a simple analysis both in Excel and in R.

Read more »

Some programming language theory in R

January 1, 2016

Let’s take a break from statistics and data science to think a bit about programming language theory, and how the theory relates to the programming language used in the R analysis platform (the language is technically called “S”, but we are going to just call the whole analysis system “R”). Our reasoning is: if you … Continue reading...

Read more »
