Articles by John Mount

Free data science video lecture: debugging in R

April 9, 2016 | John Mount

We are pleased to release a new free data science video lecture: Debugging R code using R, RStudio and wrapper functions. In this 8 minute video we demonstrate the incredible power of R using wrapper functions to catch errors for later reproduction and debugging. If you haven’t tried these techniques ... [Read more...]

A bit on the F1 score floor

April 2, 2016 | John Mount

At the Strata+Hadoop World “R Day” Tutorial (Tuesday, March 29, 2016, San Jose, California) we spent some time on classifier measures derived from the so-called “confusion matrix.” We repeated our usual admonition to not use “accuracy” as a project goal (business people tend to ask for it as it is the word ...
[Read more...]

WVPlots: example plots in R using ggplot2

April 1, 2016 | John Mount

Nina Zumel and I have been working on packaging our favorite graphing techniques in a more reusable way that emphasizes the analysis task at hand over the steps needed to produce a good visualization. The idea is: we sacrifice some of the flexibility and composability inherent to ggplot2 in R ...
[Read more...]

Upcoming Win-Vector LLC appearances

March 23, 2016 | John Mount

Win-Vector LLC will be presenting on statistically validating models using R and data science at two events: the Strata+Hadoop World “R Day” Tutorial (9:00am–5:00pm, Tuesday, March 29, 2016, San Jose, California) and the ODSC San Francisco Meetup (6:30pm–9:00pm, Thursday, March 31, 2016, San Francisco, California). We will share code and examples. Registration required (and Strata is ...
[Read more...]

sample(): “Monkey’s Paw” style programming in R

March 22, 2016 | John Mount

The R functions base::sample and base::sample.int include extra “conveniences” that seem to have no purpose beyond encouraging grave errors. In this note we will outline the problem and a suggested workaround. Obviously the R developers are highly skilled people with good intent, and ...
[Read more...]

More on preparing data

March 18, 2016 | John Mount

The Microsoft Data Science User Group just sponsored Nina Zumel’s presentation “Preparing Data for Analysis Using R”. Microsoft saw Win-Vector LLC’s ODSC West 2015 presentation “Prepping Data for Analysis using R” and generously offered to sponsor improving it and disseminating it to a wider audience. We feel Nina really ...
[Read more...]

Bend or break: strings in R

March 10, 2016 | John Mount

A common complaint from new users of R is: the string processing notation is ugly. Using paste(,,sep='') to concatenate strings seems clumsy. You are never sure which regular expression dialect grep()/gsub() are really using. Remembering the difference between length() and nchar() is initially difficult. As always things ...
[Read more...]

Win-Vector video courses: price/status changes

March 2, 2016 | John Mount

Win-Vector LLC has been offering a couple of online video courses on the topics of data science and A/B testing (both using R). These are high quality courses and well worth the money and time needed to work through them closely (with all materials distributed on GitHub). Our current ... [Read more...]

More Shiny user showcase demonstrations

February 24, 2016 | John Mount

We at Win-Vector LLC are very proud to announce that RStudio just inducted two more of our demonstration Shiny applications into their Shiny User Showcase gallery. Check out the gallery to see our demonstrations of: Finding the k in k-means; A/B test interactive design and analysis tool; The geometry of ...
[Read more...]

Databases in containers

February 8, 2016 | John Mount

A great number of readers reacted very positively to Nina Zumel’s article Using PostgreSQL in R: A quick how-to. Part of the reason is she described an incredibly powerful data science pattern: using a formerly expensive permanent system infrastructure as a simple transient tool. In her case the tools ...
[Read more...]

Free video course: applied Bayesian A/B testing in R

February 4, 2016 | John Mount

As a “thank you” to our blog, mailing list, and Twitter followers (@WinVectorLLC) we at Win-Vector LLC have decided to re-release our formerly fee-based A/B testing video course as a free (advertisement supported) video course here on YouTube. The course emphasizes how to design A/B tests using prior “...
[Read more...]

Shiny Developer Conference

January 31, 2016 | John Mount

Really enjoying RStudio’s Shiny Developer Conference | Stanford University | January 2016. Winston Chang just demonstrated profvis, really slick. You can profile code just by wrapping it in a profvis({}) block and the results are exported as interactive HTML widgets. For example, running the R code below: if(!('profvis' %in% rownames(installed....
[Read more...]

Running R jobs quickly on many machines

January 22, 2016 | John Mount

As we demonstrated in “A gentle introduction to parallel computing in R” one of the great things about R is how easy it is to take advantage of parallel processing capabilities to speed up calculation. In this note we will show how to move from running jobs on multiple CPUs/cores ...
[Read more...]

Win-Vector data science mailing list (and a give-away!)

January 20, 2016 | John Mount

Win-Vector LLC is starting a data science mailing list that we would like you to sign up for. It is going to be a (deliberately infrequent) set of updates including Win-Vector LLC notices, upcoming speaking events, and data science products. To kick this off we will be awarding 5 free permanent ... [Read more...]

Prepping Data for Analysis using R

January 20, 2016 | John Mount

Nina and I are proud to share our lecture: “Prepping Data for Analysis using R” from ODSC West 2015. It is about 90 minutes, and covers a lot of the theory behind the vtreat data preparation library. We also have a GitHub repository including all ...
[Read more...]

Using Excel versus using R

January 15, 2016 | John Mount

Here is a video I made showing how R should not be considered “scarier” than Excel to analysts. One of the takeaway points: it is easier to email R procedures than Excel procedures. Win-Vector’s John Mount shows a simple analysis both in Excel and in R. [Read more...]

Some programming language theory in R

January 1, 2016 | John Mount

Let’s take a break from statistics and data science to think a bit about programming language theory, and how the theory relates to the programming language used in the R analysis platform (the language is technically called “S”, but we are going to just call the whole analysis system “... [Read more...]