Blog Archives

Announcing the wrapr package for R

February 11, 2017

Recently, Dirk Eddelbuettel pointed out that our R function debugging wrappers would be more convenient if they were available in a low-dependency micro package dedicated to little else. Dirk is a very smart person, and like most R users we are deeply in his debt; so we (Nina Zumel and I) listened and immediately moved … Continue...

My recent BARUG talk: Parametric Programming in R with replyr

February 9, 2017

I want to share an edited screencast of my rehearsal for my recent San Francisco Bay Area R Users Group talk:

Evolving R Tools and Practices

February 5, 2017

One of the distinctive features of the R platform is how explicit and user controllable everything is. This allows the style of use of R to evolve fairly rapidly. I will discuss this and end with some new notations, methods, and tools I am nominating for inclusion into your view of the evolving “current best … Continue...

Going to Strata / Hadoop World 2017 San Jose?

February 2, 2017

Are you attending or considering attending Strata / Hadoop World 2017 San Jose? Are you interested in learning to use R to work with Spark and h2o? Then please consider signing up for my 3 1/2 hour workshop soon. We are about half full now, but I really want to fill the room, while making … Continue...

Using the Bizarro Pipe to Debug magrittr Pipelines in R

January 29, 2017

I have just finished and released a new R video lecture demonstrating how to use the “Bizarro pipe” to debug magrittr pipelines. I think R dplyr users will really enjoy it. Please read on for the link to the video lecture. In this video lecture I use the “Bizarro pipe” to debug the example pipeline … Continue...

Upcoming Win-Vector LLC public speaking engagements

January 26, 2017

I am happy to announce a couple of exciting upcoming Win-Vector LLC public speaking engagements. BARUG Meetup: Tuesday, February 7, 2017, ~7:50pm, Intuit, Building 20, 2600 Marine Way, Mountain View, CA. Win-Vector LLC’s John Mount will be giving a “lightning talk” (15 minutes) on R calling conventions (standard versus non-standard) and showing how to … Continue...

Upgrading to macOS Sierra (nee OSX) for R users

January 26, 2017

A good fraction of R users use Apple computers. Apple machines historically have sat at a sweet spot of convenience, power, and utility: Convenience: Apple machines are available at retail stores, come with purchasable support, and can run a lot of common commercial software. Power: R packages such as parallel and Rcpp work better on … Continue...

Why do Decision Trees Work?

January 6, 2017

In this article we will discuss the machine learning method called “decision trees”, moving quickly over the usual “how decision trees work” and spending time on “why decision trees work.” We will write from a computational learning theory perspective, and hope this helps make both decision trees and computational learning theory more comprehensible. The goal … Continue...

A Theory of Nested Cross Simulation

January 1, 2017

[Reader’s Note. Some of our articles are applied and some of our articles are more theoretical. The following article is more theoretical, and requires fairly formal notation to even work through. However, it should be of interest as it touches on some of the fine points of cross-validation that are quite hard to perceive or … Continue...

Data Preparation, Long Form and tl;dr Form

December 26, 2016

Data preparation and cleaning are some of the most important steps of predictive analytics and data science tasks. They are laborious, are where most of the errors are made, are your last line of defense against wild data, and hold the biggest opportunities for outcome improvement. No matter how much time you spend on them, they … Continue...
