Articles by John Mount

R Tip: Check What Repos You are Using

February 2, 2020 | John Mount

In a lot of our R writing we casually say “install from CRAN using install.packages('PKGNAME')” or “update your packages by using update.packages(ask = FALSE, checkBuilt = TRUE) (and answering ‘no’ to all questions about compiling).” We recently became aware that for some users this isn’t complete advice. ... [Read more...]

Data re-Shaping in R and in Python

January 28, 2020 | John Mount

Nina Zumel and I have two new tutorials on fluid data wrangling/shaping. They are written in a parallel structure, with the R version of the tutorial being almost identical to the Python version of the tutorial. This reflects our opinion on the “which is better for data science ... [Read more...]

wrapr 1.9.6 is now up on CRAN

January 26, 2020 | John Mount

wrapr 1.9.6 is now up on CRAN. We unfortunately usually forget to say this. A big thank you to the staff and volunteers at CRAN. As part of this release Nina Zumel has streamlined the unpack vignette, picking and recommending specific notations for the unpack method. We are looking forward to ... [Read more...]

Why we wrote wrapr to/unpack

January 22, 2020 | John Mount

One reason we are developing the wrapr to/unpack methods is the following: we wanted to spruce up the R vtreat interface a bit. We had recently back-ported a Python sklearn Pipeline step style interface from the Python vtreat to R (announcement here). But that doesn’t mean we are ... [Read more...]

unpack Your Values in R

January 20, 2020 | John Mount

I would like to introduce an exciting feature in the upcoming 1.9.6 version of the wrapr R package: value unpacking. The unpacking notation is made available if you install wrapr version 1.9.6 from GitHub: remotes::install_github("WinVector/wrapr"). We will likely send this version to CRAN in a couple of weeks. ... [Read more...]

sklearn Pipe Step Interface for vtreat

January 14, 2020 | John Mount

We’ve been experimenting with this for a while, and the next R vtreat package will have a back-port of the Python vtreat package sklearn pipe step interface (in addition to the standard R interface). This means the user can easily express modeling intent by choosing between coder$fit_... [Read more...]

New vtreat Feature: Nested Model Bias Warning

January 11, 2020 | John Mount

For quite a while we have been teaching that estimating variable re-encodings on the exact same data that is later naively used to train a model leads to an undesirable nested model bias. The vtreat package (both the R version and the Python version) incorporates a cross-frame method that allows ... [Read more...]

Introduction to Data Science in R, Free for 3 days

December 30, 2019 | John Mount

To celebrate the new year and the recent release of Practical Data Science with R 2nd Edition, we are offering a free coupon for our video course “Introduction to Data Science.” The following URL and code should get you permanent free access to the video course, if used between now ... [Read more...]

What is a Second Edition?

December 24, 2019 | John Mount

What is a second edition of a book to its authors? In some sense it is the book the authors thought they were writing the first time. With some good fortune a second edition can be much more than that. For our example: Nina and I received a lot ... [Read more...]

Why to try Practical Data Science with R, 2nd Edition

December 22, 2019 | John Mount

I thought we would try to express why somebody interested in using the R language (and package ecosystem) for supervised machine learning, data wrangling, analytics projects, and other data science topics should give Practical Data Science with R, 2nd Edition a try. Nina Zumel and I shared the book with ... [Read more...]

What is new for rquery December 2019

December 7, 2019 | John Mount

Our goal has been to make rquery the best query generation system for R (and to make data_algebra the best query generator for Python). Let's see what rquery is good at, and what new features are making rquery better. The idea is: the query is a first class citizen ... [Read more...]
