Articles by John Mount

Could not Resist

April 29, 2019 | John Mount

Also, Practical Data Science with R, 2nd Edition; Zumel, Mount; Manning 2019 is now content complete! It is deep into editing and soon into production!

Data Layout Exercises

John Mount, Nina Zumel; Win-Vector LLC 2019-04-27 In this note we will use five real life examples to demonstrate data layout transforms using the cdata R package. The examples for this note are all demo-examples from tidyr/demo/, and are mostly based on questions posted to StackOverflow. They represent ...

Practical Data Science with R Book Update (April 2019)

April 22, 2019 | John Mount

I thought I would give a personal update on our book: Practical Data Science with R 2nd edition; Zumel, Mount; Manning 2019. The second edition should be fully available this fall! Nina and I have finished up through chapter 10 (of 12), and Manning has released previews of up through chapter 7 (with more ...

Controlling Data Layout With cdata

April 16, 2019 | John Mount

Here is an example of how easy it is to use cdata to re-layout your data. Tim Morris recently tweeted the following problem (corrected). Please will you take pity on me #rstats folks? I only want to reshape two variables x & y from wide to long! Starting with: d xa xb ...

Piping is Method Chaining

April 14, 2019 | John Mount

What R users now call piping, popularized by Stefan Milton Bache and Hadley Wickham, is inline function application (this is notationally similar to, but distinct from the powerful interprocess communication and concurrency tool introduced to Unix by Douglas McIlroy in 1973). In object oriented languages this sort of notation for function ...

R Photo

April 10, 2019 | John Mount

A good friend is now a professor at the University of Auckland and knew to photograph and send us this. Thanks!!!

Practical Data Science with R Book Update

April 8, 2019 | John Mount

A good friend shared with us a great picture of Practical Data Science with R, 1st Edition hanging out in Cambridge at the MIT Press Bookstore. This is as good an excuse as any to share a book update. Nina Zumel and I (John Mount) are busy revising chapters 10 and 11 ...

Not Always C++’s Fault

April 6, 2019 | John Mount

From the recent developer.r-project.org “Staged Install” article: Incidentally, there were just two distinct (very long) lists of methods in the warnings across all installed packages in my run, but repeated for many packages. It turned out that they were lists of exported methods from dplyr and rlang packages. ...

Why RcppDynProg is Written in C++

April 5, 2019 | John Mount

The (matter of opinion) claim: “When the use of C++ is very limited and easy to avoid, perhaps it is the best option to do that […]” (source discussed here) got me thinking: does our own RcppDynProg package actually use C++ in a significant way? Could/should I port it to ...

What are the Popular R Packages?

April 4, 2019 | John Mount

“R is its packages”, so to know R we should know its popular packages (CRAN). Or put it another way: as R is a typical “the reference implementation is the specification” programming environment there is no true “de jure” R, only a de facto R. To look at popular R ...

C++ is Often Used in R Packages

April 3, 2019 | John Mount

The recent r-project article “Use of C++ in Packages” stated as its own summary of recommendation: don’t use C++ to interface with R. A careful reading of the article exposes at least two possible meanings of this: Don’t use C++ to directly call R or directly manipulate R ...

Standard Evaluation Versus Non-Standard Evaluation in R

April 2, 2019 | John Mount

There is a lot of unnecessary worry over “Non Standard Evaluation” (NSE) in R versus “Standard Evaluation” (SE, or standard “variable names refer to values” evaluation). This very author is guilty of over-discussing the issue. But let’s give this yet another try. The entire difference between NSE and regular ...

Operator Notation for Data Transforms

March 25, 2019 | John Mount

As of cdata version 1.0.8, cdata implements an operator notation for data transforms. The idea is simple, yet powerful. First let’s start with some data. d ...

How cdata Control Table Data Transforms Work

March 23, 2019 | John Mount

With all of the excitement surrounding cdata style control table based data transforms (the cdata ideas being named as the “replacements” for tidyr’s current methodology, by the tidyr authors themselves!) I thought I would take a moment to describe how they work. cdata defines two primary data manipulation operators: ...

Why we Did Not Name the cdata Transforms wide/tall/long/short

March 22, 2019 | John Mount

We recently saw this UX (user experience) question from the tidyr author as he adapts tidyr to cdata techniques. The terminology that he is not adopting from cdata is “unpivot_to_blocks()” and “pivot_to_rowrecs()”. One of the research ideas in the cdata package is that the important thing ...

Tidyverse users: gather/spread are on the way out

March 19, 2019 | John Mount

From https://twitter.com/sharon000/status/1107771331012108288: From https://tidyr.tidyverse.org/dev/articles/pivot.html: There are two important new features inspired by other R packages that have been advancing of reshaping in R: The reshaping operation can be specified with a data frame that describes precisely how metadata stored ...

Quantifying R Package Dependency Risk

March 18, 2019 | John Mount

We recently commented on excess package dependencies as representing risk in the R package ecosystem. The question remains: how much risk? Is low dependency a mere talisman, or is there evidence it is a good practice (or at least correlates with other good practices)? Well, it turns out we can ...

wrapr::let()

March 16, 2019 | John Mount

I would like to once again recommend to our readers our note on wrapr::let(), an R function that can help you eliminate many problematic NSE (non-standard evaluation) interfaces (and their associated problems) from your R programming tasks. The idea is to imitate the following lambda-calculus idea: let x be ...

Software Dependencies and Risk

March 15, 2019 | John Mount

Dirk Eddelbuettel just shared an important point on software and analyses: dependencies are hard-to-manage risks. If your software or research depends on many complex and changing packages, you have no way to establish your work is correct. This is because to establish the correctness of your work, you ...

Unit Tests in R

March 13, 2019 | John Mount

I am collecting here some notes on testing in R. There seems to be a general (false) impression among non R-core developers that to run tests, R package developers need a test management system such as RUnit or testthat. And a further false impression that testthat is the only R ...

R-bloggers

R news and tutorials contributed by hundreds of R bloggers
