Articles by John Mount

Data Manipulation Corner Cases

March 10, 2019 | John Mount

Let’s try some "ugly corner cases" for data manipulation in R. Corner cases are examples where the user might be running to the edge of where the package developer intended their package to work, and thus often where things can go wrong. Let’s see what happens when we ...


Starting With Data Science: A Rigorous Hands-On Introduction to Data Science for Engineers

March 6, 2019 | John Mount

Starting With Data Science: a rigorous hands-on introduction to data science for engineers. Win-Vector LLC is now offering a 4-day on-site intensive data science course. The course targets engineers familiar with Python and introduces them to the basics of current data science practice. This is designed as an interactive ...

rquery Substitution

March 2, 2019 | John Mount

The rquery R package has several places where the user can ask for what they have typed in to be substituted for a name or value stored in a variable. This becomes important as many of the rquery commands capture column names from un-executed code. So knowing if something is ...

Binning Data in a Database

February 28, 2019 | John Mount

Roz King just wrote an interesting article on binning data (a common data analytics step) in a database. He compares a case-based approach (where the bin divisions are stuffed into code) with a join-based approach. He shares code and timings. Best of all: rquery gets some attention and turns ...


“If You Were an R Function, What Function Would You Be?”

February 26, 2019 | John Mount

We’ve been getting some good uptake on our piping in R article announcement. The article is necessarily a bit technical. But one of its key points comes from the observation that piping into names is a special opportunity to give general objects the following personality quiz: “If you were ...

R Journal Volume 10/2, December 2018 is out!

February 25, 2019 | John Mount

We forgot to say: R Journal Volume 10/2, December 2018 is out! A huge thanks to the editors who work very hard to make this possible. And a big “thank you” to the editors, referees, and journal for helping improve, and for including, our note on pipes in R.


More on Macros in R

February 23, 2019 | John Mount

I recently ran into something interesting on the R macros/quasi-quotation/substitution/syntax front. Romain François: “.@_lionelhenry reveals planned double curly syntax At #satRdayParis as a possible replacement, addition to !! and enquo()” It appears !! is no longer the last word in substitution (it certainly wasn’t the first). The described ...


Getting Started With rquery

February 20, 2019 | John Mount

To make getting started with rquery (an advanced query generator for R) easier we have re-worked the package README for various data-sources (including SparkR!). Here are our current examples: rquery and MonetDBLite; rquery and RPostgreSQL; rquery and RSQLite; rquery and SparkR; rquery and sparklyr. For the MonetDBLite the query diagrammer ...


Playing With Pipe Notations

February 19, 2019 | John Mount

Recently Hadley Wickham prescribed pronouncing the magrittr pipe as “then” and using right-assignment as follows: I am not sure if it is a good or bad idea. But let’s play with it a bit, and perhaps readers can submit their experience and opinions in the comments section. Right assignment ...


Query Generation in R

February 16, 2019 | John Mount

R users have been enjoying the benefits of SQL query generators for quite some time, most notably using the dbplyr package. I would like to talk about some features of our own rquery query generator, concentrating on derived result re-use. Introduction: SQL represents value use by nesting. To use a ...


PDSwR2 Free Excerpt and New Discount Code

February 14, 2019 | John Mount

Manning has a new discount code and a free excerpt of our book Practical Data Science with R, 2nd Edition. This section is elementary, but things really pick up speed later on (also available in a paid preview).

cdata Control Table Keys

February 11, 2019 | John Mount

In our cdata R package and training materials we emphasize record-oriented thinking and how to design a transform control table. We now have an exciting new feature: control table keys. The user can now control which columns of a cdata control table are the keys, including now using ...

Function Objects and Pipelines in R

February 3, 2019 | John Mount

Composing functions and sequencing operations are core programming concepts. Some notable realizations of sequencing or pipelining operations include: Unix’s |-pipe; CMS Pipelines; F#’s forward pipe operator |>; Haskell’s Data.Function & operator; the R magrittr forward pipe; and Scikit-learn’s sklearn.pipeline.Pipeline. The idea is: many important calculations can ...

Fully General Record Transforms with cdata

January 20, 2019 | John Mount

One of the design goals of the cdata R package is that very powerful and arbitrary record transforms should be convenient and take only one or two steps. In fact it is the goal to take just about any record shape to any other in two steps: first convert to ...

Make Teaching R Quasi-Quotation Easier

January 17, 2019 | John Mount

To make teaching R quasi-quotation easier it would be nice if R string-interpolation and quasi-quotation both used the same notation. They are related concepts. So some commonality of notation would actually be clarifying, and help teach the concepts. We will define both of the above terms, and demonstrate the relation ...


R Tip: Use Inline Operators For Legibility

January 14, 2019 | John Mount

R Tip: use inline operators for legibility. A Python feature I miss when working in R is the convenience of Python’s inline + operator. In Python, + does the right thing for some built-in data types: It concatenates lists: [1,2] + [3] is [1, 2, 3]. It concatenates strings: 'a' + 'b' is 'ab'. …
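
The two Python behaviors mentioned in this excerpt can be checked directly. This is a minimal sketch; the variable names are illustrative and do not come from the original article.

```python
# Python's built-in + operator on two built-in types, as described in the excerpt.
# Variable names here are hypothetical, chosen only for this illustration.
combined_list = [1, 2] + [3]   # + concatenates lists
combined_text = 'a' + 'b'      # + concatenates strings

print(combined_list)  # [1, 2, 3]
print(combined_text)  # ab
```

In both cases + builds a new object and leaves its operands unchanged.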

Practical Data Science with R, 2nd Edition discount!

January 12, 2019 | John Mount

Please help share our news and this discount. The second edition of our best-selling book Practical Data Science with R (Zumel, Mount) is featured as deal of the day at Manning. The second edition isn’t finished yet, but chapters 1 through 4 are available in the Manning Early Access Program (MEAP), ...


R Tip: Use seqi() For Indexes

January 11, 2019 | John Mount

R Tip: use seqi() for indexing. R’s “1:0 trap” is a mal-feature that confuses newcomers and is a reliable source of bugs. This note will show how to use seqi() to write more reliable code and document intent. The issue is, contrary to expectations (formed in working with other programming ...

A Beautiful 2 by 2 Matrix Identity

January 8, 2019 | John Mount

While working on a variation of the RcppDynProg algorithm we derived the following beautiful identity of 2 by 2 real matrices: Here the superscript “top” denotes the transpose operation, ||.||^2_2 denotes the sum-of-squares norm, and the single |.| denotes the determinant. This is derived from one of the check equations for the Moore–Penrose ...


Timing the Same Algorithm in R, Python, and C++

January 6, 2019 | John Mount

While developing the RcppDynProg R package I took a little extra time to port the core algorithm from C++ to both R and Python. This means I can time the exact same algorithm implemented nearly identically in each of these three languages. So I can extract some comparative “apples to ...


R-bloggers

R news and tutorials contributed by hundreds of R bloggers
