Blog Archives

Efficient accumulation in R

July 27, 2015
By
Efficient accumulation in R

R has a number of very good packages for manipulating and aggregating data (plyr, sqldf, ScaleR, data.table, and more), but when it comes to accumulating results the beginning R user is often at sea. The R execution model is a bit exotic so many R users are very uncertain which methods of accumulating results are … Continue reading...

Read more »

A dynamic programming solution to A/B test design

July 6, 2015
By

Our last article on A/B testing described the scope of the realistic circumstances of A/B testing in practice and gave links to different standard solutions. In this article we will be take an idealized specific situation allowing us to show a particularly beautiful solution to one very special type of A/B test. For this article … Continue reading...

Read more »

What is a good Sharpe ratio?

June 27, 2015
By

We have previously written that we like the investment performance summary called the Sharpe ratio (though it does have some limits). What the Sharpe ratio does is: give you a dimensionless score to compare similar investments that may vary both in riskiness and returns without needing to know the investor’s risk tolerance. It does this … Continue reading...

Read more »

A bit about Win-Vector LLC

June 26, 2015
By

Win-Vector LLC is a consultancy founded in 2007 that specializes in research, algorithms, data-science, and training. (The name is an attempt at a mathematical pun.) Win-Vector LLC can complete your high value project quickly (some examples), and train your data science team to work much more effectively. Our consultants include the authors of Practical Data … Continue reading...

Read more »

R in a 64 bit world

June 8, 2015
By
R in a 64 bit world

32 bit data structures (pointers, integer representations, single precision floating point) have been past their “best before date” for quite some time. R itself moved to a 64 bit memory model some time ago, but still has only 32 bit integers. This is going to get more and more awkward going forward. What is R … Continue reading...

Read more »

My favorite R bug

May 23, 2015
By
My favorite R bug

In this note am going to recount “my favorite R bug.” It isn’t a bug in R. It is a bug in some code I wrote in R. I call it my favorite bug, as it is easy to commit and (thanks to R’s overly helpful nature) takes longer than it should to find. The … Continue reading...

Read more »

What is new in the vtreat library?

May 7, 2015
By
What is new in the vtreat library?

The Win-Vector LLC vtreat library is a library we supply (under a GPL license) for automating the simple domain independent part of variable cleaning an preparation. The idea is you supply (in R) an example general data.frame to vtreat’s designTreatmentsC method (for single-class categorical targets) or designTreatmentsN method (for numeric targets) and vtreat returns a … Continue reading...

Read more »

What can be in an R data.frame column?

April 9, 2015
By
What can be in an R data.frame column?

As an R programmer have you every wondered what can be in a data.frame column? The documentation is a bit vague, help(data.frame) returns some comforting text including: Value A data frame, a matrix-like structure whose columns may be of differing types (numeric, logical, factor and character and so on). If you ask an R programmer … Continue reading...

Read more »

New video course: Campaign Response Testing

April 8, 2015
By
New video course: Campaign Response Testing

I am proud to announce a new Win-Vector LLC statistics video course: Campaign Response Testing John Mount, Win-Vector LLC This course works through the very specific statistics problem of trying to estimate the unknown true response rates one or more populations in responding to one or more sales/marketing campaigns or price-points. This is an old … Continue reading...

Read more »

How and why to return functions in R

April 3, 2015
By
How and why to return functions in R

One of the advantages of functional languages (such as R) is the ability to create and return functions “on the fly.” We will discuss one good use of this capability and what to look out for when creating functions in R. Why wrap/return functions? One of my favorite uses of “on the fly functions” is … Continue reading...

Read more »