Blog Archives

Top 2 Packages for Newly Hired Data Scientists

July 9, 2015

library(NewCo knowledge)
function (X, FUN, ..., ) {
    FUN <- Read the business wires +
           Go to lunch with wide range of people +
           Read the 10-K and maybe 10-Q +
           Find a go-to source for “stupid questions”
    else Ignorant
}
library(credibility)
function (X, FUN, ..., ) {
    FUN <- Double-check all assumptions +
           Underpromise +
           Save counterintuitive findings for last +
           Find a...


Finding Similar European Soccer Clubs (with R & Shiny)

March 17, 2015

Are you a die-hard supporter of one European soccer (football) team (club)? Having a rough season, or just want to watch more matches with passion? This European Team Finder analyzed 126 attributes of the top-flight teams in the marquee n...


Tableau 9.0 Connects Directly to R Data Files

March 11, 2015

Tableau 9.0 will be released soon. Tableau 8 already integrates with some R functionality, but 9.0 actually allows direct connection to R data files. Tableau continues to remove friction between itself and R, further justifying its superior Gartner ...


R’s Tricky == Operator, or "It depends on what the meaning of the word ‘is’ is"

February 11, 2015

One scenario where R can trip up a programmer is when using the == operator or its relatives. The help page notes that "NA values are regarded as non-comparable", which introduces some potentially unexpected behavior. As a toy example, look what happens...
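A minimal sketch of the NA behavior described above. The vector `x` is made up for illustration and is not the toy example from the original post.

```r
# Hypothetical toy vector with one missing value.
x <- c(1, NA, 3)

# == propagates NA instead of returning FALSE for the missing element.
cmp <- x == 1
print(cmp)              # TRUE NA FALSE

# Logical subsetting with an NA index keeps an NA element in the result.
subset_naive <- x[x == 1]
print(subset_naive)     # 1 NA

# Two NA-safe alternatives:
idx <- which(x == 1)                   # which() silently drops NA
subset_safe <- x[!is.na(x) & x == 1]   # explicit NA guard
print(idx)
print(subset_safe)
```

Whether NA should count as "not equal" or as "unknown" depends on the analysis, so the guard is worth writing out explicitly rather than relying on a default.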


First Day of the Month, Using R

December 29, 2014

Future-proofing is an important concept when designing automated reports. Over time, you can accumulate so many periods of data that your charts start to look overcrowded. You can solve this by limiting the num...
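One way to get the first day of a month in base R, sketched under the assumption that the report should show a rolling window of recent months. The helper name `first_of_month` and the 12-month window are illustrative choices, not taken from the original post.

```r
# Hypothetical helper: truncate any date to the first day of its month.
first_of_month <- function(d) {
  as.Date(format(as.Date(d), "%Y-%m-01"))
}

report_date <- as.Date("2014-12-29")
month_start <- first_of_month(report_date)
print(month_start)      # "2014-12-01"

# Month starts for a rolling 12-month window, oldest first.
window_starts <- rev(seq(month_start, by = "-1 month", length.out = 12))
print(window_starts)
```

Filtering the report data to dates on or after `window_starts[1]` keeps the chart at a fixed number of periods no matter how much history accumulates.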


FIFA 15 Analysis with R

September 26, 2014

Several months ago, I used R to analyze professional soccer players based on their attributes from the video game FIFA 14. Now that FIFA 15 is upon us, let's take a similar look. FIFA 15 is a video game by EA Sports that mimics the experience of managing and playing for a soccer team. The game uses the likenesses and attributes...


A Look at Random Seeds in R… Or: “85, why can’t you be more like 548?”

August 17, 2014

Have you ever wondered whether the set.seed() function in R has any quirkiness? This analysis was inspired by a Stack Overflow posting by Wolfgang, and I incorporate some of his code. For each seed (1-1000, for this analysis), I took the mean and standard deviation of the first 1,000 random numbers. Then I got the percent of the...
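The per-seed summary described above could be sketched roughly as follows. The excerpt does not say which distribution was sampled, so `rnorm()` is an assumption here, and this is not the code from the Stack Overflow post.

```r
# For each seed 1..1000, summarize the first 1,000 random draws.
seeds <- 1:1000

seed_stats <- as.data.frame(t(sapply(seeds, function(s) {
  set.seed(s)
  draws <- rnorm(1000)  # assumption: standard normal draws
  c(seed = s, mean = mean(draws), sd = sd(draws))
})))

# Seeds whose sample mean lands closest to the theoretical mean of 0.
print(head(seed_stats[order(abs(seed_stats$mean)), ]))
```

Because `set.seed()` makes the draws reproducible, rerunning the block returns the same table, which makes it easy to compare individual seeds.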


R is short for SSIS

May 18, 2014

Data scientists often identify a need to join data from different, unlinked servers. One standard tool for accomplishing this is an SSIS package to consolidate the data onto one of the servers. For the analyst who wants to...


Assign n Email Addresses to x Cells, Intrinsically (Part II)

March 27, 2014

Part I showed the concept and general technique of a method of assigning n email addresses to x cells pseudo-randomly, without the need for maintaining a log of each assignment. The earlier post considered the basic case of each cell being assigned approximately the same quantity of email addresses. In practice, cell sizes often vary. Below is a technique that...


Assign n Email Addresses to x Cells, Intrinsically

March 5, 2014

Sample Use Case: Marketing requests that an email address list be divided randomly into a given number of cells so that each cell would receive a different version of...
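One possible way to assign addresses to cells without keeping a log is to derive the cell from the address itself. The sketch below uses a simple rolling hash in base R. The excerpt is truncated before it describes the post's actual technique, so the function `assign_cell`, the hash constants, and the example addresses are all hypothetical.

```r
# Hypothetical helper: map an email address to one of n_cells cells.
# The same address always lands in the same cell, so no log is needed.
assign_cell <- function(email, n_cells) {
  codes <- utf8ToInt(tolower(email))  # case-insensitive character codes
  h <- 0
  for (k in codes) {
    h <- (h * 31 + k) %% 1000003      # simple rolling hash (assumption)
  }
  h %% n_cells + 1                    # cell number in 1..n_cells
}

# Made-up addresses for illustration only.
emails <- c("ann@example.com", "bob@example.com",
            "cat@example.com", "dan@example.com")
cells <- vapply(emails, assign_cell, numeric(1), n_cells = 4,
                USE.NAMES = FALSE)
print(data.frame(email = emails, cell = cells))
```

A rolling hash this simple is only roughly uniform, so for a real campaign it would be worth checking the cell counts on the full list before sending.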
