Articles by Joseph Rickert

Fun with Simpson’s Paradox: Simulating Confounders

November 21, 2015 | Joseph Rickert

by Bob Horton, Sr Data Scientist, Microsoft. Wikipedia describes Simpson’s paradox as “a trend that appears in different groups of data but disappears or reverses when these groups are combined.” Here is the figure from the top of that article (you can click on the image in Wikipedia then follow ... [Read more...]

Rated R: Recommended Reading

November 19, 2015 | Joseph Rickert

by Joseph Rickert What are you reading? - and what are you recommending to friends, colleagues, and students who want to learn something about R programming? A quick search of Amazon will show that there are several new R books proposed for 2016; but of course, new doesn't necessarily mean better. ... [Read more...]

H2O World 2015

November 12, 2015 | Joseph Rickert

by Joseph Rickert The second annual H2O World conference finished up yesterday. More than 700 people from all over the US attended the three-day event, which was held at the Computer History Museum in Mountain View, California, a venue that pretty much sits well within the blast radius of ground ... [Read more...]

fluent-r: a new R analytics integration library for JVM developers

November 10, 2015 | Joseph Rickert

by David Russell, fluent-r developer. fluent-r is a new R analytics integration library for JVM application developers that improves upon existing solutions for integrating R analytics services delivered by popular open source R integration servers DeployR and OpenCPU. The fluent-r library provides a natural-language DSL alongside a simple API that ... [Read more...]

Accessing Bitcoin Data with R

November 4, 2015 | Joseph Rickert

by Joseph Rickert I am not yet a Bitcoin advocate. Nevertheless, I am impressed with the amount of Bitcoin activity and the progress that advocates are making towards having Bitcoin recognized as a legitimate currency. Right now, I am mostly interested in the technology behind Bitcoin and the possibility of ... [Read more...]

Differential Privacy Mini-series from Win-Vector

November 3, 2015 | Joseph Rickert

by Nina Zumel, Principal Consultant, Win-Vector LLC. We've just finished off a series of articles on some recent research results applying differential privacy to improve machine learning. Some of these results are pretty technical, so we thought it was worth working through concrete examples. And some of the original results ... [Read more...]

Instrumental Variables

October 29, 2015 | Joseph Rickert

by Joseph Rickert We all "know" that correlation does not imply causation, that unmeasured and unknown factors can confound a seemingly obvious inference. But who has not been tempted by the seductive quality of strong correlations? Fortunately, it is also well known that a well-done randomized experiment can account ... [Read more...]

Party with the First Tribe

October 22, 2015 | Joseph Rickert

by Joseph Rickert In a recent post, I wrote about support vector machines, the representative master algorithm of the 5th tribe of machine learning practitioners described by Pedro Domingos in his book, The Master Algorithm. Here we look into algorithms favored by the first tribe, the symbolists, who see ... [Read more...]

The 5th Tribe, Support Vector Machines and caret

October 15, 2015 | Joseph Rickert

by Joseph Rickert In his new book, The Master Algorithm, Pedro Domingos takes on the heroic task of explaining machine learning to a wide audience and classifies machine learning practitioners into 5 tribes*, each with its own fundamental approach to learning problems. To the 5th tribe, the analogizers, Pedro ascribes the ... [Read more...]

Using miniCRAN in Azure ML

October 13, 2015 | Joseph Rickert

by Michele Usuelli, Microsoft Data Scientist. Azure Machine Learning Studio is a drag-and-drop tool to deploy data-driven solutions. It contains pre-built items including data preparation tools and Machine Learning algorithms. In addition, it allows you to include custom R and Python scripts. In order to build powerful R tools, you might ... [Read more...]

R User Groups Highlight R Creativity

October 1, 2015 | Joseph Rickert

by Joseph Rickert I have been a big fan of R user groups since I attended my first meeting. There is just something about the vibe of being around people excited about what they are doing that feels good. From a speaker's perspective, presenting at an R user group meeting ... [Read more...]

Why Big Data? Learning Curves

September 29, 2015 | Joseph Rickert

by Bob Horton, Microsoft Senior Data Scientist. Learning curves are an elaboration of the idea of validating a model on a test set, and have been widely popularized by Andrew Ng’s Machine Learning course on Coursera. Here I present a simple simulation that illustrates this idea. Imagine you use ... [Read more...]

The R Consortium Gears Up For Business

September 24, 2015 | Joseph Rickert

by Joseph Rickert This week, the Infrastructure Steering Committee (ISC) of the R Consortium unanimously elected Hadley Wickham as its chair thereby also giving Hadley a seat on the R Consortium board of directors. Congratulations Hadley!! This is a major step forward towards putting the R Consortium in business. Not ... [Read more...]

Reading Financial Time Series Data with R

September 17, 2015 | Joseph Rickert

by Joseph Rickert In a recent post focused on plotting time series with the new dygraphs package, I did not show how easy it is to read financial data into R. However, in a thoughtful comment to the post, Achim Zeileis pointed out a number of features built into the ... [Read more...]

The New Microsoft Data Science User Group Program

September 10, 2015 | Joseph Rickert

by Joseph Rickert We are very pleased to announce that Microsoft will not only continue Revolution Analytics’ tradition of supporting R user groups worldwide, but is expanding the scope of the user group program. The new 2016 Microsoft Data Science User Group Sponsorship Program is open to all user groups ... [Read more...]

Looking after Datasets

September 1, 2015 | Joseph Rickert

by Antony Unwin, University of Augsburg, Germany. David Moore's definition of data: numbers that have been given a context. Here is some context for the finch dataset: Fig 1: Illustrations of the beaks of four of Darwin's finches from "The Voyage of the Beagle". Note that only one of these (fortis) ... [Read more...]