It pains me to admit it, but even though I had visited their site, created...

Even though it’s still at version 0.4, the ggvis package has quite a bit of functionality and is highly useful for exploratory data analysis (EDA). I wanted to see how geographical visualizations would work under it, so I put together six examples that show how to use various features of ggvis for presenting static &

I’ve posted a new release of ggRandomForests: Visually Exploring Random Forests to CRAN (http://cran.r-project.org/package=ggRandomForests). The biggest news is the inclusion of some holiday reading – a ggRandomForests package vignette, titled “ggRandomForests: Visually Exploring a Random Forest for Regression”. The vignette…

In last week’s blog post introducing the new V8 package, I showed how you can use context$eval and context$source to execute JavaScript commands and scripts in R. It turns out that typing context$eval() for each JavaScript command gets annoying very quickly, so the new V8 version 0.3 adds an interactive console...

For our purposes here, data exploration is the application of data visualization and data manipulation techniques to understand the properties of our dataset. We’re going to be looking for interesting features: things that stand out, trends, and relationships between variables.

As John mentioned in his last post, we have been quite interested in the recent study by Fernandez-Delgado et al., “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?” (the “DWN study” for short), which evaluated 179 popular implementations of common classification algorithms over 120 or so data sets, mostly from the UCI …

Looking just now for an openly licensed graphic showing a set of scatterplots that demonstrate different correlations between X and Y values, I couldn’t find one. So here’s a quick R script for constructing one, based on a Cross Validated question/answer (Generate two variables with precise pre-specified correlation): And here’s an example of the result:
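The script and the image are not reproduced in this excerpt. As a rough sketch of the idea only, and not necessarily the approach the original script took, one way to get an exact sample correlation is to mix x with a noise vector that has been made uncorrelated with x. The helper name `make_correlated`, the seed, the sample size, and the list of correlations below are all illustrative assumptions.

```r
# Sketch: build y so that the sample correlation cor(x, y) equals rho
# (up to floating-point error). `make_correlated` is a hypothetical helper.
make_correlated <- function(x, rho) {
  noise <- rnorm(length(x))
  # Residuals of noise ~ x have zero sample covariance with x.
  perp <- residuals(lm(noise ~ x))
  # Weight the two uncorrelated parts so the correlation comes out as rho.
  rho * sd(perp) * x + sqrt(1 - rho^2) * sd(x) * perp
}

set.seed(1)                              # illustrative seed
x <- rnorm(100)                          # illustrative sample size
rhos <- c(-0.9, -0.5, 0, 0.5, 0.9, 1)    # illustrative correlations
ys <- lapply(rhos, function(r) make_correlated(x, r))

# One scatterplot per target correlation, laid out in a 2 x 3 grid.
op <- par(mfrow = c(2, 3), mar = c(2, 2, 3, 1))
for (i in seq_along(rhos)) {
  plot(x, ys[[i]], main = paste("r =", rhos[i]),
       xlab = "", ylab = "", pch = 16)
}
par(op)
```

Because the noise is orthogonalized against x before mixing, the correlation is fixed by construction rather than only in expectation, which is what makes the panels comparable across runs.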

