This post is different from all the others you’ve seen so far on this site. It is not actually a proper post, but a response to a comment on my previous post, Recommender Systems 101 – a...

In last week’s blog post introducing the new V8 package, I showed how you can use context$eval and context$source to execute JavaScript commands and scripts in R. It turns out that typing context$eval() for each JavaScript command gets annoying very quickly, so the new V8 version 0.3 adds an interactive console...

For our purposes here, data exploration is the application of data visualization and data manipulation techniques to understand the properties of our dataset. We’re going to be looking for interesting features: things that stand out, trends, and relationships between variables. The post Data analysis example with ggplot and dplyr (analyzing ‘supercar’ data, part 2) appeared first on

As John mentioned in his last post, we have been quite interested in the recent study by Fernandez-Delgado et al., “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?” (the “DWN study” for short), which evaluated 179 popular implementations of common classification algorithms over 120 or so data sets, mostly from the UCI … Continue reading...

Looking just now for an openly licensed graphic showing a set of scatterplots that demonstrate different correlations between X and Y values, I couldn’t find one. So here’s a quick R script for constructing one, based on a Cross Validated question/answer (Generate two variables with precise pre-specified correlation): And here’s an example of the result:
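The script itself is not reproduced in this excerpt. As a rough illustration of the idea (two variables whose sample correlation matches a pre-specified value exactly), here is a minimal Python sketch rather than the original R code. It assumes the common residualisation approach: draw two random vectors, strip from the second whatever is linearly explained by the first, then mix the two with weights rho and sqrt(1 - rho^2). The function names `correlated_pair` and `pearson` are hypothetical and chosen only for this example.

```python
import math
import random


def _center(values):
    """Subtract the mean so the vector sums to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def correlated_pair(n, rho, seed=None):
    """Return (x, y) whose sample Pearson correlation equals rho up to rounding.

    Hypothetical helper for illustration; not the script from the original post.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    if n < 3:
        raise ValueError("need at least 3 points")
    rng = random.Random(seed)
    x = _center([rng.gauss(0, 1) for _ in range(n)])
    e = _center([rng.gauss(0, 1) for _ in range(n)])

    # Remove the part of e that is linearly explained by x,
    # leaving a residual that is exactly orthogonal to x.
    beta = _dot(e, x) / _dot(x, x)
    resid = [ei - beta * xi for ei, xi in zip(e, x)]

    # Scale both directions to unit length, then mix them.
    x_norm = math.sqrt(_dot(x, x))
    r_norm = math.sqrt(_dot(resid, resid))
    weight = math.sqrt(1.0 - rho ** 2)
    y = [rho * xi / x_norm + weight * ri / r_norm for xi, ri in zip(x, resid)]
    return x, y


def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    xc, yc = _center(x), _center(y)
    return _dot(xc, yc) / math.sqrt(_dot(xc, xc) * _dot(yc, yc))
```

Calling `correlated_pair(200, 0.6, seed=1)` and passing the result to `pearson` should give 0.6 up to floating-point error. Repeating the call for a grid of correlation values would give the data for a panel of scatterplots like the one described.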

I love cars. The way they sound. The engineering. The craftsmanship. And let’s be honest: fast cars are just fun. Given my love of cars, I frequently watch Top Gear clips on YouTube. A couple of weeks ago, I stumbled across this: Watching the video, I’m thinking, “253 miles per hour? You’ve got to...

A technique succeeds in mathematical physics, not by a clever trick, or a happy accident, but because it expresses some aspect of physical truth (O. G. Sutton)
Imagine three unbalanced coins:
Coin 1: Probability of head=0.495 and probability of tail=0.505
Coin 2: Probability of head=0.745 and probability of tail=0.255
Coin 3: Probability of head=0.095 and … Continue reading...

We teach two software packages, R and SPSS, in Quantitative Methods 101 for psychology freshmen at Bremen University (Germany). Sometimes confusion arises when the software packages produce different results. This may be due to specifics in the implementation of a method or, as in most cases, to different default settings. One of these situations occurs

I spend a good amount of time on the programming Q&A site Stack Overflow (and a smaller amount of time on its statistics sister site, Cross Validated). Recently, a question on Meta Stack Overflow (the website’s discussion forum) caught my attention: it asked whether Stack Overflow had become “more negative” of late. It wasn’t the first...

