What’s New in 6.2: Stepwise Regression for Big Data

March 26, 2013
By Thomas Dinsmore

This is the third in a series of posts highlighting new features in Revolution R Enterprise Release 6.2, which is scheduled for General Availability April 22. This week's post features our new Stepwise Regression capability. The Stepwise process starts with a specified model and then sequentially adds to or removes from the model the variable that...

Read more »

Python vs R vs SPSS … Can’t All Programmers Just Get Along?

March 26, 2013

Programmers have long been very proud of and loyal to their tools, and often very vocal about them. This has led to well-contested rivalries and “fights” about which tool is better: emacs or vi; Java or C++; Perl or Python; Django or Rails; …

Read more »

ChainLadder 0.1.5-6 released on CRAN

March 26, 2013

Last week we released version 0.1.5-6 of the ChainLadder package on CRAN. The ChainLadder package provides statistical models that are typically used for the estimation of outstanding claims reserves in general insurance. The package vignette gives an overview of the package functionality. (Figure: output of plot(MackChainLadder(GenIns)).) Since the last CRAN release...

Read more »

i Before e Except After c

March 26, 2013

When I went to school, we were always taught the “i before e, except after c” rule for spelling. But how accurate is this rule? Kevin Marks tweeted the following today: »@uberfacts: There are 923 words in the English language that break the “I before E” rule. Only 44 words actually follow that rule.« Science— Kevin Marks (@kevinmarks)

Read more »

Python vs R vs SPSS … Can’t All Programmers Just Get Along?

March 26, 2013

Programmers have long been very proud of and loyal to their tools, and often very vocal about them. This has led to well-contested rivalries and "fights" about which tool is better: emacs or vi; Java or C++; Perl or Python; Django or Rails; and, for data geeks, the SAS/SPSS/R/Matlab fight. The truth is, very few of us data geeks (data scientists, data analysts, statisticians, or...

Read more »

A Contest of the Flyer Variety

March 25, 2013

Guess what?! Charlie decided to institute another round of the epic Flyer Contest! Here’s how it works in 5 easy steps… Print out one of the two flyers below (or this link: here) Post it somewhere public. Examples include: a message board in your department, at your local comic shop, on your mom’s fridge, or

Read more »

Significant P-Values and Overlapping Confidence Intervals

March 25, 2013

There are all sorts of problems with p-values and confidence intervals, and I have neither the intention nor the time to cover all of them right now. However, a big problem is that most people have no idea what p-values really mean. Here is one example of a common problem with p-values and how it relates

Read more »

R – Defining Your Own Color schemes for HeatMaps

March 25, 2013

This post is intended for R beginners and is inspired by a small post in Martin's bioblog. First, we plot a "correlation heatmap" using the same logic that Martin uses. In our example, let's use the Movies dataset that comes with ggplot...

Read more »

Computing Maritime Routes in R

March 25, 2013

Thanks to the attention my paper on the cost of Somali piracy has received, a lot of people have approached me to ask how I computed the maritime routes. It is not a very difficult task using R. The key ingredient is a map of the world that can be rasterized into a grid; all

Read more »

Massive online data stream mining with R

A few weeks ago, the stream package was released on CRAN. It lets you do real-time analytics on data streams. This can be very useful if you are working with large datasets that are already hard to fit in RAM completely, let alone to build a statistical model on without running into RAM problems. Most of...

Read more »

Ordinal data with JAGS

March 25, 2013

Last week I had a look at the standard R routines for estimating models for ordinal data. This week, I want to have a look at JAGS for examining the same data. To be honest, most of it is taking an example (inhaler) and removing code. To my surpr...

Read more »

Podcast #6: Data Analysis MOOC Post-mortem

March 25, 2013

Jeff and I talk about Jeff's recently completed MOOC on Data Analysis.

Read more »

Revolution Newsletter: March 2013

March 25, 2013

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full March edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Get Results Fast with our Quick Start Programs: Need help getting value from predictive...

Read more »

Model assessment (and predictions for RuPaul’s Drag Race Season 5, Episode 9)

March 25, 2013

Last week, Alaska took it home with her dangerous performance, while Ivy Winters was sent home after going up against Alyssa Edwards. This is sad on many fronts. First, I love me some Ivy Winters. Second, Jinkx had revealed that she had a crush on Ivy, and the relationship that may have flourished between the…

Read more »

Simpler R help tooltips

March 25, 2013

I posted yesterday about R Help tooltips. I have started to use them, e.g. on the graph gallery. However, I’m quickly frustrated with having to write the full URL, i.e. if I want to add a link to the help …

Read more »

April 18, 2013: Third Milano R net meeting: agenda

March 25, 2013

April 18, 2013 - 18:00 - 21:00
Fiori Oscuri Bistrot & Bar (www.fiorioscuri.it)
Via Fiori Oscuri, 3 - Milano (Zona Brera)

18.00 - 18.15 Registration
18.15 - 18.30 Welcome presentation Andrea Spanò, Partner at Quantide
18.30 - 19.00 Digit recognition Machine …

Read more »

Submit a talk for the first R in Insurance conference

March 25, 2013

Registration for the first R in Insurance conference is open, and there is still time to submit a talk or lightning talk. The conference will take place at Cass Business School in London on Monday, 15 July 2013. This is the Monday following the useR! 2013 confer...

Read more »

Does It Make Sense to Segment Using Individual Estimates from a Hierarchical Bayes Choice Model?

March 24, 2013

I raise this question because we see calls for running segmentation with individual estimates from hierarchical Bayes choice models without any mention of the possible complications that might accompany such an approach.  Actually, all the calls seem to be from those using MaxDiff to analyze the data from incomplete block designs.  For example, if one were to...

Read more »

Writing a MS-Word document using R (with as little overhead as possible)

March 24, 2013

The problem: producing a Word (.docx) file of a statistical report created in R, with as little overhead as possible. The solution: combining R+knitr+rmarkdown+pander+pandoc (it is easier than it is spelled). If you get what this post is about, just …

Read more »

Using R: reading tables that need a little cleaning

March 24, 2013

Sometimes one needs to read tables that are a bit messy, so that read.table doesn’t immediately recognize the content as numerical. Maybe some weird characters are sprinkled in the table (ever been given a table with significance stars in otherwise numerical columns?). Some search and replace is needed. You can do this by hand, and

Read more »

R Help tooltips

March 24, 2013

I created a simple jQuery plugin to display some information when hovering over links to R documentation files hosted at help.r-enthusiasts.com. Below is a snapshot from highlight.r-enthusiasts.com that uses the tooltips. See also a live example here: data.frame. Using this feature …

Read more »

Tupper’s self-referential formula

March 24, 2013

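I can't remember where I first came across it, but Tupper's self-referential formula is a very interesting formula that, when graphed in the two-dimensional plane, reproduces the formula itself.

\[ \frac{1}{2} < \left\lfloor \mathrm{mod}\left( \left\lfloor \frac{y}{17} \right\rfloor 2^{-17 \lfloor x \rfloor - \mathrm{mod}(\lfloor y \rfloor, 17)}, 2 \right) \right\rfloor \]

I first thought this would be...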

Read more »

Not all proportion data are binomial outcomes

March 24, 2013

It really is trivial. Not every proportion is a frequency. There are things that have values bounded between 0 and 1 and yet are neither probabilities nor frequencies. Why do I even bother to write this? Because some kinds of…

Read more »

Rcpp 0.10.3

March 24, 2013

Rcpp 0.10.3 is on CRAN. Here is the part of the NEWS file related to this release. Changes in R code: Prevent build failures on Windows when Rcpp is installed in a library path with spaces (transform paths in the …

Read more »

Moving

March 24, 2013

This blog is moving to blog.r-enthusiasts.com. The new one is powered by WordPress and gets a subdomain of r-enthusiasts.com. See you there.

Read more »

Web Hosted R Syntax Highlighter

March 24, 2013

highlight uses a simple jQuery command to syntax-highlight R code contained in any regular <pre> element. For example, this chunk of code, from the datasets::cars help file: require(stats); require(graphics); plot(cars, xlab = "Speed (mph)", ylab = "Stopping distance (ft)", las …

Read more »

Automatic ARMA/GARCH selection in parallel

March 24, 2013

In the original ARMA/GARCH post I outlined the implementation of the garchSearch function. There have been a few requests for the code so … here it is. Quite easy to use too: After the last code line above, fit contains the best (according to the AIC statistic) model, which is the return value of garchFit.

Read more »

Estimating the Decay Rate and the Half-Life of DDT in Trout – Applying Simple Linear Regression with Logarithmic Transformation

This blog post uses a function and a script written in R that were displayed in an earlier blog post.

Introduction

This is the second of a series of blog posts about simple linear regression; the first was written recently on some conceptual nuances and subtleties about this model. In this blog post, I will use

Read more »

My Own R Function and Script for Simple Linear Regression – An Illustration with Exponential Decay of DDT in Trout

Here is the function that I wrote for doing simple linear regression, as alluded to in my blog post about simple linear regression on log-transformed data on the decay of DDT concentration in trout in Lake Michigan.  My goal was to replicate the 4 columns of the output from applying summary() to the output of lm().

Read more »
