R: Speeding things up

June 9, 2011

R is many things, but it's not exactly speedy like a Patas Monkey. In fact, while it is much faster than many other solutions, R is notably slower than Stata (even inspiring talks that it should be rewritten from scratch!). Fortunately, Radford Neal ha...


New patches to speed up R 2.13.0

June 9, 2011

I have now released a new collection of 30 patches to speed up R version 2.13.0. You can get them here. Assessing how much these patches speed up R is difficult. First of all, the speedup varies tremendously with the type of program. It also varies quite a bit with the machine and compiler used


The R-Files: Jeroen Ooms

June 9, 2011

"The R-Files" is an occasional series from Revolution Analytics, where we profile prominent members of the R Community.

Name: Jeroen Ooms
Background: Ph.D. Candidate, Statistics, UCLA
Nationality: Netherlands
Years Using R: 3 1/2
Known for: Developing web applications for popular R packages including ggplot2, lme4, stockplot and irttool

Jeroen Ooms is a statistical consultant and R enthusiast currently pursuing...


Rotating disks

June 9, 2011

My neighbour is a half-retired entrepreneur who still runs his electric engine company. A few weekends ago, he came to me with the following physics question related to one of those engines: given a primary disk rotating at angular speed ω0 and a secondary disk located on the first one with a centre


R and the Geostatistical Software Library data format

June 9, 2011

The *.gslib file format originates from the Geostatistical Software Library, but is also used in the successor to that software, the Stanford Geostatistical Modelling Software (SGeMS). Since not all geostatistical algorithms are implemen...
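As a rough illustration of reading such a file into R, here is a minimal sketch. It assumes the plain GSLIB text layout (a title line, a line with the number of variables, one variable name per line, then whitespace-separated data rows); files written by SGeMS may deviate from this. The function name `read_gslib` and the demo file contents are made up for the example and are not taken from the post.

```r
# Hypothetical helper: read a GSLIB-style text file into a data frame.
# Assumed layout: line 1 = title, line 2 = number of variables,
# next n lines = variable names, remaining lines = data rows.
read_gslib <- function(path) {
  lines <- readLines(path)
  n_vars <- as.integer(strsplit(trimws(lines[2]), "\\s+")[[1]][1])
  var_names <- trimws(lines[3:(2 + n_vars)])
  header_end <- 2 + n_vars
  read.table(text = lines[-seq_len(header_end)], col.names = var_names)
}

# Write a tiny demo file (invented values) and read it back.
tmp <- tempfile(fileext = ".gslib")
writeLines(c("demo data", "3", "x", "y", "porosity",
             "0.5 0.5 0.21",
             "1.5 0.5 0.18",
             "2.5 0.5 0.25"), tmp)
samples <- read_gslib(tmp)
print(samples)
```

Variable names that are not valid R names would be adjusted by `read.table`, so real files may need extra handling.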


gridExtra – Multiple plots from ggplot2

June 8, 2011

Thanks to this great post http://www.imachordata.com/?p=730 we can now put multiple plots on a display with ggplot2. This provides somewhat similar functionality to ‘par(mfrow=c(x,y))’ which would allow multiple plots with the base plot function. gridExtra doesn’t have quite the same level of options as ‘par’, but the syntax is simple. grid.arrange(graph1, graph2, ncol=2). Simple. ‘grid.table’
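For anyone who wants to try the call above, here is a small self-contained sketch. It assumes the ggplot2 and gridExtra packages are installed; the two plots and the use of the built-in `mtcars` data are placeholders chosen for illustration, not the plots from the post.

```r
# Assumes ggplot2 and gridExtra are installed.
library(ggplot2)
library(gridExtra)

# Two placeholder plots built from the built-in mtcars data set.
graph1 <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
graph2 <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()

# arrangeGrob() builds the side-by-side layout without drawing it,
# which is handy if the layout should be stored or saved later.
combined <- arrangeGrob(graph1, graph2, ncol = 2)

# grid.arrange() takes the same arguments and draws on the current device.
grid.arrange(graph1, graph2, ncol = 2)
```

Changing `ncol = 2` to `nrow = 2` should stack the plots vertically instead.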


vRoom vRoom : Speeding up R with C

June 8, 2011

Many times you don't want to trouble friends for help with menial tasks like moving furniture. But sometimes you need to step out and ask. Your friends are always happy to help, and after the heavy lifting is done you see how easy it can be. R likes to...


Data Mining in R online course taught by Luis Torgo at statistics.com

June 8, 2011

An interesting PR piece I got from Janet Dobbins: ————— Luis Torgo is teaching an online course, “Data Mining in R: Learning with Case Studies” at statistics.com. The course runs June 17 – July 15. Brief Description: The main goal of this course is to teach users how to perform data mining tasks using R. Instructor(s): Dr. Luis Torgo...


Making Simple Packages in R on Windows

June 8, 2011

There are any number of short tutorials on making add-on R packages on your Windows machine. This is yet another version of that process. I’ve explained what I did in 10 easy steps on the pages, but I’ll give a brief overview here. In the first step I spent some time updating my R


A Quantstrat to Build on Part 4

June 8, 2011

When we build a system, we are almost always trying to beat buy and hold by some metric or metrics.  I have not found a demo to compare a quantstrat system with a generic buy and hold system.  Here is the way I accomplish a basic comparison w...


A Quantstrat to Build on Part 3

June 8, 2011

This just does the same thing as A Quantstrat to Build on Part 2, but I use sigCrossover and sigComparison instead of sigThreshold as my signal.  Maybe it will help some struggling to understand implementation of the different signal types.  ...


Real-time Analytics for Capital Markets with Revolution R

June 8, 2011

In the 2011 edition of the Sybase Capital Markets Guide, Revolution Analytics CTO David Champagne talks about the need for up-to-date analytics in Finance, and how you can integrate Revolution R with quality real-time data sources. Here's an excerpt: R represents a radically different approach to the challenges posed by analyzing increasingly large and complex data sets. Because it...


David Banks on Reproducible Research

June 8, 2011

Just got an email linking to Reproducible Research: A Range of Response, in the new journal Statistics, Politics, and Policy 2(1) by David Banks, who is also the journal's editor. Interestingly, the commentary doesn't mention the journal's policy (if one exists) on the reproducibility of research submitted there. Banks' writing is easy to read, though


Stratigraphic diagrams using analogue

June 8, 2011

One of the routine tasks palaeoecologists do is plot data on species composition or geochemical proxies, say, along a sediment core or stratigraphic sequence. These diagrams are the canonical way of displaying stratigraphic data in this field. An example of a stratigraphic diagram is shown below.


Generating unique random IDs

June 7, 2011

Recently I was asked to help create random IDs for someone. At first I thought, ‘Ah yup, 1:x (1,2,3, …,x), job done’. Then I thought that there had to be an R function/package to create better looking IDs, but I didn’t find one; if there is one, please let me know. In the meantime
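One possible way to do this in base R is sketched below. This is not the code from the post: the function name `make_ids`, the default ID length of 6 and the alphabet of capital letters and digits are all assumptions made for the example.

```r
# Hypothetical helper: draw n distinct random IDs of fixed length.
make_ids <- function(n, id_length = 6, chars = c(LETTERS, 0:9)) {
  # Guard against asking for more IDs than the alphabet can provide.
  stopifnot(n <= length(chars)^id_length)
  ids <- character(0)
  while (length(ids) < n) {
    needed <- n - length(ids)
    candidates <- replicate(
      needed,
      paste(sample(chars, id_length, replace = TRUE), collapse = "")
    )
    # Drop any collisions and top up on the next pass of the loop.
    ids <- unique(c(ids, candidates))
  }
  ids
}

set.seed(42)  # only to make the example repeatable
ids <- make_ids(100)
head(ids)
```

Collisions are rare with a 36-character alphabet and six positions, so the loop will usually finish in a single pass.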


Drafting the Documentation for RTextTools

In preparation for The 4th Annual Conference of the Comparative Policy Agendas Project in Catania, Sicily, our development team has been busy drafting the documentation for RTextTools. In addition to standard documentation of functions, we want to provide quick-start guides, sample datasets, example scripts, and


How to fit power laws

June 7, 2011

A new paper out in Ecology by Xiao and colleagues (in press, here) compares the use of log-transformation to non-linear regression for analyzing power-laws. They suggest that the error distribution should determine which method performs better. When you...
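To make the two approaches concrete, here is a minimal sketch on simulated data. The data, the parameter names `a_true` and `b_true`, and the choice of multiplicative lognormal error are assumptions for illustration; this is not the analysis from the paper.

```r
set.seed(1)

# Simulate y = a * x^b with multiplicative (lognormal) error.
a_true <- 2
b_true <- 0.75
x <- seq(1, 50, length.out = 200)
y <- a_true * x^b_true * exp(rnorm(length(x), sd = 0.2))

# Approach 1: linear regression on the log-log scale.
fit_log <- lm(log(y) ~ log(x))

# Approach 2: non-linear least squares on the original scale,
# started from the log-log estimates.
start_values <- list(a = exp(coef(fit_log)[[1]]), b = coef(fit_log)[[2]])
fit_nls <- nls(y ~ a * x^b, start = start_values)

b_log <- coef(fit_log)[[2]]
b_nls <- coef(fit_nls)[["b"]]
c(log_log = b_log, nls = b_nls)
```

Swapping the error term for additive noise, for example `a_true * x^b_true + rnorm(length(x))`, gives a rough way to see how the comparison shifts with the error distribution.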


A Quantstrat to Build on Part 2

June 7, 2011

As I explore additional functionality of quantstrat and make changes to my original post A Quantstrat to Build On, I will write multiple posts, and hopefully, the finished product will not be so overwhelming to comprehend.  Also, it might highligh...


The ‘Big Analytics’ Revolution Starts with R: Webinar June 14

June 7, 2011

On Tuesday next week I'll be teaming up with Revolution Analytics' Mike Minelli to give a 30-minute webinar to introduce executives to R, Big Data, and applications of advanced analytics. If there's someone in your company who needs to know about the impact of R on getting value out of data, they can register here. Here's the agenda: The...


R books are now showing up in the dollar bin. That’s a good…

June 7, 2011

R books are now showing up in the dollar bin. That’s a good sign!


K-Means Clustering on Big Data

June 7, 2011

In this post Joseph Rickert demonstrates how to build a classification model on a large data set with the RevoScaleR package. A script file for use with Revolution R Enterprise to recreate the analysis below is at the end of the post, and can also be downloaded here -- ed. The k-means (Lloyd) algorithm, an intuitive way to explore...
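For readers without Revolution R Enterprise, the same Lloyd algorithm can be tried on a small scale with base R's `kmeans()`. The sketch below uses simulated points and is not the RevoScaleR script from the post; the two groups and their means are invented for the example.

```r
set.seed(123)

# Two well-separated groups of 50 two-dimensional points (simulated).
group_a <- matrix(rnorm(100, mean = 0), ncol = 2)
group_b <- matrix(rnorm(100, mean = 10), ncol = 2)
pts <- rbind(group_a, group_b)

# Base R k-means using the Lloyd algorithm, with a few random restarts.
fit <- kmeans(pts, centers = 2, algorithm = "Lloyd",
              nstart = 5, iter.max = 50)

fit$size     # number of points assigned to each cluster
fit$centers  # estimated cluster centres
```

Base `kmeans()` keeps all the data in memory, which is exactly the limitation the post's large-data approach is meant to get around.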


The pros and cons of robust data characterizations

Over the years, I have looked at a lot of data contaminated with outliers, the subject of Chapter 7 of Exploring Data in Engineering, the Sciences, and Medicine.  That chapter adopts the definition of an outlier presented by Barnett and Lewis in their book Outliers in Statistical Data 2nd Edition
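A small illustration of why robust characterizations matter with contaminated data: the sketch below compares the mean and standard deviation with the median and MAD on a made-up sample, before and after adding a single outlier. The numbers are invented and are not from the book.

```r
# Invented measurements clustered around 10, plus one gross outlier.
clean <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3)
contaminated <- c(clean, 100)

# Classical (mean, sd) and robust (median, MAD) summaries side by side.
summarise <- function(x) {
  c(mean = mean(x), sd = sd(x), median = median(x), mad = mad(x))
}

stats <- rbind(clean = summarise(clean),
               contaminated = summarise(contaminated))
round(stats, 2)
```

The single outlier moves the mean and inflates the standard deviation, while the median and MAD barely change.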


Fittestmodel.com: A user-friendly way to conduct empirical research together

June 6, 2011

(A guest post by Camiel de Koning) ————– When trying to replicate, verify or extend empirical research of others, a researcher generally encounters many time-consuming barriers and there are often many prerequisites. Fittestmodel aims to overcome many of these problems by presenting a web application that allows users to: use R without having to install it; quickly incorporate...


R for Data Mining

June 6, 2011

Statistics and data mining often get bundled together, but (in my opinion), they're generally different practices with different goals. As a language designed for statistics, much of R's core functionality is focused on exploring and understanding data: model design, inference, and visualization. But when your goal is simply to get the best predictions from a big data set (without...


In case you missed it: May Roundup

June 6, 2011

In case you missed them, here are some articles from May of particular interest to R users.

A review of "R Cookbook", a new how-to book for R programmers.
A detailed example of using the RevoScaleR package to analyze a large airline data set.
A new guide for R beginners, "How to Learn R", provides links to R resources,...


Shared Ecological Modelling References

June 6, 2011

05.06.2011 Today I started to create a list of books and articles about ecological modelling. In this list you will not only find general books about modelling but also books about spatial analysis, image analysis and other (in my opinion) important techniques useful in the context of ecological modelling. For the collection I use “Zotero”


10 R One Liners to Impress Your Friends

June 5, 2011

Following the trend of one liners for various languages (Haskell, Scala, Python), here's some examples in R

Multiply Each Item in a List by 2

# lists
lapply(list(1:4),function(n){n*2})
# otherwise
(1:4)*2

Sum a List of Numbers

# lists
lapply(list(1:4),sum)
# oth...


Conway’s Game of Life in R with ggplot2 and animation

June 5, 2011

In undergrad I had a computer science professor who piqued my interest in applied mathematics, beginning with Conway’s Game of Life. At first, the Game of Life (not the board game) appears to be quite simple — perhaps, too simple — but it has been widely explored and is useful for modeling systems over time. It has been...
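To show how simple the rules are, here is a minimal base R sketch of a single generation step. It is not the ggplot2/animation code from the post; the function name `life_step` and the choice to treat cells outside the board as dead (no wrap-around) are assumptions for the example.

```r
# Hypothetical helper: advance a 0/1 matrix by one Game of Life generation.
# Cells beyond the edge of the board are treated as dead.
life_step <- function(board) {
  n_row <- nrow(board)
  n_col <- ncol(board)

  # Pad the board with a border of dead cells.
  padded <- matrix(0L, n_row + 2, n_col + 2)
  padded[2:(n_row + 1), 2:(n_col + 1)] <- board

  # Count the eight neighbours of every cell by shifting the padded board.
  neighbours <- matrix(0L, n_row, n_col)
  for (dr in -1:1) {
    for (dc in -1:1) {
      if (dr == 0 && dc == 0) next
      neighbours <- neighbours +
        padded[(2:(n_row + 1)) + dr, (2:(n_col + 1)) + dc, drop = FALSE]
    }
  }

  # A live cell survives with 2 or 3 neighbours; a dead cell is born with 3.
  alive <- (board == 1 & (neighbours == 2 | neighbours == 3)) |
    (board == 0 & neighbours == 3)
  matrix(as.integer(alive), n_row, n_col)
}

# A "blinker": three cells in a row that flip between horizontal and vertical.
board <- matrix(0L, 5, 5)
board[3, 2:4] <- 1L
next_gen <- life_step(board)
next_gen
```

Applying `life_step()` repeatedly and plotting each matrix is one way to build up the frames for an animation.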
