Monthly Archives: June 2011

Stratigraphic diagrams using analogue

June 8, 2011

One of the routine tasks palaeoecologists perform is plotting data on species composition or geochemical proxies along, say, a sediment core or stratigraphic sequence. These diagrams are the canonical way of displaying stratigraphic data in this field. An example of a stratigraphic diagram is shown below.

Read more »

Generating unique random IDs

June 7, 2011

Recently I was asked to help create random IDs for someone. At first I thought, ‘Ah yup, 1:x (1,2,3, …,x), job done’. Then I thought that there had to be an R function/package to create better-looking IDs, but I didn’t find one; if there is one, please let me know. In the meantime

Read more »

Drafting the Documentation for RTextTools

In preparation for The 4th Annual Conference of the Comparative Policy Agendas Project in Catania, Sicily, our development team has been busy drafting the documentation for RTextTools. In addition to standard documentation of functions, we want to provide quick-start guides, sample datasets, example scripts, and

Read more »

How to fit power laws

June 7, 2011

A new paper out in Ecology by Xiao and colleagues (in press, here) compares log-transformation with non-linear regression for analyzing power laws. They suggest that the error distribution should determine which method performs better. When you...

Read more »

A Quantstrat to Build on Part 2

June 7, 2011

As I explore additional functionality of quantstrat and make changes to my original post A Quantstrat to Build On, I will write multiple posts, and hopefully, the finished product will not be so overwhelming to comprehend.  Also, it might highligh...

Read more »

The ‘Big Analytics’ Revolution Starts with R: Webinar June 14

June 7, 2011

On Tuesday next week I'll be teaming up with Revolution Analytics' Mike Minelli to give a 30-minute webinar to introduce executives to R, Big Data, and applications of advanced analytics. If there's someone in your company who needs to know about the impact of R on getting value out of data, they can register here. Here's the agenda: The...

Read more »

R books are now showing up in the dollar bin. That’s a good…

June 7, 2011

R books are now showing up in the dollar bin. That’s a good sign!

Read more »

K-Means Clustering on Big Data

June 7, 2011

In this post Joseph Rickert demonstrates how to build a classification model on a large data set with the RevoScaleR package. A script file for use with Revolution R Enterprise to recreate the analysis below is at the end of the post, and can also be downloaded here -- ed. The k-means (Lloyd) algorithm, an intuitive way to explore...

Read more »

The pros and cons of robust data characterizations

Over the years, I have looked at a lot of data contaminated with outliers, the subject of Chapter 7 of Exploring Data in Engineering, the Sciences, and Medicine.  That chapter adopts the definition of an outlier presented by Barnett and Lewis in their book Outliers in Statistical Data 2nd Edition

Read more »

Fittestmodel.com: A user-friendly way to conduct empirical research together

June 6, 2011

(A guest post by Camiel de Koning) When trying to replicate, verify or extend the empirical research of others, a researcher generally encounters many time-consuming barriers, and there are often many prerequisites. Fittestmodel aims to overcome many of these problems by presenting a web application that allows users to: use R without having to install it; quickly incorporate...

Read more »