R Tutorial Series: One-Way ANOVA with Pairwise Comparisons

January 24, 2011

When we have more than two groups in a one-way ANOVA, we typically want to statistically assess the differences between each group. Whereas a one-way omnibus ANOVA assesses whether a significant difference exists at all amongst the groups, pairwise com...
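
As a minimal sketch of the distinction drawn in this excerpt, the following R code runs an omnibus one-way ANOVA and then two common pairwise follow-ups. It assumes R's built-in `PlantGrowth` data as a stand-in for the tutorial's own data set, which is not shown here, and the choice of a Bonferroni adjustment and Tukey's HSD is illustrative rather than taken from the post.

```r
# Built-in example data: plant weights under three conditions (ctrl, trt1, trt2).
# This data set is an assumption; the tutorial's own data is not shown here.
data(PlantGrowth)

# Omnibus one-way ANOVA: is there any difference among the group means?
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)

# Pairwise t-tests with a Bonferroni adjustment for multiple comparisons
pw <- pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                      p.adjust.method = "bonferroni")
pw$p.value

# Tukey's honest significant differences, computed from the fitted model
tk <- TukeyHSD(fit)
tk$group
```

The omnibus test only says whether some difference exists; the two follow-ups report one adjusted p-value per pair of groups.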

Hello world!

January 24, 2011

I suppose that “Hello World” is the first thing that any blogger should do when starting a blog. So here I go: “HELLO WORLD!!!” The aim of this blog is to gather my thoughts and experience around learning R and hopefully to get a lot of insights from my readers. Officially this is my third attempt

Paying interest and the number e

January 24, 2011

Suppose I borrow a dollar from you and I’ll pay you 100% interest at the end of the year. How much money will you have then? $1 * (1 + 1) = $2. What happens if instead the interest is calculated as 50% twice in the year? $1 * (1.5 * 1.5) = $2.25. After …
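
The arithmetic in this excerpt can be extended with a short R sketch. It assumes, as the post's title suggests, that the pattern continues by splitting the 100% rate over more and more compounding periods; the function name `compound` and the chosen period counts are hypothetical.

```r
# Value of $1 after one year at 100% nominal interest, compounded n times.
# 'compound' is a hypothetical helper name used only for this illustration.
compound <- function(n) (1 + 1 / n)^n

compound(1)   # 2, the single 100% payment
compound(2)   # 2.25, 50% applied twice

# More frequent compounding: monthly, daily, and a very large number of periods
periods <- c(1, 2, 12, 365, 1e6)
values <- compound(periods)
data.frame(periods, values)

# Euler's number, for comparison with the last row
exp(1)
```

The values keep rising with the number of periods but level off just below `exp(1)`.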

Using RClimate To Retrieve Climate Series Data

January 23, 2011

This post shows how to use RClimate.txt to retrieve a climate time series and write a csv file in 5 lines of R script. One of my readers, Robert, wants to be able to download climate time series data and …

Using R for Introductory Statistics, Chapter 5

January 23, 2011

Any good stats book has to cover a bit of basic probability. That's the purpose of Chapter 5 of Using R for Introductory Statistics, starting with a few definitions. Random variable: a random number drawn from a population. A random variable is a variable for which we define a range of possible values and...

Blackbox trading Strategy using Rapidminer and R

January 23, 2011

This is my first post in 2011. This post has cost me a bit more than usual, but I hope it meets expectations. The aim of this tutorial is to generate an algorithm based on black box trading, with all the necessary elements for evaluation. This is the first of several posts, in order to explore the problems, features of...

CRANberries is now tweeting

January 23, 2011

The CRANberries service (which reports on new and updated CRAN packages for the R language and environment) is now tweeting about new packages. Simply follow @CRANberriesFeed to receive these messages. For the technically minded, adding this to the...

STATA: Regular expressions

January 23, 2011

A regular expression allows you to do a moderately fancy search (and replace if you want). So say you wanted to replace all the "Dennis"s in a variable with "Awesome"s, but only if they're at the end of the line. You could try: replace PBFnamevar = r...
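
For readers coming from R, the same end-of-line replacement can be sketched with base R's `sub()`. This is an R analogue, not the Stata command from the post (which is cut off in this excerpt), and the vector `x` holds made-up values.

```r
# Hypothetical values standing in for the Stata variable in the excerpt
x <- c("Dennis", "Dennis the Menace", "Hello Dennis")

# "$" anchors the match to the end of each string, so only a trailing
# "Dennis" is replaced
out <- sub("Dennis$", "Awesome", x)
out
```

Only the first and third strings change, because "Dennis" in the second one is not at the end.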

Merging Multiple Data Frames in R

January 23, 2011

Earlier I had a problem that required merging 3 years of trade data, with about 12 csv files per year. Merging all of these data sets with pairwise left joins using the R merge statement worked (especially after correcting some errors pointed out by Ha...
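
One common way to chain pairwise left joins over many data frames is to fold `merge()` over a list with `Reduce()`. The sketch below is an assumption about how this could look, not the code from the post: the three small data frames are invented stand-ins for the csv files, and the key column name `Partner` is borrowed from the related question post listed on this page.

```r
# Hypothetical stand-ins for the csv files; each shares the key column 'Partner'
df1 <- data.frame(Partner = c("A", "B", "C"), exports = c(10, 20, 30))
df2 <- data.frame(Partner = c("A", "B"), imports = c(5, 15))
df3 <- data.frame(Partner = c("A", "C"), tariff = c(0.1, 0.3))

# Left-join each data frame onto the running result, keyed on 'Partner'.
# all.x = TRUE keeps every row of the left table and fills gaps with NA.
merged <- Reduce(function(x, y) merge(x, y, by = "Partner", all.x = TRUE),
                 list(df1, df2, df3))
merged
```

With real files, the list could be built by applying `read.csv` to a vector of file names with `lapply()`, so the number of files no longer matters.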

The Art of Exploratory Data Analysis

This blog is about the art of exploratory data analysis, which is also the subject of my new book, Exploring Data in Engineering, the Sciences, and Medicine (http://www.oup.com/us/ExploringData). This art is appropriate in situations where y...

Flexibility of R Graphics

January 21, 2011

(Note: scroll all the way down to see 'old code' and 'new more flexible code'.) Recall an older post that presented overlapping density plots using R (Visualizing Agricultural Subsidies by KY County); see image below. The code I used to produce this plot m...

Posted Question for R Users

January 21, 2011

I recently undertook a project where a colleague had about 12 .csv files that they wanted to merge. Each file had a common (key) variable 'Partner' (which is trading partner) with differing columns (variables) except for the common key variable. Actual...

Hard drive occupation prediction with R – part 2 – Getting the probability distribution

In the first article, we saw a quick-and-dirty method to predict disk space exhaustion when the usage pattern is rigorously linear. We did that by importing our data into R and fitting a linear regression. In this article we will see the problems with that method, and deploy a more robust solution. Besides robustness, we will also see how we can generate...
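
The quick-and-dirty linear method described here can be sketched as follows. The usage figures, the 500 GB capacity, and all variable names are hypothetical; this is only an illustration of fitting a line and solving for the day it reaches capacity, not the article's own code.

```r
# Hypothetical usage log: day number and gigabytes used on a 500 GB disk
usage <- data.frame(
  day = 1:10,
  used_gb = c(100, 104, 109, 113, 118, 121, 126, 130, 135, 139)
)
capacity_gb <- 500

# Fit used space as a linear function of time
fit <- lm(used_gb ~ day, data = usage)
b <- coef(fit)

# Day on which the fitted line reaches full capacity
day_full <- unname((capacity_gb - b["(Intercept)"]) / b["day"])
day_full
```

This gives a single point estimate with no measure of uncertainty, and it only makes sense while usage really does grow linearly.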

Volcanic Solar Dimming, ENSO and Temperature Anomalies

January 21, 2011

In previous posts I have shown plots of global temperature anomaly, volcano and Nino34 trends (here, here). In this post, I want to further explore the role of volcanic eruptions and Nino34 phases (El Nino, La Nina) on …

Learning R through baseball: sab-R-metrics

January 21, 2011

The words "statistics" and "baseball" are often found near each other, but there's a lot more to statistics than dividing the number of hits by the number of swings to get a batting average. And there's a lot more to sabermetrics -- the statistical analysis of baseball -- than averages, too. Many baseball fans are also stats geeks (and...

Embedding a time series with time delay in R

January 21, 2011

I’ve recently been looking at Martin Trauth’s book MATLAB® Recipes for Earth Sciences to try to understand what some of my palaeoceanography colleagues are doing with their data analyses (lots of frequency domain time series techniques and a preponderance of …
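
As a rough illustration of what the title refers to, a time-delay embedding turns a series into a matrix whose columns are copies of the series shifted by multiples of a delay. The helper `delay_embed` below is a hypothetical name and a minimal sketch, not the function developed in the post; base R's `embed()` covers the delay-of-one case, with its columns in reverse lag order.

```r
# Delay-embed series x in m dimensions with a delay of tau samples.
# 'delay_embed' is a hypothetical helper written for this illustration.
delay_embed <- function(x, m, tau) {
  n <- length(x) - (m - 1) * tau   # number of complete embedded points
  stopifnot(n > 0)
  # Column j + 1 holds the series shifted forward by j * tau samples
  do.call(cbind, lapply(0:(m - 1), function(j) x[seq_len(n) + j * tau]))
}

# Example series: two cycles of a sine wave sampled at 50 points
x <- sin(seq(0, 4 * pi, length.out = 50))
emb <- delay_embed(x, m = 3, tau = 2)
dim(emb)
head(emb, 3)
```

Each row of `emb` is one point in the reconstructed three-dimensional space.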

Relationship Between SAT & College Retention

January 21, 2011

Here is a quick analysis of the relationship between SAT score and student retention. The data is from the Integrated Postsecondary Education Data System (IPEDS) and was analyzed using R. This was a quick analysis and I would be careful about making any strong conclusions. The source for running this analysis along with some additional graphics that

Interesting volatility measurement, part 2

January 21, 2011

A few weeks ago I mentioned an interesting volatility prediction. It is based on two periods of historical volatility (standard deviation). The remaining question was: does it really work? I could not give the answer, because I didn’t have VIX futures data at that time. Later on, I was contacted by Brian

Model for nothing – and the bootstrap for free

January 21, 2011

Reconstructing phylogenies is an interesting task, sadly one that often requires navigating between a multitude of software packages. To add an unnecessary layer of complexity to the whole thing, most of these programs speak different languages, and require the user to do endless conversions from fasta to phylip to nexus to whatever new format they

Disable auto-update from R (Windows)

January 21, 2011

There are two major threats to complex MCMC estimations: wrong energy settings (hibernate after 2 hours of inactivity) and automatic updates (install updates at 3 a.m.). I thought about the latter threat. At times, you may hand some R code to other co-workers,...

Ultraedit to R

January 21, 2011

My favorite text editor on Windows is Ultraedit, but it does not have a nice interface to R in the same vein as Emacs/ESS, Tinn-R, or Eclipse. (I have never used Eclipse.) Ultraedit is powerful enough to submit whole R programs...

Stop your figures jumping about in odfWeave

January 21, 2011

If you use odfWeave to produce figures, you will probably find they jump about when scrolling through the document, because the figures and figure frames are anchored in OpenOffice to the paragraph and not “as character”. The only way to fix this in a finished document is to right-click on the figures and select

How do you explain reproducible research to clients?

January 21, 2011

Most of the statistics work I do now is reproducible research – this can offer a big advantage for clients but of course that doesn’t necessarily mean they realise it … Below is a text we have been pasting in at the bottom of the source documents (and which therefore appears in the PDFs) to
