# Monthly Archives: February 2010

## How to make a mosaic plot in R

February 16, 2010

Mosaic plots (aka treemaps) are a great way to visualize hierarchical data. A collection of rectangles represents all the elements to be visualized (customers, news items, blog posts), with the size and color of each rectangle encoding its attributes. But what makes this chart unique is the arrangement of the elements: where there is hierarchy (customer segments, news topics, post...

## You can Hadoop it! It’s elastic! Boogie woogie woog-ie!

February 16, 2010

I just came back from the future and let me be the first to tell you this: Learn some Chinese. And more than just cào nǐ niáng (肏你娘) which your friend in grad school told you means “Live happy with many blessings”. Trust me, I’ve been hanging with Madam Wu and she told me

## For fun: Correlation of US State with the number of clicks on online banners

February 16, 2010

“Chitika research” published a fun small dataset today (you can download it from here) in a post titled “The Educated are Harder to Advertise To”. In this post I had three goals in mind: suggesting another plot instead of the one used in the original post; emphasizing the “Correlation does not imply causation” rule; and inviting other R lovers (like myself) to find fun...

## A Case Study in Optimising Code in R

February 16, 2010

This post presents an experience I had optimising code for a data analysis task in R. I'm not an expert in programming or code optimisation. However, I thought my experience might make an interesting case study for others at a simila...

## Sugar price seasonality

February 16, 2010

Recently, Orion Securities issued a “BUY” recommendation for Cugar ETF. Because I neither follow such recommendations nor am a big fan of TA (I have to admit that I was…), I decided to check sugar price seasonality. Voilà: the mean monthly returns are presented in the graph. February, April and May tend to be negative

## R Web Application – “Hello World” using RApache (~7min video tutorial)

February 16, 2010

I just noticed a Google Buzz post from Jeroen Ooms, with a YouTube video titled “RApache Hello World + POST arguments + catching errors.” In this ~7 min video tutorial, Jeroen shares with us: how to write “Hello World” in a website using RApache; how to extract arguments from a form submitted by the website visitor (and then insert them into an “rnorm” function...

February 15, 2010

Google Reader is a fantastic way to keep track of new papers that are appearing in many different journals, and also to follow some of the interesting research blogs (and blogs on other topics) that are out there. Google Reader checks websites for you and lets you know of any new material that appears. Instead of

## Genetic Algorithm Systematic Trading Development — Part 1

February 15, 2010

I want to start with a brief introduction to what I consider one of the most powerful learning methodologies to come out of Artificial Intelligence in the last several decades: the Genetic Algorithm. Although it was originally developed to model evol...

## Two-way Analysis of Variance (ANOVA)

February 15, 2010

The analysis of variance (ANOVA) model can be extended from making a comparison between multiple groups to take into account additional factors in an experiment. The simplest extension is from one-way to two-way ANOVA where a second factor is included in the model as well as a potential interaction between the two factors. As an example

## R vs. Matlab – a small example

February 15, 2010

At the institute where I work, quite a lot of people prefer using Matlab and only a few of them know about R. Today one of my colleagues, who is also an eager user of Matlab, ran into the following problem: He had a vector in hand which consisted of elements. He wanted to