FFT (Fast Fourier Transform) of time series — promises and pitfalls towards trading

February 24, 2010

Fig 1. FFT-transformed time series (EBAY) reconstructed with the first three and the first twenty harmonics, respectively.

I see quite a few traders interested in advanced signal processing techniques. It is often instructive to see why they may or may not be useful....
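As a rough illustration of what "reconstructed with the first k harmonics" can mean, here is a minimal base-R sketch. It is not the original post's code: the helper name `fft_lowpass` is hypothetical, and the input is a synthetic series standing in for the EBAY prices, which are not available here. The idea is to keep the mean term and the k lowest frequencies of the transform, zero the rest, and invert.

```r
# Keep the mean (harmonic 0) and the first k harmonics of a real series,
# zero every other coefficient, and invert the transform.
# fft_lowpass is a hypothetical helper name, not taken from the original post.
fft_lowpass <- function(x, k) {
  n <- length(x)
  stopifnot(k >= 0, k < n / 2)
  coefs <- fft(x)
  keep <- rep(FALSE, n)
  keep[1] <- TRUE                       # index 1 holds the mean (DC) term
  if (k > 0) {
    keep[2:(k + 1)] <- TRUE             # positive frequencies 1..k
    keep[(n - k + 1):n] <- TRUE         # their complex-conjugate partners
  }
  coefs[!keep] <- 0
  Re(fft(coefs, inverse = TRUE)) / n    # R's inverse fft is unnormalized
}

# Synthetic stand-in for a price series (assumption: NOT real EBAY data):
# a level of 30, a slow cycle at frequency 3 and a faster one at frequency 20.
n <- 256
i <- seq_len(n) - 1
x <- 30 + 2 * sin(2 * pi * 3 * i / n) + 0.5 * cos(2 * pi * 20 * i / n)

smooth3  <- fft_lowpass(x, 3)    # drops the frequency-20 component
smooth20 <- fft_lowpass(x, 20)   # retains both cycles
```

On this toy input the three-harmonic version recovers only the slow cycle, while the twenty-harmonic version reproduces the series. Real prices are not periodic, so a close in-sample fit of this kind says little about out-of-sample behaviour.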

ggplot2: Plotting Dates, Hours and Minutes

February 24, 2010

Plotting time series with dates on the x-axis and times on the y-axis can be a bit tricky in ggplot2. However, with a little trick this problem can be easily overcome. Let’s assume that I wanted to plot when the sun rises in London in 2010. The sunriset function in the maptools package calculates the sunrise times using algorithms provided

PoRtable…

February 24, 2010

Jobless as I might be, I do have some clients for data analysis. I try not to visit them in their office coz then things get really slow and time-consuming. When I can’t escape this, the worst thing is tuning data and software with the client. So, I have a USB with portable versions of my

Object types in R: The fundamentals

February 24, 2010

If you're a self-taught R programmer, you've probably grappled with the different kinds of objects you can use in the language. When should you use a list instead of a vector? What's the difference between a factor and character vector? These questions are easier to answer when you have some of the basics of R's object types down pat,...

SoilWeb iPhone App: Beta-Testers?

February 23, 2010

Screenshots: iPhone app rev 0.2 (icon); iPhone app rev 0.2 (in Fresno). More Updates: The application is now...

Reminder: useR! 2010 abstracts due Monday

February 23, 2010

Don't forget, if you're planning to attend the R user conference useR! 2010 and are going to present a talk (and if not, why not?), abstracts are due for submission this coming Monday, March 1. That's also the deadline for early-bird registrations, so if you haven't registered yet, now is the time. useR! 2010: The R User Conference

Numerical Integration/Differentiation in R: FTIR Spectra

February 23, 2010

  Stumbled upon an excellent example of how to perform numerical integration in R. Below is an example of piece-wise linear and spline fits to FTIR data, and the resulting computed area under the curve. With a high density of points, it seems like the linear approximation is most efficient and sufficiently accurate. With very large...
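The comparison described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code from the linked example: `trapz` and `spline_area` are hypothetical helper names, and the data are a synthetic Gaussian peak rather than real FTIR measurements.

```r
# Trapezoid rule: area under the piecewise-linear interpolant of (x, y).
# trapz and spline_area are hypothetical helper names used for illustration.
trapz <- function(x, y) {
  stopifnot(length(x) == length(y), !is.unsorted(x))
  sum(diff(x) * (y[-length(y)] + y[-1]) / 2)
}

# Area under a natural cubic spline through the same points,
# integrated numerically with stats::integrate.
spline_area <- function(x, y) {
  f <- splinefun(x, y, method = "natural")
  integrate(f, min(x), max(x), subdivisions = 500L)$value
}

# Synthetic absorbance-like peak (assumption: a Gaussian bump, NOT FTIR data).
x <- seq(0, 10, length.out = 201)
y <- exp(-(x - 5)^2 / 2)

area_linear <- trapz(x, y)        # piece-wise linear approximation
area_spline <- spline_area(x, y)  # spline approximation
```

With points this dense the two areas should agree closely, which is consistent with the observation that the linear approximation is often sufficient; with sparse or noisy points the two can differ more.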

Slides from “R Productivity Environment” webinar

February 23, 2010

Thanks to everyone who attended this morning's live webinar, 7 Ways to Increase your R Productivity; it was a great turnout. I really appreciate all the feedback and questions; it seems like a lot of people are interested in a code editing and debugging environment for R. If you missed the webinar and want to learn about REvolution R Enterprise...

Happy Birthday GGD! The 10 Most Popular Posts Since GGD’s Launch

February 23, 2010

The first post on Getting Genetics Done was one year ago today. To celebrate, here are the top 10 most viewed posts since GGD launched last year. Incidentally, nine of the ten are tutorials on how to do something in R. Thanks to all the readers and all...

Getting Started with Sweave: R, LaTeX, Eclipse, StatET, & TeXlipse

February 23, 2010

Being able to press a single button that runs all your statistical analyses and integrates the output into your final report is a beautiful thing. If you have not already heard, this is what Sweave can do for you. However, getting your computer to run ...

Mexico’s Economy

February 22, 2010

Yesterday the INEGI released the GDP figures for 2009, and since it was an annus horribilis for Mexico, I thought I'd put up a couple of charts. Looking through the Banco de Información Económica I found two series of historical seasonally adjusted GDP data available: GDP in 1993 pesos going from 1980 to 2007, and GDP in 2003 pesos going...

Time Series Calendar Heat Maps Using R

February 22, 2010

I came across an interesting blog that showcased Charting time series as calendar heat maps in R. It is based upon a great algorithm created by Paul Bleicher, CMO of Humedica. I'll let you link to the other blog to see more details on the background ...

A quicky..

February 22, 2010

If you’re interested in principal components (and you should be), then take a good look at this. The linked post will take you by the hand and do everything from scratch. If you’re not in the mood, then the following R functions will help you. An example. # Generates sample matrix of five discrete clusters that have

Sudoku via simulated annealing

February 22, 2010

The Sudoku puzzle in this Sunday edition of Le Monde was horrendously difficult, so after spending one hour with only 4 entries filled, I decided to feed it to the simulated annealing R program I wrote while visiting SAMSI last year. The R program reached the exact (and only) solution in about 6000 iterations, as

Siegel-Tukey: a Non-parametric test for equality in variability (R code)

February 22, 2010

Daniel Malter just shared on the R mailing list (link to the thread) his code for performing the Siegel-Tukey (non-parametric) test for equality in variability. Excited about the find, I contacted Daniel asking if I could republish his code here, and he kindly replied “yes”. From here on I copy his note in full. P.S.: (The R function can be downloaded from...

Speeding up R code: A case study

February 22, 2010

On his Psychology and Statistics blog, Jeromy Anglim tells how he was analyzing some data from a skill acquisition experiment. Needing to run a custom R function across 1.3 million data points, Jeromy estimated it would take several hours for the computation to complete. So, Jeromy set out to optimise the code. First, he used the Rprof function, which...

ggplot2 (qplot) text size

February 22, 2010

I'm trying to learn qplot in ggplot2, and I'm having a difficult time adjusting text sizes. Well, difficult doesn't describe it - I can't do it at all. The manual tells me I can use cex just like in plot, but it's not working...

Time-Space Cloud with R

February 22, 2010

Here comes another option to analyze a TimeSpace-Track with R. A lattice cloud plots every recorded trackpoint into a 3d-time-space-cube. As the data (planar point pattern) is marked with the daytime, clusters of everyday routines become visible. Here is the direct comparison between a function of density and the time-space-cloud. Code example: cloud(time_hours ~ PPP_selection$x *

Post hoc analysis for Friedman’s Test (R code)

February 22, 2010

My goal in this post is to give an overview of Friedman’s Test and then offer R code to perform post hoc analysis on Friedman’s Test results. (The R function can be downloaded from here) Preface: What is Friedman’s Test? Friedman’s test is a non-parametric randomized block analysis of variance, which is to say it is a non-parametric version of...

The R type system

February 21, 2010

R is a weird beast. Through its ancestor, the S language, it claims a proud heritage reaching back to Bell Labs in the 1970s, when S was created as an interactive wrapper around a set of statistical and numerical subroutines. As a programming language,...

The truncated Poisson

February 21, 2010

A common model for count data is the Poisson. There are cases, however, where we only record positive counts, i.e. zero counts are truncated. This is the truncated Poisson model. To study this model we only need the total count and the sample size. This comes from the sufficient statistic principle as the
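To make the point about needing only the total count and the sample size concrete, here is a minimal base-R sketch of the maximum likelihood estimate for a zero-truncated Poisson. It is an illustration, not the original post's code; the function name `ztp_mle` and the example numbers are assumptions. The zero-truncated Poisson has mean λ / (1 - exp(-λ)), and the MLE is the λ at which that mean equals the sample mean.

```r
# MLE of lambda for a zero-truncated Poisson, using only the total count
# and the sample size (the sufficient statistic is the sum of the counts).
# ztp_mle is a hypothetical helper name used for illustration.
ztp_mle <- function(total, n) {
  xbar <- total / n
  stopifnot(xbar > 1)   # every count is >= 1; xbar == 1 would push lambda to 0
  # Mean of the zero-truncated Poisson minus the sample mean;
  # -expm1(-lambda) is a numerically stable form of 1 - exp(-lambda).
  score <- function(lambda) lambda / (-expm1(-lambda)) - xbar
  # The root lies between 0 and xbar, because the truncated mean exceeds lambda.
  uniroot(score, lower = 1e-8, upper = xbar, tol = 1e-10)$root
}

# Made-up example: 25 positive counts summing to 60 (sample mean 2.4).
lambda_hat <- ztp_mle(total = 60, n = 25)
```

The estimate is always below the sample mean, since dropping the zeros inflates the observed average relative to the underlying Poisson rate.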

Visual Interpretation of Principal Coordinates (of) Neighbor Matrices (PCNM)

February 21, 2010

Principal Coordinates (of) Neighbor Matrices (PCNM) is an interesting algorithm, developed by P. Borcard and P. Legendre at the University of Montreal, for the multi-scale analysis of spatial structure. This algorithm is typically applied to a distance matrix, computed from the coordinates where some environmental data were collected. The resulting "PCNM vectors" are commonly used to describe...

Uh!

February 20, 2010

Didn't know this... a data 0 2 4 7+ 25 34 12 5 It's becoming clear that I have learned R in the most unstructured way...I always do it in two stages :ashamed:

Design of Experiments – Block Designs

February 20, 2010

In many experiments where the investigator is comparing a set of treatments, there is the possibility of one or more sources of variability in the experimental measurements that can be accounted for during the design stage of the experimentation. For example, we might be investigating four different pieces of machinery using, say, two different operators,

Does a Proclamation of Increased Workout Load Matter?

February 20, 2010

I forgot to link this up, but I have a new article (joint with our editor) over at Fantasy Ball Junkie. I run an extremely crude model to see if players who were mentioned in the media as having lost weight, gained muscle, gained speed, got eye surgery...

Genetic Algorithm Systematic Trading Development — Part 3 (Python/VBA)

February 20, 2010

As mentioned in prior posts, it is not possible to use the standard Weka GUI to instantiate a Genetic Algorithm, other than for feature selection. Part of the reason is that there is no generic algorithm to instantiate a fitness function. The same fl...

lme4 stands 4 Linear mixed-effects…

February 19, 2010

There is a certain hype about mixed (and random) effects among statisticians and analysts. You can show some love to Douglas Bates and Martin Maechler for maintaining the lme4 package for our cupid, R. I copy the entirety of the information from the project's page. Doxygen documentation of the underlying C functions is here. The
