On http://www.bakadesuyo.com, there was recently an interesting discussion about infidelity, the key question being "at what ages are men and women most likely to have affairs?" The discussion is based on some graphs, e.g. The source is a paper b...

This is a quick post to address comments raised in the Time Series Matching post. I will show a very simple example of backtesting a Time Series Matching strategy using a distance-weighted prediction. I have to warn you, the strategy’s performance is worse than Buy and Hold. I used the code from Time

For a much better looking version of this post (where code is actually readable!), see this Github repository, which also contains some of the example datasets I use and a literate programming version of this tutorial. Introduction This is a bare-bones introduction to ggplot2, a visualization package in R. It assumes no knowledge of R

Last Saturday, the New York Times published a feature article on the wealthiest 1% of Americans. The on-line version of the article included interactive features like this interactive map showing where your household ranks in the country and in local regions. The print edition, however, included some different (and necessarily static) representations of US wealth data, such as this...

I recently posted an introduction to the Kaggle Algorithmic Trading Challenge, which I competed in. I said that I would post about my experiences, and this is hopefully the first of a series. We were given tick data from the London Stock Exchange (specifically, the FTSE 100) over random time intervals during parts of 37 days. Each data row...

Quantitative Finance, Technical Trading & Analysis. Fotis Papailias, Dimitrios Thomakos. R-Code: Yahoo Finance Data Loading. Here is an R script that downloads Yahoo Finance data without the need for additional packages/libraries. The .zip file contains the code with an example of how to use it. Download the code here: You can also...

Parallel computation may seem difficult to implement and a pain to use, but it is actually quite simple. The foreach package provides the basic loop structure, which can use various parallel backends to execute the loop in parallel. First, let's go over the basic structure of a foreach loop. To get the foreach package, run the following...
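As a minimal sketch of the loop structure described above (not the post's own code, which is cut off here), the example below assumes the foreach package is already installed. It runs a loop sequentially with %do%, then repeats it with %dopar% only if the optional doParallel backend happens to be available; the choice of doParallel and of two cores is an assumption made for illustration.

```r
library(foreach)

# Sequential foreach: the body is evaluated once per value of i,
# and the results come back as a list by default.
squares_list <- foreach(i = 1:4) %do% i^2

# .combine = c collapses the per-iteration results into a plain vector.
squares_vec <- foreach(i = 1:4, .combine = c) %do% i^2

# Parallel version: identical loop, but %dopar% hands the iterations to
# whichever backend is registered. doParallel is an assumed backend here.
if (requireNamespace("doParallel", quietly = TRUE)) {
  doParallel::registerDoParallel(cores = 2)
  squares_par <- foreach(i = 1:4, .combine = c) %dopar% i^2
  doParallel::stopImplicitCluster()
}
```

The only change between the sequential and parallel versions is the operator, which is why swapping backends does not require rewriting the loop body.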

David Sparks writes: I am experimenting with the mapping/visualization of survey response data, with a particular focus on using transparency to convey uncertainty. See some examples here. Do you think the examples are successful at communicating both local values of the variable of interest, as well as the lack of information in certain places? Also, The post How...

Running R in the cloud So you want to run R in the cloud so you can set your Gibbs sampling off, forget about it, and not be paranoid about power cuts and reboots. Andrew Gelman hosted a good debate on the pros and cons of R in the cloud on his blog. ...

This is a bare-bones introduction to ggplot2, a visualization package in R. It assumes no knowledge of R. For a better-looking version of this post, see this Github repository, which also contains some of the example datasets I use and a literate programming version of this tutorial. Preview Let’s start with a...

The TIOBE index ranks popularity of programming languages according to their prevalence on the web. Back in February last year, the R language had risen to #25 in the charts, overtaking both SAS and Matlab. Earlier this month, TIOBE published its annual rankings of programming language popularity for 2011 and R has risen once again: it now ranks #19...

How symmetric are the returns of the S&P 500? How does the skewness change over time? Previously We looked at the predictability of kurtosis and skewness in S&P constituents. We didn’t see any predictability of skewness among the constituents. Here we look at skewness from a different angle. The data Daily log returns of the … Continue reading...
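To make the question "how does the skewness change over time?" concrete, here is a hedged sketch of a rolling skewness estimate in base R. The function names, the 250-day window, and the simulated returns are all assumptions for illustration; the post's actual data (S&P 500 daily log returns) and method are not shown in this excerpt.

```r
# Moment-based sample skewness: third central moment divided by the
# second central moment raised to the power 3/2.
skewness <- function(r) {
  d <- r - mean(r)
  mean(d^3) / mean(d^2)^1.5
}

# Skewness over every window of `window` consecutive observations.
rolling_skewness <- function(r, window) {
  starts <- seq_len(length(r) - window + 1)
  vapply(starts, function(s) skewness(r[s:(s + window - 1)]), numeric(1))
}

# Synthetic stand-in for daily log returns (NOT real S&P 500 data).
set.seed(1)
fake_returns <- rnorm(500, mean = 0, sd = 0.01)
roll <- rolling_skewness(fake_returns, window = 250)
```

Plotting `roll` against time gives one view of how the asymmetry of returns drifts; a value near zero indicates a roughly symmetric window.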

If you’re one of the R-bloggers or useRs, you have most probably heard about CrData.org. In the early days, there were two R-related cloud computing services: one was CloudNumbers, the other CrData.org. Recently, we (may have) received an email from Hamid ...

I was just looking through the programming language statistics on Project Euler. It shows that only 7% of the problems have been solved in R, whereas 8% have been solved on some kind of spreadsheet. This is outrageous! Let's look at the solution of...

So, let's crawl some data out of Facebook using R. Don't get too excited though, this is just a weekend what-if project. For example, say I want to download some photos where I'm tagged. First, we need an access token from Facebook. I don't know how to get this programmatically, so let's get one manually: log on to Facebook...

Fred Schiff writes: I’m writing to you to ask about the “R-squared” approximation procedure you suggest in your 2004 book with Dr. Hill. I’m a media sociologist at the University of Houston. I’ve been using HLM3 for about two years. Briefly about my data. It’s a content analysis of The post R-squared...

Merging two data.frame objects in R is very easily done by using the merge function. While very powerful, the merge function does not (as of yet) offer a way to return a merged data.frame that preserves the original order of one of the two merged data.frame objects. In this post I describe this problem, and offer Read more...
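One common workaround for the ordering problem described above can be sketched in base R as follows. This is not necessarily the solution the post goes on to offer (the excerpt is cut off); the helper name `merge_keep_order` and the temporary column `.row_id` are hypothetical names introduced for this illustration.

```r
# Merge x and y, then restore the original row order of x.
merge_keep_order <- function(x, y, by, ...) {
  x$.row_id <- seq_len(nrow(x))                    # remember x's row order
  merged <- merge(x, y, by = by, sort = FALSE, ...)
  merged <- merged[order(merged$.row_id), , drop = FALSE]
  merged$.row_id <- NULL                           # drop the helper column
  rownames(merged) <- NULL
  merged
}

left  <- data.frame(id = c("c", "a", "b"), x = 1:3)
right <- data.frame(id = c("a", "b", "c"), y = c(10, 20, 30))

result <- merge_keep_order(left, right, by = "id")
# result keeps the ids in the order c, a, b, as in `left`
```

Tagging rows with an index before merging and sorting on it afterwards avoids relying on whatever order merge happens to return.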

In my last post, I discussed the Hampel filter, a useful moving window nonlinear data cleaning filter that is available in the R package pracma. In this post, I briefly discuss this moving window filter in a little more detail, focusing on two important practical points: the choice of the filter’s local outlier detection threshold, and the question of...
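To illustrate the role of the local outlier detection threshold, here is a simplified base-R sketch of a Hampel-style filter. It is an assumption-laden illustration, not the pracma implementation: the name `hampel_sketch` is hypothetical, and the window is simply truncated at the ends of the series, which may differ from how the package treats edge points.

```r
# Hampel-style filter: a point is replaced by its window median when it
# lies more than t0 scaled MADs away from that median.
hampel_sketch <- function(x, k = 3, t0 = 3) {
  n <- length(x)
  y <- x
  flagged <- integer(0)
  for (i in seq_len(n)) {
    window <- x[max(1, i - k):min(n, i + k)]  # up to k neighbours per side
    med <- median(window)
    s <- mad(window)                          # MAD scaled by 1.4826
    if (abs(x[i] - med) > t0 * s) {
      y[i] <- med
      flagged <- c(flagged, i)
    }
  }
  list(y = y, ind = flagged)
}

x <- c(1, 2, 3, 4, 100, 6, 7, 8, 9, 10)
strict  <- hampel_sketch(x, k = 2, t0 = 3)   # flags the spike at position 5
lenient <- hampel_sketch(x, k = 2, t0 = 40)  # threshold too high to flag it
```

Lowering `t0` makes the filter more aggressive and raising it makes the filter more tolerant, which is the trade-off that the choice of threshold controls.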

It's been a few weeks since I last posted. Sorry about that. Unfortunately, sometimes you come home from work just not wanting to look at a computer. I'm working on a series of posts requested by a few friends. They would like to see m...