## Puzzle: A path through pairs making squares

April 23, 2012
Ted Harding recently posed an interesting puzzle on the r-help mailing list. Here's the puzzle: Take the numbers 1, 2, 3, etc. up to 17. Can you write out all seventeen numbers in a line so that every pair of adjacent numbers adds up to a square number? You can figure out...
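One way to attack the puzzle by machine is a plain backtracking search. The sketch below is not taken from the r-help thread or from the post; the helper names `is_square` and `find_square_path` are hypothetical, chosen for illustration.

```r
# Is x a perfect square? (vectorised)
is_square <- function(x) {
  r <- round(sqrt(x))
  r * r == x
}

# Depth-first search for an ordering of 1..n in which every pair of
# neighbouring numbers sums to a perfect square.
# Returns the ordering, or NULL if none exists.
find_square_path <- function(n = 17) {
  extend <- function(path) {
    if (length(path) == n) return(path)
    last <- path[length(path)]
    # Unused numbers that form a square with the last number placed
    candidates <- setdiff(seq_len(n), path)
    candidates <- candidates[is_square(candidates + last)]
    for (nxt in candidates) {
      result <- extend(c(path, nxt))
      if (!is.null(result)) return(result)
    }
    NULL  # dead end: backtrack
  }
  for (start in seq_len(n)) {
    result <- extend(start)
    if (!is.null(result)) return(result)
  }
  NULL
}

path <- find_square_path(17)
print(path)
```

The search tries each number as a starting point and backs out of dead ends, which is cheap here because each number has only a few partners that make a square.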

## Tuning GAMBoost

April 23, 2012
This post describes some of the simulation results which I obtained with the GAMBoost package. The aim of these simulations is to get a feel for what I should tune and what I should not tune with GAMBoost. Setup: In the GAMBoost package one can tune qui...

## Example 9.28: creating datasets from tables

April 23, 2012
There are often times when it is useful to create an individual level dataset from aggregated data (such as a table). While this can be done using the expand.table() function within the epitools package, it is also straightforward to do directly within R. Imagine that instead of the individual level data, we had only the 2x2 table for the...
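A minimal base R sketch of the idea, under stated assumptions: the counts and the variable names `exposure` and `outcome` below are made up for illustration and are not the data from the post.

```r
# Made-up 2x2 table of counts (illustrative values only)
tab <- as.table(matrix(c(10, 20, 30, 40), nrow = 2,
                       dimnames = list(exposure = c("no", "yes"),
                                       outcome  = c("no", "yes"))))

# Long format: one row per cell, with the count in a Freq column
cells <- as.data.frame(tab)

# Repeat each cell's row Freq times to get one row per individual
individuals <- cells[rep(seq_len(nrow(cells)), times = cells$Freq),
                     c("exposure", "outcome")]
rownames(individuals) <- NULL

# Cross-tabulating the expanded data should reproduce the original counts
table(individuals)
```

The key step is `rep()` on the row indices, which copies each cell's row as many times as its count.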

## Quantitative palaeolimnology: my book chapters are finally out!

April 23, 2012
Today I received confirmation that the delayed fifth volume in the Developments in Palaeoenvironmental Research series has been published. The book is titled Data Handling and Numerical methods, though it covers more of the latter and, IMHO, is far more interesting than … Continue reading →

## Updates to the Emacs Starter Kit for the Social Sciences

April 23, 2012
I've made some updates to the Emacs Starter Kit for the Social Sciences. The kit builds on Phil Hagelberg's original and Eric Schulte's org-mode version, and incorporates some packages and settings that are particularly useful for the social sciences. ...

## Probit/Logit Marginal Effects in R

April 23, 2012
The common approach to estimating a binary dependent variable regression model is to use either the logit or probit model. Both are forms of generalized linear models (GLMs), which can be seen as modified linear regressions that allow the dependent variable to originate from non-normal distributions. The coefficients in a linear regression model are marginal
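As a rough illustration of why logit and probit coefficients are not themselves marginal effects, the sketch below computes an average marginal effect by scaling the coefficient with the density of the link distribution at each fitted linear predictor. The data are simulated and the approach is one common convention, not necessarily the one used in the post.

```r
set.seed(1)

# Simulated data (illustrative only): one continuous predictor
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))

fit <- glm(y ~ x, family = binomial(link = "logit"))

# Linear predictor x'beta for every observation
xb <- predict(fit, type = "link")

# Average marginal effect of x: mean of the logistic density at x'beta,
# times the coefficient. For a probit fit, dnorm() replaces dlogis().
ame <- mean(dlogis(xb)) * coef(fit)["x"]
print(ame)
```

Because the logistic density never exceeds 0.25, the marginal effect here is always smaller in magnitude than the raw coefficient.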

## A variance campaign that failed

April 23, 2012
they ought at least be allowed to state why they didn’t do anything and also to explain the process by which they didn’t do anything. First blush: One of the nice things about R is that new statistical techniques fall into it. One such is the glasso (related to the statistical lasso) which converts degenerate … Continue reading...

## Visualising the Path of a Genetic Algorithm

April 23, 2012
We quite regularly use genetic algorithms to optimise over the ad-hoc functions we develop when trying to solve problems in applied mathematics. However, it’s a bit disconcerting to have your algorithm roam through a high-dimensional solution space while not being able to picture what it’s doing or how close one solution is to another. … Continue reading...

April 22, 2012
I came across a free source of intraday Forex data while reading the Forex Trading with R: Part 1 post. You can download either Daily or Hourly historical Forex data from FXHISTORICALDATA.COM. The outline of this post: Download and Import Forex data; Reference and Plot Intraday data; Daily Backtest; Intraday Backtest. First, I created a

## My bookshelf

April 22, 2012
I'd like to start with something small, and simple. The thing about analyzing the data of your own life is that you are the only one doing the research, so you also have to collect all of the data yourself. This takes effort; and, if you'd like to build a large enough data set to do some really interesting...

## Updates to the Emacs Starter Kit for the Social Sciences

April 22, 2012
I’ve made some updates to the Emacs Starter Kit for the Social Sciences. The kit builds on Phil Hagelberg’s original and Eric Schulte’s org-mode version, and incorporates some packages and settings that are particularly useful for the...

## 118 years of US State Weather Data

April 22, 2012
A recent post on the Junkcharts blog looked at US weather data and the importance of explaining scales (which in this case went up to 118). Ultimately, it turns out that 118 is the rank of the data compared to the previous 117 years of data (in ascending order, so that 118 is the highest). At … Continue reading...

April 22, 2012
## Machine learning for identification of cars

April 22, 2012
There is plenty of data on the internet; however, it is raw data. Think for a second about public surveillance cameras: useful for checking the traffic on a route or in a busy place, but anything else? What if you want to know how many cars are on the route? How many cars were there yesterday at the same time?

## Fancy HTML5 Slides with knitr and pandoc

April 22, 2012
Karthik Ram gave an Introduction to R a couple of weeks ago, and I strongly recommend you take a look at his cool HTML5 slides. I started trying HTML5 slides last year, and now it is difficult for me to go back to beamer, which I have used for a few...

## Phase space plot of the kicked rotor

April 21, 2012

In the idealized physical world, a rotor is simply a mass attached to an axis of length , free to move in the plane. Gravity and friction are absent. Such a rotor becomes a kicked rotor if it is periodically hit with a hammer. Every kick transfers momentum to the rotor and the time between

## Calculate the average distance between a given DNA motif within DNA sequences in R

April 21, 2012
Suppose that we want to calculate the expected distance of a DNA motif within a DNA target sequence. If we know the composition bias or the probability distribution (multinomial model), we can compute it just fine. Download the R code <- here FIRS...

## David Olive’s median confidence interval

As I have discussed in a number of previous posts, the median represents a well-known and widely-used estimate of the “center” of a data sequence.  Relative to the better-known mean, the primary advantage of the median is its much reduced outlier sensitivity.  This post briefly describes a simple confidence interval for the median that is discussed in a paper...

## Rewriting My Code to Run in Parallel (1)

April 21, 2012
As I mentioned in my previous post, I am about to make my code for finding co-integrated pairs run in parallel and more efficiently. But before I do so in the actual co-integration code, I would like to run some tests to see whether it would improve t...

## Most profitable hedge fund style

April 21, 2012
This is not investment advice! A couple of weeks back, during an amst-R-dam user group talk on backtesting trading strategies using R, I mentioned that the most effective style for hedge funds is relative value statistical arbitrage; I had read it somewhere. After … Continue reading →

## R^2 Spectrum

April 21, 2012
We saw in the previous post how to calculate the correlation spectrum, but another simple way to show how the bands correlate with the constituent of interest is to calculate R^2. This way we remove the negative part of the correlation spectru...
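A small sketch of the idea on simulated data: correlate every band with the constituent, then square the result so that negative correlations fold onto the positive side. The matrix `spectra`, the vector `constituent`, and the band numbers are invented for illustration and do not come from the post.

```r
set.seed(42)

# Simulated "spectra": 50 samples (rows) by 20 bands (columns)
n_samples <- 50
n_bands <- 20
spectra <- matrix(rnorm(n_samples * n_bands), nrow = n_samples)

# Simulated constituent: positively related to band 5, negatively to band 12
constituent <- 2 * spectra[, 5] - spectra[, 12] + rnorm(n_samples, sd = 0.5)

# Correlation spectrum: correlation of each band with the constituent
r <- cor(spectra, constituent)[, 1]

# R^2 spectrum: squaring removes the sign, so all values lie in [0, 1]
r2 <- r^2

plot(r2, type = "h", xlab = "Band", ylab = "R^2")
```

Squaring hides the direction of each relationship, so it can help to keep the correlation spectrum alongside the R^2 spectrum.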

## R is not just for nerds….it has drop-down menus!

April 20, 2012
This post shows how to use the Java GUI for R (JGR, pronounced Jaguar) along with the Deducer package (manual here) to get a fairly full-featured graphical user interface for R. Installation note: Be sure you are logged into an account with ...

## Deducer.org reaches 250,000 page views and continues to grow

April 20, 2012
It is difficult for R package authors to know how much (if at all) their packages are being used. CRAN does not calculate or make public download statistics (though this might change in the relatively near future), so authors can't tell if 10 or 10,000 people are using their work. Deducer is in much the same boat.

April 20, 2012
The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full April edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Spring Webinar Series. Our Spring Webinar Series features presentations from Revolution Analytics staff and...

## Reproducible Research: Export Regression Table to MS Word

April 20, 2012
Here's a quick tip for anyone wishing to export results, say a regression table, from R to MS Word: require(R2wd) # install packages required # install software RCOM, RDCOMClient # (I had to restart the R session after the above step to get it to work) wdGet(...