## David Varadi’s RSI(2) alternative

July 19, 2009
Here's a quick R implementation of David Varadi's alternative to the RSI(2). Michael Stokes over at the MarketSci blog has three great posts exploring this indicator: Varadi’s RSI(2) Alternative: The DV(2); RSI(2) vs. DV(2); Last Couple...

## A probability exercise on the Bernoulli distribution

July 18, 2009
What is the probability of obtaining the sequence HHTTTHTT when flipping a coin 8 times? (H = head; T = tail) The theory teaches us that to solve this question, we can simply use the following formula: f(x)=P(X=x)=B(n,p)=\begin{pmatrix}n\\ x \end{pmatrix} \cd...
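As a rough check in R, and assuming a fair coin (the excerpt does not state the value of p), the two quantities that are easy to confuse here can be computed side by side: the probability of this exact ordered sequence, and the binomial probability of getting 3 heads in 8 flips in any order. The variable names are only illustrative.

```r
# Assumption: a fair coin, so P(head) = 0.5 (not stated in the excerpt).
p <- 0.5

# Split the target sequence into single flips and count each outcome.
flips   <- strsplit("HHTTTHTT", "")[[1]]
n_heads <- sum(flips == "H")   # 3 heads
n_tails <- sum(flips == "T")   # 5 tails

# Probability of this exact ordered sequence: independent Bernoulli trials.
p_sequence <- p^n_heads * (1 - p)^n_tails

# Binomial probability of 3 heads in 8 flips, in any order.
p_three_heads <- dbinom(n_heads, size = length(flips), prob = p)

print(p_sequence)     # 1/256
print(p_three_heads)  # 56/256
```

The second value is larger by a factor of choose(8, 3) = 56, the number of distinct orderings with 3 heads.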

## Let us practice with some functions of R

July 18, 2009
Given the following data set, compute the arithmetic mean, median, variance, standard deviation; find the greatest and the smallest value, the sum of all values, the square of the sum of all values, the sum of the squares of all values; assign the ranks...
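The data set itself is not included in this excerpt, so here is a minimal sketch on a made-up vector `x` (purely hypothetical values) showing the base R functions that cover each item in the list.

```r
# Hypothetical data: the original data set is not shown in the excerpt.
x <- c(12, 7, 3, 9, 15, 7)

x_mean   <- mean(x)      # arithmetic mean
x_median <- median(x)    # median
x_var    <- var(x)       # sample variance (divides by n - 1)
x_sd     <- sd(x)        # sample standard deviation
x_max    <- max(x)       # greatest value
x_min    <- min(x)       # smallest value
x_sum    <- sum(x)       # sum of all values
x_sum_sq <- sum(x)^2     # square of the sum
x_sq_sum <- sum(x^2)     # sum of the squares
x_rank   <- rank(x)      # ranks; ties get the average rank by default

print(c(mean = x_mean, median = x_median, var = x_var, sd = x_sd))
print(c(max = x_max, min = x_min, sum = x_sum,
        sum_squared = x_sum_sq, sum_of_squares = x_sq_sum))
print(x_rank)
```

Note that `var()` and `sd()` use the n - 1 denominator; if the exercise expects the population variance, multiply `var(x)` by `(length(x) - 1) / length(x)`.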

## Book excerpts now posted

July 18, 2009
We've posted excerpts from the book on the book website. The excerpts include Chapter 3 (regression and ANOVA) in its entirety. This demonstrates how the entries (the generic descriptions of software functions) and the worked examples reinforce each ...

## Parsing GEO SOFT files with Python and Sqlite

July 17, 2009
NCBI's GEO database of gene expression data is a great resource, but its records are very open-ended. This lack of rigidity was perhaps necessary to accommodate the variety of measurement technologies, but it makes getting data out a little tricky. But, a...

## Simple Data Visualization

July 16, 2009
OK, so, I know I already raved about one Hadley Wickham project and how it has changed my life last week. But what can I say, the man is a genius. And if you are using R (and let’s face it, you should be) and you want simple sexy graphs made quick, the man has

## Influence.ME: Simple Analysis

July 16, 2009
With the introduction of influence.ME, our new package for influential data, I’m currently writing a manual for the package. This manual will address topics for both experienced and inexperienced users. I will also present much of the content ...

## Missing data, logistic regression, and a predicted values plot (or two)

July 15, 2009
miss attach miss result1 summary(result1) Call: glm(formula = a ~ b, family = binomial(logit)) Deviance Residuals: Min 1Q Median 3Q Max -1.8864 -1.2036 0.7397 0.9425 1.4385 Coefficients: ...

## Job grade plot

July 15, 2009
This plot was created using the following R code: plot (q9e~q8, type = "n",xlim = c(1,13), ylim = c(1,13),cex.lab=1.25,cex.axis=0.75, col.lab = "#333333", xlab = "Obama job grade",ylab = "Congressional job grade", xaxt ="n", yaxt="n",main="Obama and Co...

## Example 7.5: Replicating a prettier jittered scatterplot

July 15, 2009
The scatterplot in section 7.4 is a plot we could use repeatedly. We demonstrate how to create a macro (SAS, section A.8) and a function (R, section B.5) to do it more easily. SAS: %macro logiplot(x=x, y=y, data=, jitterwidth=.05, smooth=50); data lp1; set...

## Building R packages for Windows

July 13, 2009
1. Installing the required tools To build an R package in Windows, you will need to install some additional software tools. These are summarized at http://www.murdoch-sutherland.com/Rtools 1.1 Essential: Rtools This is a collection of unix-like tools that can be run from the DOS command prompt. It also contains the MinGW compilers that are used for

## A recommended book

July 13, 2009
I've been getting a lot of help from this book. While written for S-Plus, nearly everything in it is applicable to R.

## cran2deb: Would you like 1700+ new Debian / R packages?

July 13, 2009
As I mentioned in my quick write-up of UseR 2009, one of my talks was about cran2deb: a system to turn (essentially) all CRAN packages into directly apt-get-able binary packages. This is essentially a '2.0' version of earlier work with Steffen Moel...

## Some detail on the last plot

July 13, 2009
First we plot approval (app) against date (daten). We also specify a few other things. ylim=c(40,80) specifies that the y axis extends from 40 to 80. xlim=c(-3,210) might seem odd, but we need extra space on the left. pch=16 plots dots, and col="gray" ...
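To see what those arguments do in isolation, here is a small sketch of the same call. The poll data are not part of this excerpt, so `app` and `daten` below are simulated placeholders, not the real approval numbers.

```r
# Simulated stand-ins for the real data (assumption: daten is days since
# the start of the series, app is approval in percent).
set.seed(1)
daten <- seq(0, 200, by = 5)
app   <- 65 - 0.04 * daten + rnorm(length(daten), sd = 3)

# ylim fixes the y axis to 40..80; xlim starts slightly below zero to
# leave extra room on the left; pch = 16 draws filled dots in gray.
plot(app ~ daten,
     ylim = c(40, 80),
     xlim = c(-3, 210),
     pch  = 16,
     col  = "gray")
```

Changing `xlim` to `c(0, 210)` and re-running shows how little margin the left-most points get without the extra space.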

## Obama approval

July 12, 2009
Working some more with time series data. Here we have a graph of Obama job approval numbers, with two LOWESS-fit lines added for trending. Figure 1. President Obama job approval, Jan 2009 - present. There's actually some pretty fancy stuff going on there, as the following code shows. polls lfit1 lfit2 plot (app~daten, ylim=c(40,80), xlim=c(-3,210),pch=16, col="gray",cex.lab=1.25,cex.axis=0.75,col.lab = "#777777", xlab="",ylab="Obama...
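The definitions of `lfit1` and `lfit2` did not survive in this excerpt, so the following is only a sketch of how two LOWESS lines could be added with base R's `lowess()`. The data are simulated and the two smoother spans (`f`) are assumptions chosen for illustration, not the values used in the original post.

```r
# Simulated stand-ins for the poll data (not the real numbers).
set.seed(1)
daten <- seq(0, 200, by = 5)
app   <- 65 - 0.04 * daten + rnorm(length(daten), sd = 3)

# Two LOWESS fits with different smoother spans (assumed values):
# a wide span for the long-run trend, a narrow one for short-run movement.
lfit1 <- lowess(daten, app, f = 2/3)
lfit2 <- lowess(daten, app, f = 0.2)

# Scatterplot first, then overlay both fitted lines.
plot(app ~ daten, ylim = c(40, 80), xlim = c(-3, 210),
     pch = 16, col = "gray", xlab = "", ylab = "Approval (%)")
lines(lfit1, col = "black", lwd = 2)
lines(lfit2, col = "red", lwd = 1)
```

`lowess()` returns a list with components `x` and `y`, which is why the result can be passed straight to `lines()`.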

## useR 2009 in Rennes: Recap and slides

July 12, 2009
I spent most of last week in Rennes, the capital of Brittany in France, as it was time for UseR! 2009, the annual R conference. Francois Husson, Aline Legrand and others at the Agrocampus Ouest had put together a really well-run conference, and it w...

## Causal inference and biostatistics

July 11, 2009
I've been following the discussion on causal inference over at Gelman's blog with quite a bit of interest. Of course, this is in response to Judea Pearl's latest book on causal inference, which differs quite a bit from the theory that had been forwarde...

## The Knapsack Problem

July 10, 2009
David posts a question about how to solve this knapsack problem using the R statistical computing and analysis platform. My reply in the comments seems to have disappeared for a while so here is my proposed solution: