# Monthly Archives: August 2012

## Probit Models with Endogeneity

August 15, 2012

Dealing with endogeneity in a binary dependent variable model requires more care than the simpler continuous dependent variable case. For some, the best approach to this problem is to use the same methodology as in the continuous case, i.e. two-stage least squares (2SLS). The equation of interest then becomes a linear probability model (LPM). The

## Project Euler — problem 18

August 15, 2012

The 18th Project Euler problem is a kind of route-finding problem. It occupied my mind for two days before I finally came up with a clever solution. Find the maximum total from top to bottom of the triangle below: 75 95 64 17 …
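One common way to find the maximum top-to-bottom total is to work upward from the last row, replacing each entry with itself plus the larger of its two children. The sketch below shows that approach; it is not necessarily the solution the post arrives at. The function name is hypothetical, and the small four-row triangle is the worked example from the Project Euler problem statement rather than the full triangle truncated above.

```python
def max_path_sum(triangle):
    """Maximum top-to-bottom total, moving to an adjacent number on the row below."""
    best = list(triangle[-1])  # best totals starting from each bottom entry
    for row in reversed(triangle[:-1]):
        # Each entry keeps the better of its two children.
        best = [value + max(best[i], best[i + 1]) for i, value in enumerate(row)]
    return best[0]


small_triangle = [
    [3],
    [7, 4],
    [2, 4, 6],
    [8, 5, 9, 3],
]
print(max_path_sum(small_triangle))  # 23, via 3 + 7 + 4 + 9
```

This visits each number once, so it avoids enumerating every route, which doubles in count with each added row.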

## Processing sample labels using regular expressions in R

August 15, 2012

I often find myself in possession of palaeo core data where the sample identifiers contain a core code or label plus the sample depth. Often these are generated by colleagues who have used other software where for one reason …
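The post itself works in R, but the idea of splitting a label into a core code and a depth can be sketched with a regular expression in Python. The label layout assumed here (letters with optional digits, then `_` or `-`, then a depth) and the example label `LOCH1_10.5` are hypothetical; real labels would need a pattern adapted to them.

```python
import re

# Assumed layout: <core code><separator><depth>, e.g. "LOCH1_10.5".
LABEL_PATTERN = re.compile(r"([A-Za-z]+\d*)[_-](\d+(?:\.\d+)?)")


def split_label(label):
    """Return (core_code, depth) for a label, or None if it does not match."""
    match = LABEL_PATTERN.fullmatch(label)
    if match is None:
        return None
    return match.group(1), float(match.group(2))


print(split_label("LOCH1_10.5"))  # ('LOCH1', 10.5)
print(split_label("not a label"))  # None
```

Returning `None` for labels that do not match makes it easy to spot identifiers that follow some other convention instead of silently mis-parsing them.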

## Predicting the memory usage of an R object containing numbers

August 15, 2012

To estimate whether a certain vector of numbers will fit into memory, you can quite easily predict the memory usage based on the size of the vector. An integer vector will use 4 bytes per number, and a numeric vector…
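The arithmetic behind this kind of estimate is just length times bytes per element. The sketch below uses the 4 bytes per integer stated above and assumes 8 bytes per numeric (double-precision) value, which the excerpt is cut off before stating. The function name is hypothetical, and the small fixed header every R vector carries is ignored.

```python
# Bytes per element: 4 for integers (from the text), 8 for doubles (assumed).
BYTES_PER_ELEMENT = {"integer": 4, "numeric": 8}


def predicted_vector_bytes(length, kind):
    """Approximate payload size of a vector, ignoring the object header."""
    return length * BYTES_PER_ELEMENT[kind]


n = 10_000_000
for kind in ("integer", "numeric"):
    size = predicted_vector_bytes(n, kind)
    print(f"{kind}: {size / 1024 ** 2:.1f} MiB")
```

For long vectors the ignored header is negligible, so the estimate is close enough to decide whether the data will fit in RAM.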

## Chapter 2 Solutions – Statistical Methods in Bioinformatics

August 14, 2012

As I have mentioned previously, I have begun reading Statistical Methods in Bioinformatics by Ewens and Grant and working selected problems for each chapter. In this post, I will give my solution to two problems. The first problem is pretty straightforward. Problem 2.20 Suppose that a parent of genetic type Mm has three children. Then the parent transmits...

## Some Quirks of the R Language

August 14, 2012

R is my favorite programming language. It's just so useful for getting work done. Sometimes people will complain that R is a difficult language. To me, this raises the questions: difficult for what? And for whom? I personally think R is just about the easiest thing in the world for prototyping. Meaning if you want to quickly crank out...

## Textbook – Statistical Methods in Bioinformatics

August 14, 2012

As part of my effort to acquaint myself more with biology, bioinformatics, and statistical genetics, I am trying to find as many resources as I can that provide a solid foundation. For instance, I am wading through Molecular Biology of the Cell at a pa...

## Minimum Expected Shortfall, Part 2

August 14, 2012

Previously, we set up the problem of constructing a minimum expected shortfall portfolio. We exported the portfolio weights from each quarterly rebalancing into R objects. This post will process those weights and compare the portfolio s...

## The Statistical Sleuth (second edition) in R

August 14, 2012

For those of you who teach, or are interested in seeing an illustrated series of analyses, there is a new compendium of files to help describe how to fit models for the extended case studies in the Second Edition of the Statistical Sleuth: A Course in...

## Is gas cheaper than it used to be?

August 14, 2012

Biostatistician and R user Matt Cooper noticed recently that the price he pays for petrol (gasoline) at the pump in Perth, Australia was about the same as he was paying four years ago. Nonetheless, inflation has marched on over the years, so does that mean petrol is effectively cheaper now than it used to be? And how does the...