## MLB Rankings Using the Bradley-Terry Model

August 31, 2013

Today, I take my first shots at ranking Major League Baseball (MLB) teams. I see my efforts at prediction and ranking as an ongoing process in which my models improve, the data I incorporate become more meaningful, and ultimately my predictions become largely accurate. For the first attempt, let’s rank MLB teams using the Bradley-Terry (BT) model. Before we discuss the rankings, we need...
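As a rough illustration of the idea (a sketch of the standard BT model, not the code from the post itself): each team gets a positive strength, and the probability that team i beats team j is strength_i / (strength_i + strength_j). The function names, the iterative (minorization-maximization) update, and the small win matrix below are all assumptions made up for the example, not real MLB data.

```python
def fit_bradley_terry(wins, n_iter=500, tol=1e-12):
    """Estimate Bradley-Terry strengths from a matrix of head-to-head wins.

    wins[i][j] is the number of times team i beat team j (hypothetical data).
    Returns strengths normalized to sum to 1. Assumes every team has played
    at least one game; a team with zero wins is driven to strength 0.
    """
    n = len(wins)
    strengths = [1.0 / n] * n
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Games played against each opponent, weighted by current strengths.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strengths[i] + strengths[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom)
        norm = sum(updated)
        updated = [s / norm for s in updated]
        converged = max(abs(a - b) for a, b in zip(updated, strengths)) < tol
        strengths = updated
        if converged:
            break
    return strengths


def win_probability(strengths, i, j):
    """Model probability that team i beats team j."""
    return strengths[i] / (strengths[i] + strengths[j])


# Made-up results for three teams that each play every other team 10 times.
example_wins = [
    [0, 7, 8],
    [3, 0, 6],
    [2, 4, 0],
]
example_strengths = fit_bradley_terry(example_wins)
# Rank teams from strongest to weakest by estimated strength.
example_ranking = sorted(range(3), key=lambda t: example_strengths[t], reverse=True)
```

Sorting teams by their fitted strengths gives the ranking, and `win_probability` turns any pair of strengths into a predicted outcome for a single game.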

## A Brief Look at Mixture Discriminant Analysis

July 2, 2013

Lately, I have been working with finite mixture models for my postdoctoral work on data-driven automated gating. Although I had barely scratched the surface of mixture models in the classroom, I am becoming increasingly comfortable with them. With this in mind, I wanted to explore their application to classification because there are times when a single class is clearly made up of...

## High-Dimensional Microarray Data Sets in R for Machine Learning

December 29, 2012

Much of my research in machine learning is aimed at small-sample, high-dimensional bioinformatics data sets. For instance, here is a paper of mine on the topic. A large number of papers proposing new machine-learning methods that target high-dimensional data use the same two data sets and consider few others. These data sets are the 1) Alon colon cancer...

## Setting Up the Development Version of R

August 28, 2012

My coworkers at Fred Hutchinson regularly use the development version of R (i.e., R-devel) and have urged me to do the same. This post details how I have set up the development version of R on our Linux server, which I use remotely because it is much faster than my Mac. First, I downloaded the R-devel source into ~/local/, which...

## Chapter 2 Solutions – Statistical Methods in Bioinformatics

August 14, 2012

As I have mentioned previously, I have begun reading Statistical Methods in Bioinformatics by Ewens and Grant and working selected problems for each chapter. In this post, I will give my solutions to two problems. The first problem is pretty straightforward. Problem 2.20: Suppose that a parent of genetic type Mm has three children. Then the parent transmits...

## Textbook – Statistical Methods in Bioinformatics

August 14, 2012

As part of my effort to acquaint myself more with biology, bioinformatics, and statistical genetics, I am trying to find as many resources as I can that provide a solid foundation. For instance, I am wading through Molecular Biology of the Cell at a pa...

## Now That We Live in Seattle

August 11, 2012

It has been just a few weeks since my wife, my son, and I moved to Seattle so that I could begin my postdoc at The Hutch. Now that we have been here a short time and are settled, we intend to start exploring Seattle, doing typical touristy things as we...

## And Now I Blog Again

August 4, 2012

One of my goals for 2012 has been to blog more. Much more. When I first set this goal, I had great aspirations of posting frequently. However, I had a Ph.D. to complete, and quite frankly, it demanded much higher priority. Now that I have submitted my ...

## Goals for 2012

January 9, 2012

I have never been one to set New Year’s resolutions. To me, they instill a dangerous sense of personal freedom that often yields naive, subconscious mentalities, such as “I can do anything I want until December 31, and I will change abruptly the next day.” ...

## When I was 29…

January 8, 2012

Today was my 29th birthday, and I kept things simple: I ate with my wife and my newborn son at a local eatery. Later, my wife cooked steaks for dinner. For the most part, I took the day off in that I did not work on my dissertation. But I did spend muc...