(This article was first published on

See also: Part 1: QTL Analysis and Quantitative Genetics Part 2: Statistical Methods for QTL Analysis **Econometric Sense**, and kindly contributed to R-bloggers)The 'qtl' package in R allows you to implement QTL analysis using the methods I've previously discussed. The code below is adapted from Broman's documentation "A Brief Tour of R/qtl." ( http://www.rqtl.org/tutorials/rqtltour.pdf ) My code (which follows) is an even briefer tour, relating specifically to the general topics in QTL analysis that I have previously discussed in Part 1 and 2 of this 3 part series of posts. The data set is a simulated backcross data set. The 'summary' function provides some basic descriptive data, such as the number of individuals in the crosses, the number of mapped chromosomes, the number of phenotypes, and number of mapped markers.

Ex:

Backcross

No. individuals: 400

No. phenotypes: 4

Percent phenotyped: 100 100 100 100

No. chromosomes: 19

Autosomes: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

Total markers: 91

No. markers: 7 5 3 4 10 5 4 4 6 4 4 5 11 3 3 4 4 2 3

Percent genotyped: 72.7

Genotypes (%): AA:50.1 AB:49.9

The plot function (see output below) provides 1) a genetic map with markers 2) phenotype distributions 3) missing data visualizations:

Recall, from the end of my discussion in Part 1 and my discussion in Part 2 I stated:

For maximum likelihood estimation, the following probability density of y

_{i}can be stated as:f(y

_{i}) = Pr(y_{i}|Z_{i}=H_{k}) ‘ the probability of phenotype ‘y’ given genotype H_{k}’= (1/ √ 2π σ) exp[ 1/2 σ

^{2}(y_{i}-X_{i}β +H_{k}γ )^{2}The hypothesis H

_{0}:_{ }γ =0 can be tested using the likelihood ratio test:λ = -2(L

_{0}-L_{1})where L

_{0}represents the likelihood under a restricted model. This is equivalent to the general notion presented in Broman (1997) and my previous post:LOD = Likelihood (effect occurs by QTL linkage) / Likelihood(effect occurs by chance)

The calc.genoprob, scanone, and sim.geno functions allow you to make these calculations with options for expectation maximization, Haley-Knott regression, or multiple imputation. 'summary' and 'max' functions allow you to identify the locations on the chromosomes where the LOD score is maximized and infer the potential location of a QTL. The plot function allows you to plot this data for specified chromosomes.

R Code: (note this code scrolls from left to right- click on the code and use the arrow keys or use the scroll bar at the bottom of the post )

# ------------------------------------------------------------------

# |PROGRAM NAME: EX_QTL_R

# |DATE: 8-13-11

# |CREATED BY: MATT BOGARD

# |PROJECT FILE:

# |----------------------------------------------------------------

# | PURPOSE: DEMONSTRATE STATISTICAL METHODS FOR QTL ANALYSIS IN R

# |

# |------------------------------------------------------------------

# | REFERENCE: R code for "A brief tour of R/qtl"

# | Karl W Broman, kbroman.wisc.edu http://www.rqtl.org

# |-----------------------------------------------------------------

library(qtl) # call package qtl

ls()

############################################################

# exploring backcross data

############################################################

data(fake.bc) # load simulated backcross data (default data in qtl package)

ls()

summary(fake.bc) # summary of info in fake.bc which is and object of class 'cross'

# you can also get this information with specific function calls:

nind(fake.bc) # number of individuals

nphe(fake.bc) # number of phenotypes

nchr(fake.bc) # number of chromosomes

totmar(fake.bc) # number of total markers

nmar(fake.bc) # list markers?

############################################################

# plotting maps, genotypes, phenotypes, and markers

###########################################################

plot(fake.bc) # gives plots data

# you can also call for the plots individually:

plot.missing(fake.bc) # just plot missing genotypes

plot.map(fake.bc) # just plot the genetic map

plot.pheno(fake.bc, pheno.col=1) # just plot phenotype 1 'phe 1'

plot.map(fake.bc, chr=c(1, 4, 6, 7, 15), show.marker.names=TRUE) # specific chromosomes and marker names

plot.missing(fake.bc, reorder=TRUE) # order NA genotypes based on value of phenotype

fake.bc <- drop.nullmarkers(fake.bc) # drop obs with missing genotypes

totmar(fake.bc) # total # markers left

##################################################################

# specifying and estimating the likelihood used for QTL mapping

#################################################################

# From Broman:

# The function calc.genoprob calculates QTL genotype probabilities, conditional

# on the available marker data. These are needed for most of the QTL mapping

# functions. The argument step indicates the step size (in cM) at which the

# probabilities are calculated, and determines the step size at which later

# LOD scores are calculated.

fake.bc <- calc.genoprob(fake.bc, step=1, error.prob=0.01)

# function scanone performs single-QTL genome scan with a normal model.

# methods include: maximum likelihood via the EM algorithm

# and Haley-Knott regression

out.em <- scanone(fake.bc)

out.hk <- scanone(fake.bc, method="hk")

# multiple imputation method using sim.geno utilizing the joint

# genotype distribution, given the observed marker data.

fake.bc <- sim.geno(fake.bc, step=2, n.draws=16, error.prob=0.01)

out.imp <- scanone(fake.bc, method="imp")

# get the maximum LOD score on each chromosome

# can also specify a threshold for LOD

summary(out.em)

summary(out.em, threshold=3)

summary(out.hk, threshold=3)

summary(out.imp, threshold=3)

# function max.scanone returns just the highest peak from output of scanone.

max(out.em) # based on expectation maximization

max(out.hk) # based on Haley-Knott regression

max(out.imp) # based on multiple imputation

##################################################################

# plot LOD scores by chromosome for QTL mapping

#################################################################

plot(out.em, chr=c(2,5)) # just based on em method

plot(out.em, out.hk, out.imp, chr=c(2,5)) # all methods

plot(out.em, chr=c(2)) # zoom in on specified chromosome

To

**leave a comment**for the author, please follow the link and comment on his blog:**Econometric Sense**.R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...