# MAT8886 Fisher-Tippett theorem and limiting distribution for the maximum

January 12, 2012
By

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

Tomorrow, we will discuss Fisher-Tippett theorem. The idea is that there are only three possible limiting distributions for normalized versions of the maxima of i.i.d. samples . For bounded distribution, consider e.g. the uniform distribution on the unit interval, i.e. on the unit interval. Let and . Then, for all and ,

i.e. the limiting distribution of the maximum is Weibull's.

set.seed(1)
s=1000000
n=100
M=matrix(runif(s),n,s/n)
V=apply(M,2,max)
bn=1
an=1/n
U=(V-bn)/an
hist(U,probability=TRUE,,col="light green",
xlim=c(-7,1),main="",breaks=seq(-20,10,by=.25))
u=seq(-10,0,by=.1)
v=exp(u)
lines(u,v,lwd=3,col="red")

For heavy tailed distribution, or Pareto-type tails, consider Pareto samples, with distribution function . Let and , then

which means that the limiting distribution is Fréchet's.

set.seed(1)
s=1000000
n=100
M=matrix((runif(s))^(-1/2),n,s/n)
V=apply(M,2,max)
bn=0
an=n^(1/2)
U=(V-bn)/an
hist(U,probability=TRUE,col="light green",
xlim=c(0,7),main="",breaks=seq(0,max(U)+1,by=.25))
u=seq(0,10,by=.1)
v=dfrechet(u,shape=2)
lines(u,v,lwd=3,col="red")

For light tailed distribution, or exponential tails, consider e.g. a sample of exponentially distribution variates, with common distribution function . Let and , then

i.e. the limiting distribution for the maximum is Gumbel's distribution.

library(evd)
set.seed(1)
s=1000000
n=100
M=matrix(rexp(s,1),n,s/n)
V=apply(M,2,max)
(bn=qexp(1-1/n))
log(n)
an=1
U=(V-bn)/an
hist(U,probability=TRUE,col="light green",
xlim=c(-2,7),ylim=c(0,.39),main="",breaks=seq(-5,15,by=.25))
u=seq(-5,15,by=.1)
v=dgumbel(u)
lines(u,v,lwd=3,col="red")

Consider now a Gaussian sample. We can use the following approximation of the cumulative distribution function (based on l'Hopital's rule)

as . Let and . Then we can get

as . I.e. the limiting distribution of the maximum of a Gaussian sample is Gumbel's. But what we do not see here is that for a Gaussian sample, the convergence is extremely slow, i.e., with 100 observations, we are still far away from Gumbel distribution,

and it is only slightly better with 1,000 observations,

set.seed(1)
s=10000000
n=1000
M=matrix(rnorm(s,0,1),n,s/n)
V=apply(M,2,max)
(bn=qnorm(1-1/n,0,1))
an=1/bn
U=(V-bn)/an
hist(U,probability=TRUE,col="light green",
xlim=c(-2,7),ylim=c(0,.39),main="",breaks=seq(-5,15,by=.25))
u=seq(-5,15,by=.1)
v=dgumbel(u)
lines(u,v,lwd=3,col="red")

Even worst, consider lognormal observations. In that case, recall that if we consider (increasing) transformation of variates, we are in the same domain of attraction. Hence, since , if

then

i.e. using Taylor's approximation on the right term,

This gives us normalizing coefficients we should use here.

set.seed(1)
s=10000000
n=1000
M=matrix(rlnorm(s,0,1),n,s/n)
V=apply(M,2,max)
bn=exp(qnorm(1-1/n,0,1))
an=exp(qnorm(1-1/n,0,1))/(qnorm(1-1/n,0,1))
U=(V-bn)/an
hist(U,probability=TRUE,col="light green",
xlim=c(-2,7),ylim=c(0,.39),main="",breaks=seq(-5,40,by=.25))
u=seq(-5,15,by=.1)
v=dgumbel(u)
lines(u,v,lwd=3,col="red")

Credit: illustration is from Maurice Sendak's popular book where the wild things are, translated in French as Max et les Maximonstres.

To leave a comment for the author, please follow the link and comment on his blog: Freakonometrics - Tag - R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , , , , , , , , , , , , , , , ,

Comments are closed.