MAT8886 Fisher-Tippett theorem and limiting distribution for the maximum

January 12, 2012
By

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

Tomorrow, we will discuss Fisher-Tippett theorem. The idea is that there are only three possible limiting distributions for normalized versions of the maxima of i.i.d. samples https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-00.gif?w=456. For bounded distribution, consider e.g. the uniform distribution on the unit interval, i.e. https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-09.gif?w=456 on the unit interval. Let https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-10.gif?w=456 and https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-11.gif?w=456. Then, for all https://i2.wp.com/freakonometrics.blog.free.fr/public/perso5/max-12.gif?w=456 and https://i0.wp.com/freakonometrics.blog.free.fr/public/perso5/max-13.gif?w=456,

https://i0.wp.com/freakonometrics.blog.free.fr/public/perso5/max-14.gif?w=456

i.e. the limiting distribution of the maximum is Weibull’s.

set.seed(1)
s=1000000
n=100
M=matrix(runif(s),n,s/n)
V=apply(M,2,max)
bn=1
an=1/n
U=(V-bn)/an
hist(U,probability=TRUE,,col="light green",
xlim=c(-7,1),main="",breaks=seq(-20,10,by=.25))
u=seq(-10,0,by=.1)
v=exp(u)
lines(u,v,lwd=3,col="red")

For heavy tailed distribution, or Pareto-type tails, consider Pareto samples, with distribution function https://i2.wp.com/freakonometrics.blog.free.fr/public/perso5/max-05.gif?w=456. Let https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-06.gif?w=456 and https://i2.wp.com/freakonometrics.blog.free.fr/public/perso5/max-07.gif?w=456, then

https://i0.wp.com/freakonometrics.blog.free.fr/public/perso5/max-08.gif?w=456

which means that the limiting distribution is Fréchet’s.

set.seed(1)
s=1000000
n=100
M=matrix((runif(s))^(-1/2),n,s/n)
V=apply(M,2,max)
bn=0
an=n^(1/2)
U=(V-bn)/an
hist(U,probability=TRUE,col="light green",
xlim=c(0,7),main="",breaks=seq(0,max(U)+1,by=.25))
u=seq(0,10,by=.1)
v=dfrechet(u,shape=2)
lines(u,v,lwd=3,col="red")

For light tailed distribution, or exponential tails, consider e.g. a sample of exponentially distribution variates, with common distribution function https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-01.gif?w=456. Let https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-02.gif?w=456 and https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-03.gif?w=456, then

https://i0.wp.com/freakonometrics.blog.free.fr/public/perso5/max-04.gif?w=456

i.e. the limiting distribution for the maximum is Gumbel’s distribution.

library(evd)
set.seed(1)
s=1000000
n=100
M=matrix(rexp(s,1),n,s/n)
V=apply(M,2,max)
(bn=qexp(1-1/n))
log(n)
an=1
U=(V-bn)/an
hist(U,probability=TRUE,col="light green",
xlim=c(-2,7),ylim=c(0,.39),main="",breaks=seq(-5,15,by=.25))
u=seq(-5,15,by=.1)
v=dgumbel(u)
lines(u,v,lwd=3,col="red")

Consider now a Gaussian https://i0.wp.com/freakonometrics.blog.free.fr/public/perso5/max-17.gif?w=456 sample. We can use the following approximation of the cumulative distribution function (based on l’Hopital’s rule)

https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-15.gif?w=456

as https://i0.wp.com/freakonometrics.blog.free.fr/public/perso5/max-16.gif?w=456. Let https://i2.wp.com/freakonometrics.blog.free.fr/public/perso5/max-18.gif?w=456 and https://i2.wp.com/freakonometrics.blog.free.fr/public/perso5/max-19.gif?w=456. Then we can get

https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-20.gif?w=456

as https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-21.gif?w=456. I.e. the limiting distribution of the maximum of a Gaussian sample is Gumbel’s. But what we do not see here is that for a Gaussian sample, the convergence is extremely slow, i.e., with 100 observations, we are still far away from Gumbel distribution,

and it is only slightly better with 1,000 observations,

set.seed(1)
s=10000000
n=1000
M=matrix(rnorm(s,0,1),n,s/n)
V=apply(M,2,max)
(bn=qnorm(1-1/n,0,1))
an=1/bn
U=(V-bn)/an
hist(U,probability=TRUE,col="light green",
xlim=c(-2,7),ylim=c(0,.39),main="",breaks=seq(-5,15,by=.25))
u=seq(-5,15,by=.1)
v=dgumbel(u)
lines(u,v,lwd=3,col="red")

Even worst, consider lognormal observations. In that case, recall that if we consider (increasing) transformation of variates, we are in the same domain of attraction. Hence, since https://i0.wp.com/freakonometrics.blog.free.fr/public/perso5/max-22.gif?w=456, if

https://i2.wp.com/freakonometrics.blog.free.fr/public/perso5/max-23.gif?w=456

then

https://i0.wp.com/freakonometrics.blog.free.fr/public/perso5/max-24.gif?w=456

i.e. using Taylor’s approximation on the right term,

https://i1.wp.com/freakonometrics.blog.free.fr/public/perso5/max-25.gif?w=456

This gives us normalizing coefficients we should use here.

set.seed(1)
s=10000000
n=1000
M=matrix(rlnorm(s,0,1),n,s/n)
V=apply(M,2,max)
bn=exp(qnorm(1-1/n,0,1))
an=exp(qnorm(1-1/n,0,1))/(qnorm(1-1/n,0,1))
U=(V-bn)/an
hist(U,probability=TRUE,col="light green",
xlim=c(-2,7),ylim=c(0,.39),main="",breaks=seq(-5,40,by=.25))
u=seq(-5,15,by=.1)
v=dgumbel(u)
lines(u,v,lwd=3,col="red")

Credit: illustration is from Maurice Sendak’s popular book where the wild things are, translated in French as Max et les Maximonstres.

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics - Tag - R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , , , , , , , , , , , , , , , ,

Comments are closed.

Sponsors

Mango solutions



plotly webpage

dominolab webpage



Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training

datasociety

http://www.eoda.de





ODSC

ODSC

CRC R books series





Six Sigma Online Training









Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)