Bias of Hill Estimators

Posted on January 28, 2014 by arthur charpentier in R bloggers | 0 Comments

[This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In the MAT8595 course, we’ve seen yesterday Hill estimator of the tail index. To be more specific, we did see see that if $http://latex.codecogs.com/gif.latex?\overline{F}(x)=C%20x^{-\alpha}$ , with $http://latex.codecogs.com/gif.latex?\alpha%3E0$ , then Hill estimators for $http://latex.codecogs.com/gif.latex?\alpha$ are given by

$http://latex.codecogs.com/gif.latex?\widehat{\alpha}_k%20=%20\left[\frac{1}{k}\sum_{i=0}^{k-1}%20\log%20X_{n-i,n}%20-\log%20X_{n-k,n}\right]^{-1}$
for $http://latex.codecogs.com/gif.latex?k\in\{1,2,\cdots,n\}$ . Then we did say that $http://latex.codecogs.com/gif.latex?\widehat{\alpha}_k$ satisfies some consistency in the sense that $http://latex.codecogs.com/gif.latex?\widehat{\alpha}_k%20\overset{\mathbb{P}}{\rightarrow}%20\alpha$ if $http://latex.codecogs.com/gif.latex?k\rightarrow\infty$ , but not too fast, i.e. $http://latex.codecogs.com/gif.latex?k/n\rightarrow0$ (under additional assumptions on the rate of convergence, it is possible to prove that $http://latex.codecogs.com/gif.latex?\widehat{\alpha}_k%20\overset{a.s.}{\rightarrow}%20\alpha$ ). Further, under additional technical conditions

$http://latex.codecogs.com/gif.latex?\sqrt{k}\left(\widehat{\alpha}_k-\alpha\right)\overset{\mathcal%20L}{\rightarrow}\mathcal{N}(0,\alpha^2)$

In order to illustrate this point, consider the following code. First, let us consider a Pareto survival function, and the associated quantile function

> alpha=1.5
> S=function(x){ifelse(x>1,x^(-alpha),1)}
> Q=function(p){uniroot(function(x) S(x)-(1-p),lower=1,upper=1e+9)$root}

The code here is obviously too complicated, since this power function can easily be inverted. But later on, we will consider a more complex survival function. Here are the survival function, and the quantile function,

> u=seq(0,5,by=.01)
> plot(u,Vectorize(S)(u),type="l",col="red")
> u=seq(0,99/100,by=.01)
> plot(u,Vectorize(Q)(u),type="l",col="blue",ylim=c(0,20))

Here, we need the quantile function to generate a random sample from this distribution,

> n=500
> set.seed(1)
> X=Vectorize(Q)(runif(n))

Hill plot is here

> library(evir)
> hill(X)
> abline(h=alpha,col="blue")

We can now generate thousands of random samples, and see how those estimators behave (for some specific $http://latex.codecogs.com/gif.latex?k$ ‘s).

> ns=10000
> HillK=matrix(NA,ns,10)
> for(s in 1:ns){
+ X=Vectorize(Q)(runif(n))
+ H=hill(X,plot=FALSE)
+ hillk=function(k) H$y[H$x==k]
+ HillK[s,]=Vectorize(hillk)(15*(1:10))
+ }

and if we compute the average,

> plot(15*(1:10),apply(HillK,2,mean)

we do get a series of estimators that can be considered as unbiased.

So far, so good. Now, recall that being in the max-domain of attraction of the Fréchet distribution does not mean that $http://latex.codecogs.com/gif.latex?\overline{F}(x)=C%20x^{-\alpha}$ , with $http://latex.codecogs.com/gif.latex?\alpha%3E0$ , but is means that

$http://latex.codecogs.com/gif.latex?\overline{F}(x)=%20x^{-\alpha}%20\mathcal{L}(x)$

for some slowly varying function $http://latex.codecogs.com/gif.latex?\mathcal{L}$ , not necessarily constant! In order to understand what could happen, we have to be slightly more specific. And this can be done only by looking at second order regular variation property of the survival function. Assume, here that there is some auxilary function $http://latex.codecogs.com/gif.latex?a$ such that

$http://latex.codecogs.com/gif.latex?\lim_{t\rightarrow\infty}\frac{\overline{F}(xt)/\overline{F}(t)-x^{-\alpha}}{a(t)}=x^{-\alpha}\frac{1-x^{-\beta}}{\beta}{}$

This (positive) constant $http://latex.codecogs.com/gif.latex?\beta$ is – somehow – related to the speed of convergence of the ratio of the survival functions to the power function (see e.g. Geluk et al. (2000) for some examples).

To be more specific, assume that

$http://latex.codecogs.com/gif.latex?\overline{F}(x)=\underbrace{C(1+x^{-\beta})}_{\mathcal{L}(x)}\cdot%20%20x^{-\alpha}$

then, the second order regular variation property is obtained using $http://latex.codecogs.com/gif.latex?a(t)=\beta%20t^{-\beta}$ , and then, if $http://latex.codecogs.com/gif.latex?k$ goes to infinity too fast, then the estimator will be biased. More precisely (see Chapter 6 in Embrechts et al. (1997)), if $http://latex.codecogs.com/gif.latex?k=O(n^{2\beta/(\alpha+2\beta)})$ , then, for some $http://latex.codecogs.com/gif.latex?\lambda%3E0$ ,

$http://latex.codecogs.com/gif.latex?\sqrt{k}\left(\widehat{\alpha}_k-\alpha\right)\overset{\mathcal%20L}{\rightarrow}\mathcal{N}\left(\frac{\alpha^3}{\beta-\alpha}\lambda,\alpha^2\right)$

The intuitive interpretation of this result is that if $http://latex.codecogs.com/gif.latex?k$ is too large, and if the underlying distribution is not exactly a Pareto distribution (and we do have this second order property), then Hill’s estimator is biased. This is what we mean when we say

if $http://latex.codecogs.com/gif.latex?k$ is too large, $http://latex.codecogs.com/gif.latex?\widehat{\alpha}_k$ is a biased estimator
if $http://latex.codecogs.com/gif.latex?k$ is too small, $http://latex.codecogs.com/gif.latex?\widehat{\alpha}_k$ is a volatile estimator

(the later comes from properties of a sample mean: the more observations, the less the volatility of the mean).

Let us run some simulations to get a better understanding of what’s going on. Using the previous code, it is actually extremly simple to generate a random sample with survival function

$http://latex.codecogs.com/gif.latex?\overline{F}(x)=\underbrace{C(1+x^{-\beta})}_{\mathcal{L}(x)}\cdot%20%20x^{-\alpha}$

> beta=.5
> S=function(x){+ ifelse(x>1,.5*x^(-alpha)*(1+x^(-beta)),1) }
> Q=function(p){uniroot(function(x) S(x)-(1-p),lower=1,upper=1e+9)$root}

If we use the code above. Here, with

> n=500
> set.seed(1)
> X=Vectorize(Q)(runif(n))

the Hill plot becomes

> library(evir)
> hill(X)
> abline(h=alpha,col="blue")

But it’s based on one sample, only. Again, consider thousands of samples, and let us see how Hill’s estimator is behaving,

so that the (empirical) mean of those estimator is

To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics » R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

Bias of Hill Estimators

Related

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)