Numerical computation of quantiles

Posted on October 23, 2013 by thiagogm in R bloggers | 0 Comments

[This article was first published on Thiago G. Martins » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Recently I had to define a R function that would compute the ${q}$ -th quantile of a continuous random variable based on an user-defined density function. Since the main objective is to have a general function that computes the quantiles for any user-defined density function it needs be done numerically.

Problem statement

Suppose we are interested in numerically computing the ${q}$ -th quantile of a continuous random variable ${X}$ . Assume for now that ${X}$ is defined on ${\Re = (-\infty, \infty)}$ . So, the problem is to find ${x^*}$ such that ${x^* = \{x : F_X(x) = q\}}$ , where ${F_X(x)}$ is the cumulative distribution function of ${X}$ ,

$\displaystyle F_X(x) = \int _{-\infty} ^ {x} f_X(x) dx$

and ${f_X(x)}$ is its density function.

Optimization

This problem can be cast into an optimization problem,

$\displaystyle x^* = \underset{x}{\text{arg min}} (F_X(x) - q)^2$

That said, all we need is a numerical optimization routine to find ${x}$ that minimize ${(F_X(x) - q)^2}$ and possibly a numerical integration routine to compute ${F_X(x)}$ in case it is not available in closed form.

Below is one possible implementation of this strategy in R, using the integrate function for numerical integration and nlminb for the numerical optimization.

CDF <- function(x, dist){
  integrate(f=dist,
            lower=-Inf,
            upper=x)$value
}
objective <- function(x, quantile, dist){
  (CDF(x, dist) - quantile)^2
}
find_quantile <- function(dist, quantile){
  result = nlminb(start=0, objective=objective,
                  quantile = quantile,
                  dist = dist)$par
  return(result)
}

and we can test if everything is working as it should using the following
simple task of computing the 95th-quantile of the standard Gaussian distribution.

std_gaussian <- function(x){
  dnorm(x, mean = 0, sd = 1)
}
find_quantile(dist = std_gaussian,
              quantile = 0.95)
[1] 1.644854
qnorm(0.95)
[1] 1.644854

Random variables with bounded domain

If ${X}$ has a bounded domain, the optimization problem above becomes more tricky and unstable. To facilitate the numerical computations we can transform ${X}$ using a monotonic and well behaved function ${g}$ , so that ${Y = g(X)}$ is defined on the real line. Then we apply the numerical solution given above to compute the ${q}$ -th quantile of ${Y}$ , say ${y^*}$ , and then transform the quantile back to the original scale ${x^* = g^{-1}(y^*)}$ . Finally, ${x^*}$ is the ${q}$ -th quantile of ${X}$ that we were looking for.