R and Tolerance Intervals

April 19, 2010

(This article was first published on Software for Exploratory Data Analysis and Statistical Modelling, and kindly contributed to R-bloggers)

Confidence intervals and prediction intervals are used by statisticians on a regular basis. Another useful interval is the tolerance interval, which gives a range expected to contain at least a stated proportion of a population, with a stated level of confidence. The R package tolerance can be used to create a variety of tolerance intervals.

The endpoints of the interval, called tolerance limits, are the values within which the stated proportion of the population is expected to lie. The function normtol.int from the tolerance package calculates a tolerance interval for data from a normal distribution.

The data are supplied as a vector, x. The confidence level associated with the tolerance interval is set through alpha, which is one minus the confidence level, so alpha is 0.05 for 95% confidence. The argument P is the proportion of the population to be covered by the tolerance interval. The side argument determines whether a one-sided (side = 1) or two-sided (side = 2) interval is computed.

Consider a simulated set of data from a manufacturing process, stored in R as the vector obs:

obs = c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51, 97.93,
  96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92, 97.43, 102.76, 100.03,
  107.12, 104.96, 105.32, 87.06, 97.89, 100.23)

With the package loaded via library(tolerance), a 95% tolerance interval covering 90% of data of this type, based on the 25 observations above, is created with this code:

> normtol.int(x = obs, alpha = 0.05, P = 0.90, side = 2)
  alpha   P    x.bar 2-sided.lower 2-sided.upper
1  0.05 0.9 100.5192      90.07606      110.9623

The values of alpha and P are as described above. The output reports the mean of the data along with the lower and upper tolerance limits, both of which appear because we asked for a two-sided interval. The call is easily changed to cover 95% rather than 90% of the data:

> normtol.int(x = obs, alpha = 0.05, P = 0.95, side = 2)
  alpha    P    x.bar 2-sided.lower 2-sided.upper
1  0.05 0.95 100.5192      88.07543      112.9630
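
For normal data, limits like those above have the form mean ± k × standard deviation, where the factor k depends on the sample size, alpha and P. The following is a minimal base R sketch of that calculation using Howe's approximation for k. The choice of Howe's approximation is an assumption, since the post does not say which method normtol.int uses, and the names z, chi2, k and limits are illustrative only. The result should land close to the limits printed above but need not match them to every digit.

```r
# Data from the article
obs <- c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51, 97.93,
  96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92, 97.43, 102.76, 100.03,
  107.12, 104.96, 105.32, 87.06, 97.89, 100.23)

n     <- length(obs)
alpha <- 0.05   # 1 - confidence level
P     <- 0.90   # proportion of the population to cover

# Howe's approximate two-sided tolerance factor (assumed method,
# not necessarily the one used inside normtol.int)
z    <- qnorm((1 + P) / 2)           # normal quantile for the central proportion P
chi2 <- qchisq(alpha, df = n - 1)    # lower alpha quantile of chi-square
k    <- z * sqrt((n - 1) * (1 + 1 / n) / chi2)

# Two-sided limits: mean minus/plus k standard deviations
limits <- mean(obs) + c(-1, 1) * k * sd(obs)
print(k)
print(limits)
```

Raising P increases z and lowering alpha decreases chi2, so either change increases k and widens the interval, which is consistent with the wider limits obtained for P = 0.95.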

The tolerance package can also create intervals for data from other distributions.
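
When no distribution can be assumed at all, one classical distribution-free option is to take the sample minimum and maximum as the tolerance limits. The sketch below, in base R, computes the confidence with which that interval covers a proportion P of any continuous population. It is offered as an illustration of the idea, not as a description of how the tolerance package implements its own nonparametric intervals, and the names np_limits, np_conf and n_needed are hypothetical.

```r
# Data from the article
obs <- c(102.17, 102.45, 106.23, 98.16, 100.82, 101.40, 90.51, 102.51, 97.93,
  96.98, 101.74, 104.34, 103.50, 94.72, 102.80, 103.92, 97.43, 102.76, 100.03,
  107.12, 104.96, 105.32, 87.06, 97.89, 100.23)

n <- length(obs)
P <- 0.90

# Distribution-free limits: the smallest and largest observations
np_limits <- range(obs)

# The coverage of (min, max) follows a Beta(n - 1, 2) distribution, so the
# confidence that at least a proportion P is covered is the upper tail at P
np_conf <- 1 - pbeta(P, n - 1, 2)

# Smallest sample size for which (min, max) reaches 95% confidence
candidates <- 2:200
conf_by_n  <- 1 - pbeta(P, candidates - 1, 2)
n_needed   <- candidates[which(conf_by_n >= 0.95)[1]]

print(np_limits)
print(np_conf)
print(n_needed)
```

With only 25 observations the confidence attached to the (min, max) interval falls well short of 95%, which shows the price paid for dropping the normality assumption.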
