Weighted t-Test in R

April 12, 2012

(This article was first published on Nicebread » R, and kindly contributed to R-bloggers)

Although there is a weighted.mean function in R, so far I couldn’t find a implementation of weighted.var and weighted.t.test – here they are (the weighted variance is from Gavin Simpson, found on the R malining list):

?View Code RSPLUS
# weighted variance, inspired by a function from Gavin Simpson on R-Help
var.wt <- function(x, w, na.rm = FALSE) {
if (na.rm) {
w <- w[i <- !is.na(x)]
x <- x[i]
sum.w <- sum(w)
return((sum(w*x^2) * sum.w - sum(w*x)^2) / (sum.w^2 - sum(w^2)))
weighted.t.test <- function(x, w, mu, conf.level = 0.95, alternative="two.sided", na.rm=TRUE) {
if(!missing(conf.level) &
(length(conf.level) != 1 || !is.finite(conf.level) ||
conf.level < 0 || conf.level > 1))
stop("'conf.level' must be a single number between 0 and 1")
if (na.rm) {
w <- w[i <- !is.na(x)]
x <- x[i]
# to achieve consistent behavior in loops, return NA-structure in case of complete missings
if (sum(is.na(x)) == length(x)) return(list(estimate=NA, se=NA, conf.int=NA, statistic=NA, df=NA, p.value=NA))
# if only one value is present: this is the best estimate, no significance test provided
if (sum(!is.na(x)) == 1) {
warning("Warning weighted.t.test: only one value provided; this value is returned without test of significance!", call.=FALSE)
return(list(estimate=x[which(!is.na(x))], se=NA, conf.int=NA, statistic=NA, df=NA, p.value=NA))
x.w <- weighted.mean(x,w, na.rm=na.rm)
var.w <- var.wt(x,w, na.rm=na.rm)
df <- length(x)-1
t.value <- sqrt(length(x))*((x.w-mu)/sqrt(var.w))
se <- sqrt(var.w)/sqrt(length(x))
if (alternative == "less") {
pval <- pt(t.value, df)
cint <- c(-Inf, x.w + se*qt(conf.level, df) )
else if (alternative == "greater") {
pval <- pt(t.value, df, lower.tail = FALSE)
cint <- c(x.w - se * qt(conf.level, df), Inf)
else {
pval <- 2 * pt(-abs(t.value), df)
alpha <- 1 - conf.level
cint <- x.w + se*qt(1 - alpha/2, df)*c(-1,1)
names(t.value) <- "t"
return(list(estimate=x.w, se=se, conf.int=cint, statistic=t.value, df=df, p.value=pval))

To leave a comment for the author, please follow the link and comment on their blog: Nicebread » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: ,

Comments are closed.


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)