Weighted t-Test in R

April 12, 2012
By

(This article was first published on Nicebread » R, and kindly contributed to R-bloggers)

Although there is a weighted.mean function in R, so far I couldn’t find a implementation of weighted.var and weighted.t.test – here they are (the weighted variance is from Gavin Simpson, found on the R malining list):

?View Code RSPLUS
# weighted variance, inspired by a function from Gavin Simpson on R-Help
var.wt <- function(x, w, na.rm = FALSE) {
if (na.rm) {
w <- w[i <- !is.na(x)]
x <- x[i]
}
sum.w <- sum(w)
return((sum(w*x^2) * sum.w - sum(w*x)^2) / (sum.w^2 - sum(w^2)))
}
 
weighted.t.test <- function(x, w, mu, conf.level = 0.95, alternative="two.sided", na.rm=TRUE) {
 
if(!missing(conf.level) &
(length(conf.level) != 1 || !is.finite(conf.level) ||
conf.level < 0 || conf.level > 1))
stop("'conf.level' must be a single number between 0 and 1")
 
if (na.rm) {
w <- w[i <- !is.na(x)]
x <- x[i]
}
 
# to achieve consistent behavior in loops, return NA-structure in case of complete missings
if (sum(is.na(x)) == length(x)) return(list(estimate=NA, se=NA, conf.int=NA, statistic=NA, df=NA, p.value=NA))
 
# if only one value is present: this is the best estimate, no significance test provided
if (sum(!is.na(x)) == 1) {
warning("Warning weighted.t.test: only one value provided; this value is returned without test of significance!", call.=FALSE)
return(list(estimate=x[which(!is.na(x))], se=NA, conf.int=NA, statistic=NA, df=NA, p.value=NA))
}
 
x.w <- weighted.mean(x,w, na.rm=na.rm)
var.w <- var.wt(x,w, na.rm=na.rm)
df <- length(x)-1
t.value <- sqrt(length(x))*((x.w-mu)/sqrt(var.w))
se <- sqrt(var.w)/sqrt(length(x))
 
if (alternative == "less") {
pval <- pt(t.value, df)
cint <- c(-Inf, x.w + se*qt(conf.level, df) )
}
else if (alternative == "greater") {
pval <- pt(t.value, df, lower.tail = FALSE)
cint <- c(x.w - se * qt(conf.level, df), Inf)
}
else {
pval <- 2 * pt(-abs(t.value), df)
alpha <- 1 - conf.level
cint <- x.w + se*qt(1 - alpha/2, df)*c(-1,1)
}
 
names(t.value) <- "t"
return(list(estimate=x.w, se=se, conf.int=cint, statistic=t.value, df=df, p.value=pval))
}

To leave a comment for the author, please follow the link and comment on his blog: Nicebread » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: ,

Comments are closed.