Data detective work: work out the numerator or denominator given a percentage

April 7, 2014

(This article was first published on Robert Grant's stats blog » R, and kindly contributed to R-bloggers)

Here’s some fun I had today. If you are looking at some published stats and they give you a percentage but not the numerator and denominator, you can still work them out. That is to say, you can get your computer to grind through a lot of possible combinations and find which ones are compatible with the percentage. Usually you have some information about the range in which the numerator or denominator could lie.

For example, I was looking at a paper which followed 63 people who had seen a nurse practitioner when they attended hospital, and the paper told me that 18.3% of those who responded had sought further healthcare. But not everyone had answered the question; we weren’t told how many did, but obviously it was at most 63. It didn’t take long to knock together an R function to find the compatible numerators given a range of possible denominators and the percentage, and later I did the opposite. Here they are:

# deducing numerator from percentage and range of possible denominators
# target is the reported proportion (e.g. 0.183), dp its number of decimal places
# only the two integers either side of target*d are tried for each denominator
whatnum<-function(denoms,target,dp) {
	nums<-rep(NA,length(denoms))
	for (i in seq_along(denoms)) {
		d<-denoms[i]
		lo<-floor(target*d)
		hi<-ceiling(target*d)
		if(round(lo/d, digits=dp)==target) {
			nums[i]<-lo
			# lo equals hi when target*d is a whole number, so only warn if they differ
			if(hi!=lo && round(hi/d, digits=dp)==target) {
				warning(paste("More than one numerator is compatible with denominator ",d,"; minima are returned",sep=""))
			}
		}
		else if(round(hi/d, digits=dp)==target) nums[i]<-hi
	}
	# keep only the denominators for which a numerator was found
	res<-cbind(nums[!is.na(nums)],denoms[!is.na(nums)])
	res<-cbind(res,round(res[,1]/res[,2],digits=dp))
	colnames(res)<-c("numerator","denominator","proportion")
	return(res)
}
   
# and the opposite: deducing denominator from percentage and range of possible numerators
whatdenom<-function(nums,target,dp) {
	denoms<-rep(NA,length(nums))
	for (i in seq_along(nums)) {
		n<-nums[i]
		# a zero numerator would give 0/0 below, so skip it
		if(n==0) next
		lo<-floor(n/target)
		hi<-ceiling(n/target)
		if(round(n/lo, digits=dp)==target) {
			denoms[i]<-lo
			# lo equals hi when n/target is a whole number, so only warn if they differ
			if(hi!=lo && round(n/hi, digits=dp)==target) {
				warning(paste("More than one denominator is compatible with numerator ",n,"; minima are returned",sep=""))
			}
		}
		else if(round(n/hi, digits=dp)==target) denoms[i]<-hi
	}
	# keep only the numerators for which a denominator was found
	res<-cbind(nums[!is.na(denoms)],denoms[!is.na(denoms)])
	res<-cbind(res,round(res[,1]/res[,2],digits=dp))
	colnames(res)<-c("numerator","denominator","proportion")
	return(res)
}
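
Both functions above try only the two integers either side of the exact value, which is enough when the percentage is reported precisely relative to the sample size. With a coarser percentage or a larger sample, several numerators can fit one denominator. As a rough sketch of an exhaustive alternative (the name all_compatible is made up for illustration and is not from the original functions), one could test every numerator from 0 to d, comparing on an integer scale to sidestep floating-point equality. It assumes positive denominators and R's own rounding of exact ties:

```r
# Hypothetical helper, not part of the original post: list every
# numerator/denominator pair whose proportion rounds to `target`.
# Assumes all denominators are positive whole numbers.
all_compatible <- function(denoms, target, dp) {
  scale <- 10^dp
  wanted <- round(target * scale)            # target expressed in units of 10^-dp
  hits <- lapply(denoms, function(d) {
    n <- 0:d                                 # every possible numerator for this d
    ok <- which(round(n / d * scale) == wanted)
    if (length(ok) == 0) return(NULL)        # nothing fits this denominator
    cbind(numerator = n[ok], denominator = d)
  })
  res <- do.call(rbind, hits)
  if (is.null(res)) {                        # no pair fits at all: empty table
    res <- matrix(numeric(0), ncol = 2,
                  dimnames = list(NULL, c("numerator", "denominator")))
  }
  cbind(res, proportion = round(res[, "numerator"] / res[, "denominator"], dp))
}

# The example from the paper: a single compatible pair, 11/60
all_compatible(30:63, target = 0.183, dp = 3)

# A coarser percentage with a larger sample: several numerators fit d = 1000
all_compatible(1000, target = 0.18, dp = 2)
```

The trade-off is speed: this does d + 1 checks per denominator instead of two, which is fine for samples of this size.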

By typing
whatnum(denoms=(30:63),target=0.183,dp=3)
I could find straight away that the only possibility was 11/60.
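
That answer can also be checked by hand. As a sketch of the arithmetic, assuming the paper rounded to one decimal place of a percentage in the usual way (exact ties ignored), a numerator n and denominator d are compatible with 18.3% when:

```latex
\[
0.1825 \le \frac{n}{d} < 0.1835
\quad\Longleftrightarrow\quad
0.1825\,d \le n < 0.1835\,d .
\]
\[
d = 60:\quad 10.95 \le n < 11.01 \;\Rightarrow\; n = 11,
\qquad \frac{11}{60} = 0.18333\ldots \approx 0.183 .
\]
```

The interval for n has width 0.001d, which is at most 0.063 for d up to 63, so each denominator in this range can match at most one whole-number numerator.
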
That particular paper also had a typo in table 4 ("995.3%") which meant it could be 99.5% or 99.3% or 95.3%. I could run each of those through and establish that it could only possibly have been 95.3%. Handy for those pesky papers that you want to stick in a meta-analysis but are missing the raw numbers!
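
Checking the three readings of the typo can be sketched along these lines. The helper is_reachable is a made-up name for illustration, and the denominator range of 1 to 63 is an assumption: the post does not say which denominators applied to table 4.

```r
# Hypothetical helper, not from the original post: TRUE if some n/d with
# d in `denoms` rounds to `target` at `dp` decimal places.
is_reachable <- function(target, denoms, dp) {
  scale <- 10^dp
  wanted <- round(target * scale)   # compare on an integer scale
  any(vapply(denoms,
             function(d) any(round((0:d) / d * scale) == wanted),
             logical(1)))
}

# The three possible readings of "995.3%"
candidates <- c("99.5%" = 0.995, "99.3%" = 0.993, "95.3%" = 0.953)

# Assumption: the unknown denominator lies between 1 and 63
sapply(candidates, is_reachable, denoms = 1:63, dp = 3)
# under this assumption only the 95.3% reading comes back TRUE
```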

