Data detective work: work out the numerator or denominator given a percentage

[This article was first published on Robert Grant's stats blog » R, and kindly contributed to R-bloggers.]

Here’s some fun I had today. If you are looking at some published stats and they tell you a percentage but not the numerator and denominator, you can still work them out. That’s to say, you can get your computer to grind through a lot of possible combinations and find which ones are compatible with the percentage. Usually you have some information about the range in which the numerator or denominator could lie. For example, I was looking at a paper which followed 63 people who had seen a nurse practitioner when they attended hospital, and the paper told me that 18.3% of those who responded had sought further healthcare. But not everyone had answered the question; we weren’t told how many did, but obviously it could not have been more than 63. It didn’t take long to knock an R function together to find the compatible numerators given a range of possible denominators and the percentage (expressed as a proportion, so 18.3% is 0.183), and later I did the opposite. Here they are:

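# deducing numerator from percentage and range of possible denominators
# denoms: candidate denominators; target: the published proportion (e.g. 0.183);
# dp: number of decimal places the proportion was rounded to

whatnum<-function(denoms,target,dp) {
	nums<-rep(NA,length(denoms))
	for (i in seq_along(denoms)) {
		d<-denoms[i]
		# the candidates tried are the integers either side of target*d
		lo<-floor(target*d)
		hi<-ceiling(target*d)
		if(round(lo/d, digits=dp)==target) {
			nums[i]<-lo
			# warn only if a second, different numerator also fits
			if(hi>lo && round(hi/d, digits=dp)==target) {
				warning(paste("More than one numerator is compatible with denominator ",d,"; minima are returned",sep=""))
			}
		}
		else if(round(hi/d, digits=dp)==target) nums[i]<-hi
	}
	# keep only the denominators for which a compatible numerator was found
	res<-cbind(nums[!is.na(nums)],denoms[!is.na(nums)])
	res<-cbind(res,round(res[,1]/res[,2],digits=dp))
	colnames(res)<-c("numerator","denominator","proportion")
	return(res)
}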
   
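# and the opposite: deducing denominator from percentage and range of possible numerators
whatdenom<-function(nums,target,dp) {
	denoms<-rep(NA,length(nums))
	for (i in seq_along(nums)) {
		n<-nums[i]
		# the candidates tried are the integers either side of n/target
		lo<-floor(n/target)
		hi<-ceiling(n/target)
		# lo>0 and hi>0 guard against dividing by zero (e.g. when n is 0)
		if(lo>0 && round(n/lo, digits=dp)==target) {
			denoms[i]<-lo
			# warn only if a second, different denominator also fits
			if(hi>lo && round(n/hi, digits=dp)==target) {
				warning(paste("More than one denominator is compatible with numerator ",n,"; minima are returned",sep=""))
			}
		}
		else if(hi>0 && round(n/hi, digits=dp)==target) denoms[i]<-hi
	}
	# keep only the numerators for which a compatible denominator was found
	res<-cbind(nums[!is.na(denoms)],denoms[!is.na(denoms)])
	res<-cbind(res,round(res[,1]/res[,2],digits=dp))
	colnames(res)<-c("numerator","denominator","proportion")
	return(res)
}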
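
The warning in whatdenom hints that one numerator can fit several denominators, and only the smallest is returned. As a rough sketch of how you might list all of them, the helper below (all_denoms is a made-up name, not part of the functions above) scans every denominator in the interval implied by the rounding. It assumes target is larger than half a unit in the last decimal place.

```r
# all_denoms: hypothetical helper listing every denominator d for which
# round(n / d, dp) equals the target, not just the smallest one.
# Assumes target > 0.5 * 10^(-dp).
all_denoms <- function(n, target, dp) {
  half <- 0.5 * 10^(-dp)                    # half-width of the rounding interval
  lo <- max(1, floor(n / (target + half)))  # smallest denominator worth trying
  hi <- ceiling(n / (target - half))        # largest denominator worth trying
  d <- lo:hi
  # compare with a small tolerance instead of == to sidestep floating-point surprises
  d[abs(round(n / d, digits = dp) - target) < 10^(-(dp + 2))]
}

all_denoms(n = 11, target = 0.183, dp = 3)   # a small numerator pins down one denominator
all_denoms(n = 183, target = 0.183, dp = 3)  # a large numerator leaves several
```

With a numerator of 11 only 60 fits, but with 183 every denominator from 998 to 1002 does, so the larger the counts, the less the percentage alone tells you.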

By typing
whatnum(denoms=(30:63),target=0.183,dp=3)
I could find straight away that the only possibility was 11/60.
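
If you want to double-check a result like that, a brute-force search over every numerator/denominator pair is easy to sketch. This is only an illustration: compatible_pairs is a hypothetical name, and the tolerance-based comparison is my assumption rather than anything from the functions above.

```r
# Brute-force cross-check: try every numerator for every candidate denominator.
# compatible_pairs is a hypothetical helper, not part of the original functions.
compatible_pairs <- function(denoms, target, dp) {
  grid <- expand.grid(numerator = 0:max(denoms), denominator = denoms)
  grid <- grid[grid$numerator <= grid$denominator, ]  # a proportion cannot exceed 1
  grid$proportion <- round(grid$numerator / grid$denominator, digits = dp)
  # compare with a small tolerance instead of == to sidestep floating-point surprises
  hits <- grid[abs(grid$proportion - target) < 10^(-(dp + 2)), ]
  rownames(hits) <- NULL
  hits
}

compatible_pairs(denoms = 30:63, target = 0.183, dp = 3)
```

For this range it returns a single row, 11 out of 60, which agrees with whatnum.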
That particular paper also had a typo in table 4 ("995.3%"), which meant the intended figure could have been 99.5% or 99.3% or 95.3%. I could run each of those through and establish that it could only possibly have been 95.3%. Handy for those pesky papers that you want to stick in a meta-analysis but are missing the raw numbers!
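
Here is a sketch of how a check like that might look. It assumes the denominator behind the table 4 figure could be anything from 1 to 63 (the size of the study); the actual range used is not stated here, and is_reachable is a made-up helper name.

```r
# is_reachable: hypothetical helper, TRUE if some n/d (with d in denoms and
# 0 <= n <= d) rounds to the target proportion at dp decimal places.
is_reachable <- function(target, denoms, dp) {
  any(sapply(denoms, function(d) {
    any(abs(round((0:d) / d, digits = dp) - target) < 10^(-(dp + 2)))
  }))
}

# The three readings of the "995.3%" typo, as proportions
candidates <- c(0.995, 0.993, 0.953)

# ASSUMPTION: denominators between 1 and 63
reachable <- sapply(candidates, is_reachable, denoms = 1:63, dp = 3)
print(data.frame(candidates, reachable))
```

Under that assumption only 0.953 comes back TRUE (41/43 rounds to 0.953); no fraction with a denominator of 63 or less rounds to 0.995 or 0.993.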

