(This article was first published on Econometrics by Simulation, and kindly contributed to R-bloggers)
Original Code# One of the basic tasks in item response theory is estimating the ability of a test taker from the responses to a series of items (questions).
# Let's draw the same pool of items that we have used on several previous posts:
# First let's imagine we have a pool of 100 3PL items.
set.seed(101)
npool = 500
pool = as.data.frame(cbind(item=1:npool, a=abs(rnorm(npool)*.25+1.1), b=rnorm(npool), c=abs(rnorm(npool)/7.5+.1)))
summary(pool)
# Drawing on code from a previous post we can calculate several useful functions:
# (http://www.econometricsbysimulation.com/2012/11/matching-item-information.html)
# Each item has a item characteristic curve (ICC) of:
PL3 = function(theta,a, b, c) c+(1-c)*exp(a*(theta-b))/(1+exp(a*(theta-b)))
# Let's imagine that we have a single test taker and that test taker has taken the first 15 items from the pool.
items.count = 15
items.taken = pool[1:items.count,]
# And that the person has a latent theta ability of 1
theta = 1.3
# Let's calculate the cut points for each of the items.
items.cut = PL3(theta, items.taken$a, items.taken$b, items.taken$c)
# We can see how the cut point works by graphing
plot(0,0, type="n", xlim=c(-3,3),ylim=c(0,1), xlab=~theta,
ylab="Probability of Correct Response", yaxs = "i", xaxs = "i" , main="Item Characteristics Curves and Ability Level")
for(i in 1:items.count) {
lines(seq(-3,3,.1),PL3(seq(-3,3,.1), items.taken$a[i], items.taken$b[i], items.taken$c[i]), lwd=2)
abline(h=items.cut[i], col="blue")
}
abline(v=theta,col="red", lwd=3)
# Now let's draw a uniform draw that we will use to calculate whether each item as passed.
rdraw = runif(items.count)
# Finally, we will calculate item responses
item.responses = 0
item.responses = items.cut > rdraw
###############################################
# Done with Simulation - Time for Estimation
# We want to use the information we know about the items (the item parameters and the responses) in order to estimate a best guess at the true ability of the test taker.
# First we must check if the person got all of the items either correct or incorrect.
sum(item.responses)
# If this is either a 0 or a number equal to the number of items then we cannot esimate an interior maximum without additional assumptions.
# We will attempt to recover our theta value using the r command optim
# First we need to specify the function to optimize over.
MLE = function(theta) sum(log((item.responses==T)*PL3(theta, items.taken$a, items.taken$b, items.taken$c) +
(item.responses==F)*(1-PL3(theta, items.taken$a, items.taken$b, items.taken$c))))
# The optimization function takes as its argument the choice variables to be optimized (theta).
# The way the above optimization works is that you specify the probility of each response piecewise.
# If the response is correct, then you count the CDF of theta up to that point as contributing to the probability of observing a correct outcome. If the response is negative, then you count it as contributing the the probability of a incorrect outcome. You then choose the theta that produces the greatest total probability.
MLEval = 0
theta.range = seq(-3,3,.1)
for(i in 1:length(theta.range)) MLEval[i] =MLE(theta.range[i])
plot(theta.range, MLEval, type="l", main="Maximim Likelihood function", xlab= ~theta, ylab="Sum of Log Likelihood")
abline(v=theta, col="blue")
# We can visually see that the maximum of the slope will not be at the true value though it will be close.
optim(0,MLE, method="Brent", lower=-6, upper=6, control=list(fnscale = -1))
abline(v=optim(0,MLE, method="Brent", lower=-6, upper=6, control=list(fnscale = -1))$par, col="red")
# We can use R's optim function to find the ideal theta that would maximize the information from this test.
# Item information is:
PL3.info = function(theta, a, b, c) a^2 *(PL3(theta,a,b,c)-c)^2/(1-c)^2 * (1-PL3(theta,a,b,c))/PL3(theta,a,b,c)
# Notice, this is not the best way of defining the test information function since the items are not arguments.
test.info = function(theta) sum(PL3.info(theta, items.taken$a, items.taken$b, items.taken$c))
# Construct a vector to hold the test information
info = 0
for(i in 1:length(theta.range)) info[i]=test.info(theta.range[i])
plot(theta.range, info, type="l", main="Information Peaks Slightly Above 0", xlab= ~theta, ylab="Information")
abline(v=theta, col="blue")
# But we want to know about the test taker at theta
optim(0,test.info, method="Brent", lower=-6, upper=6, control=list(fnscale = -1))
# The person this test would be best suited to evaluate would have an ability rating of .19
abline(v=optim(0,test.info, method="Brent", lower=-6, upper=6, control=list(fnscale = -1))$par, col="red")
To leave a comment for the author, please follow the link and comment on his blog: Econometrics by Simulation.
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series,ecdf, trading) and more...




Zero Inflated Models and Generalized Linear Mixed Models with R.
Zuur, Saveliev, Ieno (2012).