Characterizing Sources of Uncertainty in Item Response Theory Scale Scores
In computerized adaptive testing (CAT), item parameters are treated as known true values and person estimates are computed under that assumption; in reality, however, the item parameters are themselves estimates obtained from calibration. Person estimates and their standard errors can therefore be misestimated, because the uncertainty carried over from the earlier calibration of the item parameters is ignored. In this paper, the authors adopted a multiple imputation procedure to correct the standard errors; however, the results in Table 4 show only a slight improvement.
Moreover, I did not understand the authors' argument in the last paragraph on page 285 about Table 5: “as larger calibration samples yield more precise estimates of the item parameters.” This seems to mean that the more precise the item parameter estimates are, the smaller the value would be. Why does the average relative increase in variance (r) decrease as the size of the calibration sample increases?
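To make the quantity in question concrete, here is a minimal sketch of how multiple-imputation results are usually pooled, assuming the paper's r follows the standard Rubin combining rules (this should be checked against the paper's own equations). The function name and all numbers are hypothetical and are not taken from Table 4 or Table 5.

```python
from statistics import mean, variance


def pool_imputations(estimates, variances):
    """Pool M person estimates, one per imputed set of item parameters.

    Assumes Rubin's combining rules; names and inputs are illustrative only.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average squared SE given fixed item parameters
    between = variance(estimates)   # sample variance across imputations (M - 1 denominator)
    extra = (1 + 1 / m) * between   # variance added by item parameter uncertainty
    total_se = (within + extra) ** 0.5
    r = extra / within              # relative increase in variance
    return pooled_estimate, total_se, r


# Hypothetical draws: same within-imputation variance, different spread
# of the person estimate across imputed item parameter sets.
noisy_items = pool_imputations([0.50, 0.54, 0.46, 0.52, 0.48], [0.09] * 5)
precise_items = pool_imputations([0.50, 0.52, 0.48, 0.51, 0.49], [0.09] * 5)
print(noisy_items)
print(precise_items)
```

Under this definition, r depends on the between-imputation variance relative to the within-imputation variance. If larger calibration samples make the imputed item parameter sets more alike, the between-imputation variance shrinks and r shrinks with it, which may be what the authors intend; the sketch illustrates the formula only and does not reproduce their results.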
Indeed, if the standard error is misestimated and a fixed-standard-error termination rule is used in CAT, the person measure and its standard error would both be misestimated. However, I do not think that just three items are appropriate if we consider adopting such a procedure in a CAT context.