09 IRT for forced-choice items (presented by Chenwei)

NIcky's

by LI XIAOMIN

This study introduced a multidimensional IRT model based on Thurstone’s framework. The Thurstonian IRT model is suitable for data from any forced-choice (FC) questionnaire composed of items that fit a dominance response model. Forced-choice items overcome some shortcomings of traditional rating scales: they can eliminate uniform biases such as acquiescent responding, and they can increase operational validity by reducing halo effects.

The Thurstonian factor model is a second-order factor model originally used for analyzing comparative data. The model assumes that a) each item elicits a utility as the result of a discriminal process, b) respondents choose the item with the largest utility at the moment of comparison, and c) the utility is an unobserved variable that is normally distributed in the population of respondents.
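The three assumptions can be sketched in a small simulation. This is only a minimal illustration, not the paper's code: the function name, the parameter names, and the choice of two uncorrelated standard-normal traits (one per item) are all assumptions made for the example.

```python
import numpy as np

def simulate_pair_outcomes(n_respondents, means, loadings, uniq_sd, seed=0):
    """Simulate utilities and the binary outcome for one item pair (i, k).

    Hypothetical helper: item i measures trait 1, item k measures trait 2.
    """
    rng = np.random.default_rng(seed)
    # Assumption: two uncorrelated standard-normal traits, one per item.
    traits = rng.standard_normal((n_respondents, 2))
    # Unique (error) part of each utility, normally distributed.
    errors = rng.standard_normal((n_respondents, 2)) * np.asarray(uniq_sd)
    # Utility of each item: mean + loading * trait + unique error.
    utilities = np.asarray(means) + np.asarray(loadings) * traits + errors
    # The respondent picks the item with the larger utility:
    # y = 1 if item i is preferred over item k, else 0.
    y = (utilities[:, 0] >= utilities[:, 1]).astype(int)
    return utilities, y

utilities, y = simulate_pair_outcomes(
    n_respondents=1000, means=[0.5, 0.0], loadings=[1.0, 0.8], uniq_sd=[0.6, 0.6]
)
print(y.mean())  # share of respondents who prefer item i
```

Only y is observed in an FC questionnaire; the utilities stay latent, which is why the model has to work from the binary outcomes.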

The Thurstonian IRT model is a first-order reparameterization of the Thurstonian factor model. It contains no latent utilities; the traits are linked directly to the latent response variables underlying the binary outcomes. The residual error variances of the latent response variables can be estimated, which also makes it possible to estimate the latent trait scores; this is impossible with the Thurstonian factor model. The reparameterized model is also faster to estimate.
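For reference, the response function for one binary outcome can be written as below. This is a sketch in the notation commonly used for the Thurstonian IRT model, and none of the symbols are defined in the summary itself: y_l is the binary outcome for the pair of items i and k, which measure traits eta_a and eta_b; gamma_l is a threshold; lambda_i and lambda_k are factor loadings; psi_i^2 and psi_k^2 are the unique variances of the two items; and Phi is the standard normal distribution function.

```latex
P(y_l = 1 \mid \eta_a, \eta_b)
  = \Phi\!\left(
      \frac{-\gamma_l + \lambda_i \eta_a - \lambda_k \eta_b}
           {\sqrt{\psi_i^2 + \psi_k^2}}
    \right)
```

The traits enter the function directly, with no utility term in between, which is what "first-order" means here.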

Simulation 1 was designed to illustrate the performance of the Thurstonian IRT model with the simplest data structure. The results showed that when the questionnaire included both positively and negatively keyed items, as few as 12 item pairs could provide accurate estimates and acceptable model-data fit. More item pairs should be added if higher precision is required. Keying all items in the same direction is not recommended for FC designs, because such a design provides information only on the differences between traits, so the absolute trait scores cannot be located.
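The point about keying direction can be checked numerically. The sketch below assumes the usual normal-ogive response function for a pair of items and, as a simplification, equal loadings of 1 and unit unique variances. The function and parameter names are hypothetical and do not come from the paper.

```python
from math import sqrt
from statistics import NormalDist

def prob_prefer_first(eta_a, eta_b, load_i, load_k,
                      threshold=0.0, uniq_var_i=1.0, uniq_var_k=1.0):
    """Probability of preferring item i (trait a) over item k (trait b)."""
    z = (-threshold + load_i * eta_a - load_k * eta_b) / sqrt(uniq_var_i + uniq_var_k)
    return NormalDist().cdf(z)

# Two respondents with the same trait difference (zero) but different levels.
low = (0.0, 0.0)
high = (1.0, 1.0)

# Both items positively keyed with equal loadings: the two respondents
# get the same probability, so only the trait difference matters.
same_low = prob_prefer_first(*low, load_i=1.0, load_k=1.0)
same_high = prob_prefer_first(*high, load_i=1.0, load_k=1.0)

# Item k negatively keyed: the probability now changes with the overall level.
mixed_low = prob_prefer_first(*low, load_i=1.0, load_k=-1.0)
mixed_high = prob_prefer_first(*high, load_i=1.0, load_k=-1.0)

print(same_low == same_high, mixed_low < mixed_high)
```

Equal loadings are the extreme case, chosen to make the effect obvious. With unequal loadings the same-direction design is not completely uninformative about the absolute level, only weakly so.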

Simulation 2 investigated the effect of block size for a more complicated data structure (5 traits). The results showed that a) increasing the number of binary outcomes leads to higher measurement precision, b) positively and negatively keyed items should be combined within blocks, and c) fewer items are required when the blocks are larger.
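Points a) and c) are linked by simple counting: a fully ranked block of n items is coded as n(n-1)/2 binary outcomes, so the same items give more outcomes when they are grouped into larger blocks. A minimal sketch follows. The function name and the figure of 60 items are hypothetical, and full ranking within each block is assumed.

```python
from math import comb

def binary_outcomes(n_items, block_size):
    """Number of pairwise binary outcomes when n_items are split into
    equally sized, fully ranked blocks."""
    if n_items % block_size != 0:
        raise ValueError("n_items must be divisible by block_size")
    n_blocks = n_items // block_size
    # Each block of size n yields n * (n - 1) / 2 pairwise comparisons.
    return n_blocks * comb(block_size, 2)

for size in (2, 3, 4):
    print(size, binary_outcomes(60, size))
# 2 30
# 3 60
# 4 90
```

The outcomes within a block are not independent, so the gain in information is smaller than the raw count suggests, but the direction of the effect is the same.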

An empirical application compared models for the single-stimulus format and the forced-choice format. The normal ogive graded response model, used for the single-stimulus data, showed poor model fit, but it yielded higher reliabilities and more information than the FC model.

Finally, the study listed some important factors that affect the FC model: a) the keyed direction of the items, b) the number of traits, c) the correlations between traits, and d) the block size. In short, items keyed in both positive and negative directions should be included, measuring more traits gives more precise results and relaxes the requirements on other factors (e.g., the keyed direction of the items), and larger blocks provide more information.