Multidimensional computer-adaptive testing (MCAT) can provide higher precision and reliability, or reduce test length, when compared with unidimensional CAT. To date, many item selection methods have been proposed in the field of MCAT, and all of them work well. A new item selection method is proposed in this study, and its performance is compared with five previously proposed methods (minimum angle, volume, minimizing the error variance of the linear combination, Kullback-Leibler information, and the general dimension method) in MCAT. The performance of the item selection methods was evaluated by manipulating the structure of the item pools, the population distribution of the examinees, the test length, and the content area. The evaluation indices were absolute bias, correlation, test reliability, time used, and item usage. The results showed that the volume and minimum angle methods performed similarly, with high precision for both domain and overall scores, when items were selected with the required number of items for each domain. The new item selection method had the highest percentage of item usage. Moreover, for the overall score, it produced similar or even better results than the method that selects items favoring the general dimension using the general model; the general method had low precision for the domain scores.
Comments & Questions
1. There is a typo in the last sentence of the upper paragraph on page 514. It should read "… for test lengths of 18 and 36" instead of "18 and 35".
2. In Figure 3, the V2 procedure with content=0 has the lowest correlation and absolute bias, and the highest test reliability. How can this be explained? A lower correlation or absolute bias would indicate that the ability estimates are less accurate, so I would expect its test reliability to be the lowest, not the highest.
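
To make the expectation in comment 2 concrete, here is a minimal sketch of how the three indices might be computed in a simulation. It rests on assumptions that are not taken from the paper: absolute bias is taken as the mean absolute difference between estimated and true abilities, and test reliability is taken as the squared correlation between true and estimated abilities. The function name `evaluation_indices` and the simulated data are hypothetical. Under these definitions reliability can only move in the same direction as the absolute correlation, so a pattern of lowest correlation with highest reliability would suggest that the paper defines reliability differently.

```python
import numpy as np


def evaluation_indices(theta_true, theta_hat):
    """Return (absolute bias, correlation, reliability) for one ability dimension.

    Assumed definitions (not confirmed by the paper):
      absolute bias = mean |theta_hat - theta_true|
      correlation   = Pearson r between theta_true and theta_hat
      reliability   = r ** 2
    """
    theta_true = np.asarray(theta_true, dtype=float)
    theta_hat = np.asarray(theta_hat, dtype=float)
    abs_bias = float(np.mean(np.abs(theta_hat - theta_true)))
    corr = float(np.corrcoef(theta_true, theta_hat)[0, 1])
    reliability = corr ** 2  # assumption: squared true-estimate correlation
    return abs_bias, corr, reliability


# Hypothetical simulated examinees: same true abilities, two levels of estimation error.
rng = np.random.default_rng(0)
theta = rng.standard_normal(2000)
precise_hat = theta + 0.3 * rng.standard_normal(2000)  # small estimation error
noisy_hat = theta + 0.8 * rng.standard_normal(2000)    # large estimation error

precise = evaluation_indices(theta, precise_hat)
noisy = evaluation_indices(theta, noisy_hat)
print("precise:", precise)
print("noisy:  ", noisy)
```

In this sketch the noisier estimates give a larger absolute bias, a lower correlation, and a lower reliability together, which is the pattern I would have expected for V2 in Figure 3.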