Nicky's readings and review

11/11/2011 Higher-Order IRT with small sample size

LI XIAOMIN -

This study proposed a higher-order item response theory (HO-IRT) model to achieve more precise parameter estimation with small samples. The HO-IRT model is a multi-unidimensional model that uses in-test collateral information, i.e. information borrowed from the other domains (subtests) of the same test, to improve parameter estimation.

Simulation Design:

1) Manipulated factors

Sample size: 500/1000 persons

Number of domains: 2/4

Number of items in each domain: 10/20

Correlation between domains ρ: .5/.7/.9

In total there are 2 × 2 × 2 × 3 = 24 conditions. 25 data sets were generated for each condition.
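The fully crossed design above can be enumerated in a few lines. This is only a minimal sketch: the variable names are illustrative and are not taken from the study.

```python
from itertools import product

# Levels of the four manipulated factors, as listed in the design above.
# All variable names here are illustrative assumptions.
sample_sizes = [500, 1000]
domain_counts = [2, 4]
items_per_domain = [10, 20]
correlations = [0.5, 0.7, 0.9]
n_replications = 25  # data sets generated per condition

# Fully crossed design: one tuple per condition.
conditions = list(product(sample_sizes, domain_counts, items_per_domain, correlations))
total_datasets = len(conditions) * n_replications

print(len(conditions), "conditions,", total_datasets, "data sets in total")
```

Each tuple in `conditions` can then be passed to whatever data-generation routine is used, once per replication.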

2) The three-parameter logistic (3PL) model was employed
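For reference, the standard 3PL response function can be sketched as follows. The function name and the scaling constant `D` are assumptions for illustration; this summary does not say whether the study used D = 1 or D = 1.7.

```python
import math

def prob_3pl(theta, a, b, c, D=1.0):
    """Probability of a correct response under the 3PL model.

    theta: person ability
    a: item discrimination, b: item difficulty, c: pseudo-guessing parameter
    D: scaling constant (assumed 1.0 here; 1.7 is also common)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

# When ability equals difficulty, the probability is halfway between c and 1.
print(prob_3pl(theta=0.0, a=1.0, b=0.0, c=0.2))
```

As theta decreases, the probability approaches the lower asymptote c rather than 0, which is what distinguishes the 3PL from the 2PL model.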

3) Item parameter

10 3PL items were selected from a pool of 550 math items that were calibrated using a nationally representative sample. The same tests were used across the different domains.

4) Person parameter

Overall ability θ ~ N(0, 1). The regression coefficient λ was derived from the correlation between domain abilities as λ = √ρ, so each domain ability was sampled from N(λθ, 1 − λ²).
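The person-parameter generation can be sketched as below. It assumes that the regression coefficient is λ = √ρ and that the residual variance is 1 − λ², which gives every domain ability unit variance and makes the correlation between any two domain abilities equal to ρ. The function name and seed handling are illustrative, not taken from the study.

```python
import math
import random

def sample_abilities(n_persons, n_domains, rho, seed=None):
    """Draw overall and domain abilities under a higher-order structure.

    Assumes lambda = sqrt(rho), so that two domain abilities correlate at rho.
    Returns (overall, domains), where domains[i] lists person i's domain abilities.
    """
    rng = random.Random(seed)
    lam = math.sqrt(rho)                  # regression coefficient (assumed)
    resid_sd = math.sqrt(1.0 - lam ** 2)  # residual SD; keeps unit marginal variance
    overall, domains = [], []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)       # overall ability ~ N(0, 1)
        overall.append(theta)
        # Each domain ability ~ N(lambda * theta, 1 - lambda^2)
        domains.append([rng.gauss(lam * theta, resid_sd) for _ in range(n_domains)])
    return overall, domains

overall, domains = sample_abilities(n_persons=500, n_domains=2, rho=0.7, seed=1)
print(len(overall), len(domains[0]))
```

Note that `random.gauss` takes a standard deviation, not a variance, hence the square root of 1 − λ².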

5) Markov chain Monte Carlo (MCMC) was used for estimation.

6) Criteria for evaluating the performance of HO-IRT

The conventional unidimensional IRT (CU-IRT) was also employed to analyze the same simulated data, in order to test whether HO-IRT could improve item parameter estimation.

Item level: root mean square error (RMSE), the root mean square difference (RMSD) between the estimated and true item characteristic curves (ICCs), and the difference between the true and estimated test characteristic curves (TCCs) were used to compare the performance of HO-IRT and CU-IRT.

Person level: the correlation between true and estimated abilities and the RMSE were used for comparison.
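The two person-level criteria can be computed as in the following sketch. The helper names are illustrative, and how the study aggregated these statistics across replications is not stated in this summary.

```python
import math

def rmse(true_values, estimates):
    """Root mean square error between true and estimated values."""
    n = len(true_values)
    return math.sqrt(sum((e - t) ** 2 for t, e in zip(true_values, estimates)) / n)

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Toy example with made-up numbers, only to show the calls.
true_theta = [-1.0, 0.0, 1.0, 2.0]
est_theta = [-0.8, 0.1, 0.7, 1.9]
print(rmse(true_theta, est_theta), pearson_r(true_theta, est_theta))
```

A lower RMSE and a higher correlation both indicate better recovery of the true abilities, so the two methods can be compared condition by condition on these values.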

Brief summary of results:

For item parameters: HO-IRT performed as well as or better than CU-IRT across conditions. The improvements from HO-IRT were more evident with shorter tests, fewer examinees, and more domains. The correlation between domains did not show an obvious effect.

For person parameters: sample size did not affect ability estimation under either method. The improvement from HO-IRT was larger with shorter tests and more domains. The correlation between domains had a clear and systematic impact on ability estimation.

HO-IRT is advisable when the sample size is relatively small and the test is relatively short. It is also useful in practical contexts where issues such as item exposure are a concern. However, the interpretation of HO-IRT estimates is complex, because each estimate borrows information from the other subtests, so more attention must be paid to whether its use is appropriate. In addition, the authors recommended that applications requiring an unequivocal interpretation of test scores obtain ability estimates that do not rely on collateral information.

Future studies:

1) Apply more complex relationships to the ability structure.

2) Vary the factors that were held constant in this study.

3) Consider within-item multidimensionality.

4) Extend the model to polytomous items.