This study examined how vertical scaling methods perform with small sample sizes and short test lengths. There are two main kinds of vertical scaling methods: concurrent calibration and separate calibration. Concurrent calibration has generally produced smaller errors and more stable estimates. However, few studies have examined these calibration methods with small samples and short tests. Few studies have investigated the effect of an incorrect model either; only one study has addressed this topic (Lord, 1983). The author therefore investigated the effects of these factors and examined their interaction. I think the model in this study is: IRT_theta = B0 + B1*Growth. The author evaluated the recovery of the growth model by the RMSE and bias of B0, B1, Var(B0), Var(B1), and Cov(B0, B1).
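To make my reading of the model and the evaluation criteria explicit, here is how I would write them out. This is only my interpretation: the person index i, the occasion index t, the residual term, and the treatment of B0 and B1 as random coefficients are my assumptions, inferred from the fact that Var(B0), Var(B1), and Cov(B0, B1) are reported. R denotes the number of replications, and gamma stands for any one of the five quantities listed above.

```latex
% Growth model as I read it (indices and residual are my notation, not the author's)
\theta_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Growth}_{t} + e_{it}

% Recovery criteria for a parameter \gamma over R replications
\mathrm{Bias}(\hat{\gamma}) = \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\gamma}_{r} - \gamma\bigr),
\qquad
\mathrm{RMSE}(\hat{\gamma}) = \sqrt{\frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\gamma}_{r} - \gamma\bigr)^{2}}
```

Written this way, each criterion collapses the R replications of a condition into a single number.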
The results of this study are consistent with previous studies. As expected, concurrent calibration was not affected by test length, and its advantage was greater when the test was short and the sample was small. Using the incorrect, simpler model (true model 2PL, estimated with 1PL) had little effect, probably because the discrimination parameters in this study design vary very little. Finally, the author suggested that the optimal sample size is at least 250 per form.
1) It seems odd to me: RMSE and bias can each be calculated only across the 100 replications, which yields a single value per condition. Each cell in the ANOVA should therefore contain only one value. How was Table 2 computed?
2) Although many results were similar to those of previous studies, they deserve more theoretical explanation rather than being left unexplained. For example, why did concurrent calibration generally underestimate the growth rate while separate calibration overestimated it?
3) I don't clearly understand the separate calibration methods, such as MM, MS, H, and S-L, so I cannot anticipate their effects.
4) The author used the 1PL as the incorrect model and the 2PL as the true model. Although the difference between their effects is small in her results, she could still manipulate the variance of the discrimination parameter as a factor.
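
To illustrate point 4, here is a minimal sketch of why the spread of the discrimination parameter matters. It is not the author's simulation. The function names, the ability grid, and the discrimination values are hypothetical, and I assume for simplicity that the 1PL fixes the discrimination at 1. The sketch compares a 2PL item response curve with the 1PL curve for the same difficulty and reports the largest gap between them.

```python
import numpy as np


def p_2pl(theta, a, b):
    """2PL probability of a correct response; with a fixed at 1 this is the 1PL form."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))


def max_gap(a, b=0.0, theta=None):
    """Largest absolute difference between a 2PL curve and the 1PL curve (a = 1)."""
    if theta is None:
        # Hypothetical ability grid from -3 to 3
        theta = np.linspace(-3.0, 3.0, 61)
    return float(np.max(np.abs(p_2pl(theta, a, b) - p_2pl(theta, 1.0, b))))


# Hypothetical discrimination values, not taken from the study
for a in (0.9, 1.0, 1.1, 2.0):
    print(f"a = {a:.1f}: max gap = {max_gap(a):.3f}")
```

When the discriminations stay close to the common value, the two curves nearly coincide, so fitting the 1PL costs little. The gap grows as the discriminations spread out, which is why I suggest treating the variance of the discrimination parameter as a design factor.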