The authors argue that previous studies have never estimated model parameters and person ability while accounting for local item dependence and local person dependence simultaneously. They therefore propose a four-level model and compare its performance with that of the Rasch model, the Rasch testlet model, and the three-level Rasch model in terms of parameter estimates, bias, standard error, and root mean squared error (RMSE). The study manipulated 16 conditions and analyzed one empirical data set to support the superiority of the proposed model. However, I have some questions about this paper:
1) The Rasch model focuses on item-level information and assumes sample-free estimation. Why do we still need to control for the influence of sampling?
- Sandy has provided some insight into this concern. For example, students with more language training performed better on mathematics questions that require language proficiency. We need to remove the influence of language level that comes from the “clusters” in order to obtain a “pure” estimate of students' math ability.
2) International and national data sets usually provide person, cluster, and strata weighting variables to compensate for the influence of sampling. Does this complicated model estimate item parameters and person ability better than a simple model that uses these weighting variables?
3) Matrix sampling is commonly used in large-scale assessments. Would it be possible to incorporate testlets into the matrix design and then estimate item parameters and person ability? This might be a good alternative to the approach taken in this study.
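
To make the first concern more concrete, the three comparison models can be sketched in their commonly used textbook forms. The notation below is my own assumption and may differ from the authors' (theta for person ability, b for item difficulty, gamma for a person-specific testlet effect, mu for a cluster mean); the proposed four-level model is deliberately not written out, since I cannot reconstruct its exact specification from memory.

```latex
\begin{align*}
% Rasch model: person n, item i
\operatorname{logit} P(X_{ni}=1) &= \theta_n - b_i \\
% Rasch testlet model: item i belongs to testlet d(i);
% gamma captures local item dependence within that testlet
\operatorname{logit} P(X_{ni}=1) &= \theta_n - b_i + \gamma_{n\,d(i)} \\
% Three-level Rasch model: person n nested in cluster g;
% mu_g captures local person dependence within the cluster
\operatorname{logit} P(X_{ngi}=1) &= \theta_{ng} - b_i,
\qquad \theta_{ng} = \mu_g + \varepsilon_{ng}
\end{align*}
```

Written this way, the question in 1) is whether the cluster term in the third line is still needed for item calibration when the first line is already taken to be sample-free, or whether it matters mainly for the person estimates.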