A standard mixture IRT model does not take the multilevel structure of the data into account. The present paper incorporates that structure into a mixture IRT model, extending it to a multilevel mixture IRT model (MMixIRTM).
The proposed MMixIRTM has latent classes at both the student level and the school level, so a subscript k is added to the mixture model. Within each combination of student-level and school-level latent classes, a Rasch model is assumed to hold, but the item difficulty parameters may differ across classes; such differences are treated as DIF.
Three situations are considered. In the first, latent classes exist at both the student level and the school level, so the item and ability parameters vary across classes at both levels. This is the most complicated model. In the second, latent classes exist only at the student level, so the item and ability parameters vary across student-level classes but not across school-level classes. In the third, latent classes exist only at the school level, so the item and ability parameters vary across school-level classes but not across student-level classes.
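As a notational sketch of the three situations above (the symbols are my own assumptions for illustration and may differ from the paper's notation): let g index the student-level latent class, k the school-level latent class, \theta the ability of student j, and \beta the difficulty of item i. The class-specific Rasch model and its two restricted versions could then be written as:

```latex
% General case: latent classes at both levels
P(y_{ij} = 1 \mid g, k, \theta_{jgk})
  = \frac{\exp(\theta_{jgk} - \beta_{igk})}{1 + \exp(\theta_{jgk} - \beta_{igk})}

% Student-level classes only: parameters do not depend on k
P(y_{ij} = 1 \mid g, \theta_{jg})
  = \frac{\exp(\theta_{jg} - \beta_{ig})}{1 + \exp(\theta_{jg} - \beta_{ig})}

% School-level classes only: parameters do not depend on g
P(y_{ij} = 1 \mid k, \theta_{jk})
  = \frac{\exp(\theta_{jk} - \beta_{ik})}{1 + \exp(\theta_{jk} - \beta_{ik})}
```

Under this notation, item i shows DIF whenever its difficulty \beta differs between two latent classes.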
Two issues I’m interested in are label switching and model selection. Label switching occurs when the latent classes change meaning over the estimation chain or across replications. It is hard to handle in mixture models. My strategy, like that of the present study, is to inspect the item difficulty and ability patterns for each school-level latent class and decide which is the dominant group. However, this becomes laborious when there are many replications. Model selection concerns how many latent classes to retain in a mixture model. There are two approaches: (1) fit models with increasing numbers of latent classes sequentially and compare them using likelihood ratio (LR) tests; (2) use statistical criteria such as AIC and BIC to decide which model fits best. The second approach seems simpler and easier to implement.
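The second approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log-likelihoods, parameter counts, and sample size below are hypothetical and do not come from the paper, and the function names are my own. It uses the standard definitions AIC = -2 logL + 2p and BIC = -2 logL + p ln(N), and takes N to be the number of students, which is itself a choice in multilevel data.

```python
import math


def information_criteria(log_lik, n_params, n_obs):
    """Return (AIC, BIC) from a log-likelihood, parameter count, and sample size."""
    aic = -2.0 * log_lik + 2.0 * n_params
    bic = -2.0 * log_lik + n_params * math.log(n_obs)
    return aic, bic


def select_by(criterion, fits, n_obs):
    """Return the label of the candidate model with the smallest criterion value."""
    index = {"aic": 0, "bic": 1}[criterion]
    return min(
        fits,
        key=lambda label: information_criteria(*fits[label], n_obs)[index],
    )


# Hypothetical (log-likelihood, number of parameters) pairs, for illustration only.
candidate_fits = {
    "1 class": (-5210.0, 21),
    "2 classes": (-5105.0, 43),
    "3 classes": (-5098.0, 65),
}
n_students = 1000  # assumed sample size

best_aic = select_by("aic", candidate_fits, n_students)
best_bic = select_by("bic", candidate_fits, n_students)

for label, (log_lik, n_params) in candidate_fits.items():
    aic, bic = information_criteria(log_lik, n_params, n_students)
    print(f"{label}: AIC={aic:.1f}, BIC={bic:.1f}")
print("Selected by AIC:", best_aic)
print("Selected by BIC:", best_bic)
```

Smaller values indicate a better trade-off between fit and complexity. Because BIC penalizes each extra parameter by ln(N) rather than 2, it tends to favor fewer classes than AIC when the sample is large.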
Comment:
1. Could an interaction between student-level and school-level variables cause DIF?
2. Future studies could compare how well the multilevel mixture IRT model and the ordinary mixture IRT model, which ignores the hierarchical data structure, detect DIF when such a structure exists.
3. The paper is long and is not well-written.