The paper examines how several missing-data mechanisms (MCAR, MAR, and MNAR) affect the detection of nonuniform DIF. Four methods for handling missing data (LD, ZI, MI, and SRI) were compared in terms of Type I error and power in DIF analysis. The simulation results show that LD performs well under each DIF-detection method.
1. Under the MNAR condition, more missing data occurred among persons of low ability, which would lead to more biased estimates of the item parameters. Consequently, LD and Incorrect perform well, even better than the analysis using complete data.
2. In Table 2, LR and the crossing SIBTEST perform worse than IRTLR as sample size increases. It is unclear why this happens.
3. LD may be problematic when all respondents have missing data, yet it performs better than the other methods. The next best choice may be MI.
4. Table 2 (Type I error) and Table 3 (power) cannot be compared under identical conditions.
5. Figures 1 and 2 are tiring to read.
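
Regarding comment 1, the following minimal sketch shows how LD changes the ability distribution of the retained sample when missingness is concentrated among low-ability persons. It is not the authors' simulation design. The function name, the logistic missingness model, the 0.2 ceiling on the per-item missingness rate, and the sample sizes are all assumptions made for illustration, and LD is read here as listwise deletion.

```python
import math
import random


def simulate_mnar_listwise(n_persons=5000, n_items=10, seed=0):
    """Hypothetical illustration of listwise deletion under ability-related missingness.

    Returns (mean ability of full sample, mean ability after deletion,
    number of persons retained).
    """
    rng = random.Random(seed)
    # Latent abilities drawn from a standard normal distribution (assumed).
    abilities = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]

    retained = []
    for theta in abilities:
        # Assumed missingness model: the per-item probability of a missing
        # response decreases with ability and never exceeds 0.2.
        p_missing = 0.2 / (1.0 + math.exp(theta))
        has_missing = any(rng.random() < p_missing for _ in range(n_items))
        # Listwise deletion keeps only persons with no missing responses.
        if not has_missing:
            retained.append(theta)

    full_mean = sum(abilities) / n_persons
    retained_mean = sum(retained) / len(retained)
    return full_mean, retained_mean, len(retained)


if __name__ == "__main__":
    full_mean, retained_mean, n_retained = simulate_mnar_listwise()
    print(f"full-sample mean ability: {full_mean:.3f}")
    print(f"mean ability after LD:    {retained_mean:.3f}")
    print(f"persons retained:         {n_retained}")
```

Under these assumptions the retained sample is smaller and more able on average than the full sample, so an analysis based on LD is not directly comparable to one based on the complete data.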