This study compared several methods for handling missing data when detecting non-uniform DIF under different types of missing data. The results showed complex relationships among the manipulated factors, and no single method was superior across all conditions.
Questions:
1. In Table 1, as the percentage of missing data increases, why do some missing data methods show larger Type I error rates in combination with certain types of missing data? For example, the “MI” method showed an increasing Type I error rate with “MAR2”. Moreover, why did the Type I error rate for “MI” then decrease with “MNAR”?
2. In Table 3, why does the power decrease as the percentage of missing data increases for the “Complete” data?
But does the power tend to decrease as the percentage of missing data increases with all the other computational methods?
3. The four selected methods for dealing with missing data, namely LD, ZI, MI, and SRI, become progressively more complex and computationally intensive. I expected the more complex methods to perform better at detecting DIF. However, the results did not provide supporting evidence for the more complex methods, so I do not understand what the advantages of MI and SRI are.