ARC Laboratory Sharing: xiaoxue’s review

53 Effects of multiple testing adjustment in DIF detection (Presented by Jacob)

xiaoxue’s review

Effect of Multiple Testing Adjustment in Differential Item Functioning Detection

Jihye Kim and T. C. Oshima

The main goal of this study is to investigate the effect of adjustment procedures for multiple testing in the context of DIF studies.

Four DIF detection methods: the Mantel–Haenszel (MH) method (Holland & Thayer, 1988), the logistic regression (LR) procedure (Swaminathan & Rogers, 1990), the Differential Functioning of Items and Tests (DFIT) framework (Raju, van der Linden, & Fleer, 1995), and Lord’s chi-square test.
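As a reminder of what the MH method tests, here is a minimal sketch of the continuity-corrected MH chi-square for one studied item, with examinees matched on total score. It is an illustration only, not the code used in the paper; the function name, the argument names, and the 0/1 coding of group membership (0 = reference, 1 = focal) are assumptions.

```python
import numpy as np

def mantel_haenszel_chi2(correct, group, total_score):
    """Continuity-corrected MH chi-square for a single item (hypothetical helper).

    correct     : 0/1 responses to the studied item
    group       : 0 = reference group, 1 = focal group (assumed coding)
    total_score : matching variable used to form the strata
    """
    correct = np.asarray(correct)
    group = np.asarray(group)
    total_score = np.asarray(total_score)

    a_sum = e_sum = v_sum = 0.0
    for k in np.unique(total_score):
        in_k = total_score == k
        n_ref = int(np.sum(in_k & (group == 0)))
        n_foc = int(np.sum(in_k & (group == 1)))
        t = n_ref + n_foc
        # Strata with only one group (or a single examinee) carry no information.
        if n_ref == 0 or n_foc == 0 or t < 2:
            continue
        m1 = int(np.sum(correct[in_k] == 1))   # correct responses in the stratum
        m0 = t - m1                            # incorrect responses in the stratum
        a = int(np.sum(correct[in_k & (group == 0)] == 1))  # reference-group correct
        a_sum += a
        e_sum += n_ref * m1 / t                              # E(A_k) under no DIF
        v_sum += n_ref * n_foc * m1 * m0 / (t * t * (t - 1)) # Var(A_k) under no DIF

    if v_sum == 0:
        return float("nan")
    return (abs(a_sum - e_sum) - 0.5) ** 2 / v_sum
```

The statistic is referred to a chi-square distribution with one degree of freedom, which gives one p-value per item; these per-item p-values are what the multiple testing adjustments operate on.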

Three adjustment procedures: the Bonferroni correction, Holm’s procedure, and the Benjamini–Hochberg (BH) false discovery rate procedure
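For reference, the three adjustments can be sketched as adjusted p-values computed over the per-item DIF p-values of one test. This is a minimal illustration in NumPy under the standard textbook definitions, not the authors' code; the function names are my own.

```python
import numpy as np

def bonferroni(p):
    """Bonferroni: multiply every p-value by the number of tests m."""
    p = np.asarray(p, dtype=float)
    return np.minimum(p * p.size, 1.0)

def holm(p):
    """Holm step-down: the i-th smallest p-value is multiplied by (m - i + 1)."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = (m - np.arange(m)) * p[order]
    # A running maximum keeps the adjusted values monotone in the sorted order.
    adj_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adj = np.empty(m)
    adj[order] = adj_sorted
    return adj

def benjamini_hochberg(p):
    """BH step-up: the i-th smallest p-value is multiplied by m / i."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # A running minimum taken from the largest p-value downward enforces monotonicity.
    adj_sorted = np.minimum(np.minimum.accumulate(scaled[::-1])[::-1], 1.0)
    adj = np.empty(m)
    adj[order] = adj_sorted
    return adj
```

An item is flagged as DIF when its adjusted p-value is at or below the nominal level (for example 0.05). Bonferroni and Holm control the familywise error rate, while BH controls the false discovery rate, so BH is the least conservative of the three.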

Sample size: 1000/1000; 500/500

DIF items: 3/20; 6/40

Type I error rates and power were computed.
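One common way to compute these two quantities in a DIF simulation is sketched below: the Type I error rate is the proportion of non-DIF items flagged, and power is the proportion of true DIF items flagged, both pooled over replications. The paper's exact definitions may differ; the function name and the array layout (replications by items) are assumptions.

```python
import numpy as np

def type1_error_and_power(flags, is_dif):
    """Pooled Type I error rate and power (hypothetical helper).

    flags  : boolean array, shape (replications, items); True = item flagged as DIF
    is_dif : boolean array, shape (items,); True = item was simulated with DIF
    """
    flags = np.asarray(flags, dtype=bool)
    is_dif = np.asarray(is_dif, dtype=bool)
    type1 = flags[:, ~is_dif].mean()   # false flags among non-DIF items
    power = flags[:, is_dif].mean()    # correct flags among DIF items
    return type1, power
```

With this layout, the 3/20 condition would use an `is_dif` vector with 3 True entries out of 20, and `flags` would come from comparing the raw or adjusted p-values with the nominal level.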

The results show that MH and LR benefited from Holm’s or the BH adjustment procedure at all test lengths and sample sizes considered in this study. The IRT-based procedures did not benefit from the adjustments, because no inflation of Type I error was observed for them under the conditions in this study.

Comments:

The simulation study uses an IRT model to generate the data, which may explain why no inflation of Type I error was observed for the IRT-based procedures. If the data were generated with the 3PL model and then analyzed for DIF with the 2PL or Rasch model, the adjustment procedures might show an effect.