The DIF-free-then-DIF (DFTD) strategy was introduced as a purification procedure for differential item functioning (DIF) studies.
The strategy has two steps: first, select the set of items most likely to be DIF-free; then assess the remaining items for DIF, using the selected items as anchors. Two simulation studies were used to evaluate the performance of the rank-based (RB) method and the rank-based scale purification (RB-S) method for dichotomous items.
The computer software IRTLRDIF was used in this study.
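The two-step logic of the strategy can be sketched in a few lines. This is only an illustration under stated assumptions: the summary does not say how the RB method ranks items, so ranking by the absolute value of a preliminary DIF statistic is assumed here, and the names `select_anchors`, `dftd`, `toy_assess`, and the values in `prelim` are hypothetical.

```python
def select_anchors(dif_stats, n_anchors):
    """Step 1: rank items by |preliminary DIF statistic| and keep the smallest.

    Assumption: a smaller absolute statistic means "more likely DIF-free".
    """
    ranked = sorted(dif_stats, key=lambda item: abs(dif_stats[item]))
    return ranked[:n_anchors]


def dftd(dif_stats, n_anchors, assess):
    """Step 2: assess every non-anchor item for DIF, using the anchors."""
    anchors = select_anchors(dif_stats, n_anchors)
    studied = [item for item in dif_stats if item not in anchors]
    return anchors, {item: assess(item, anchors) for item in studied}


# Hypothetical preliminary DIF statistics for a six-item test.
prelim = {"item1": 0.05, "item2": -0.62, "item3": 0.11,
          "item4": 0.48, "item5": -0.02, "item6": 0.09}


def toy_assess(item, anchors):
    # Placeholder decision rule. A real analysis would test the item
    # against the anchor set (for example with a likelihood-ratio test).
    return abs(prelim[item]) > 0.3


anchors, flags = dftd(prelim, n_anchors=4, assess=toy_assess)
```

The point of the design is that the anchor set is fixed before any item is tested, so a contaminated item can distort the results only if it slips into the anchors in step 1.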
Simulation study 1 Situations:
Method: RB and RB-S
Reference group: θ ~ N(0, 1)
Focal group: θ ~ N(0, 1) or θ ~ N(-1, 1)
Sample size: small: R250/F250
medium: R500/F500
large: R1000/F1000
Test length: 20/40
Percentage of DIF items: 10%, 20%, 30%, and 40%
DIF patterns:
Constant: all of the DIF items favoring the reference group
Balanced: half of the DIF items favoring the reference group and the other half favoring the focal group
One hundred replications were made under each condition.
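To get a sense of the size of this design, the conditions can be enumerated as below. The sketch assumes the factors are fully crossed, which the summary does not state explicitly; the variable names are illustrative, and the method factor (RB vs. RB-S) is left out because both methods would be applied to the same generated data sets.

```python
from itertools import product

focal_means = [0.0, -1.0]            # focal-group ability mean (reference is 0)
sample_sizes = {"small": (250, 250), "medium": (500, 500), "large": (1000, 1000)}
test_lengths = [20, 40]
dif_percentages = [10, 20, 30, 40]
dif_patterns = ["constant", "balanced"]
n_replications = 100

conditions = [
    {
        "focal_mean": mean,
        "sample_size": size,
        "n_ref_foc": sample_sizes[size],
        "test_length": length,
        "dif_percentage": pct,
        "n_dif_items": length * pct // 100,  # exact for every combination listed
        "dif_pattern": pattern,
    }
    for mean, size, length, pct, pattern in product(
        focal_means, sample_sizes, test_lengths, dif_percentages, dif_patterns
    )
]

n_datasets = len(conditions) * n_replications
print(len(conditions), n_datasets)
```

Under the fully crossed assumption this gives 2 × 3 × 2 × 4 × 2 = 96 conditions and 9,600 simulated data sets.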
The average signed area (ASA) was used to describe the average degree to which a test favors the reference group: the test as a whole favors the reference group when the ASA is positive, the focal group when it is negative, and neither group when it is 0.
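The summary does not give the formula for the ASA, so the following is a sketch under assumptions: a two-parameter logistic item with the same discrimination in both groups, for which the signed area between the reference and focal item characteristic curves equals the focal difficulty minus the reference difficulty, and the ASA taken as the mean of the signed areas over all items. The function names and item parameters are hypothetical.

```python
import math


def icc_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def signed_area(a, b_ref, b_foc, lo=-10.0, hi=10.0, n=4000):
    """Trapezoid-rule integral of P_ref(theta) - P_foc(theta) over [lo, hi].

    Positive when the item is harder for the focal group, i.e. when it
    favors the reference group.
    """
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        theta = lo + i * step
        diff = icc_2pl(theta, a, b_ref) - icc_2pl(theta, a, b_foc)
        total += (0.5 if i in (0, n) else 1.0) * diff
    return total * step


def average_signed_area(items):
    """Mean signed area over (a, b_ref, b_foc) triples, one per item."""
    return sum(signed_area(a, br, bf) for a, br, bf in items) / len(items)


# Hypothetical four-item tests with two DIF items each.
constant = [(1.0, 0.0, 0.5), (1.0, 0.0, 0.5), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
balanced = [(1.0, 0.0, 0.5), (1.0, 0.0, -0.5), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

asa_constant = average_signed_area(constant)
asa_balanced = average_signed_area(balanced)
```

In this toy case the constant pattern gives an ASA of about 0.25 (the test favors the reference group), while the balanced pattern gives an ASA of about 0 because the two DIF items cancel.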
The efficiency gain (EG) of the RB-S method over the RB method in the rate of accuracy was also computed. A positive EG indicates that the RB-S method was more efficient than the RB method.
The Type I error rates and power rates of DIF assessment were also compared for the two methods.
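As a reminder of how these two rates are usually obtained in a simulation, here is a minimal sketch. It assumes the common definitions (Type I error rate: the proportion of DIF-free items flagged as DIF; power: the proportion of true DIF items flagged), which the summary does not spell out. The function name and the flagged-item data are made up for illustration.

```python
def type1_and_power(flags_by_replication, dif_items, all_items):
    """Return (Type I error rate, power) pooled over replications.

    flags_by_replication: one list of flagged item ids per replication.
    dif_items: ids of the items simulated with DIF.
    all_items: ids of the studied items.
    """
    dif_items = set(dif_items)
    clean_items = [i for i in all_items if i not in dif_items]
    n_reps = len(flags_by_replication)
    false_pos = sum(len(set(f) - dif_items) for f in flags_by_replication)
    true_pos = sum(len(set(f) & dif_items) for f in flags_by_replication)
    type1 = false_pos / (n_reps * len(clean_items))
    power = true_pos / (n_reps * len(dif_items))
    return type1, power


# Hypothetical results: 10 studied items, items 9 and 10 have DIF,
# four replications.
all_items = list(range(1, 11))
dif_items = [9, 10]
flags_by_replication = [[9, 10], [10], [3, 9, 10], [9]]

type1, power = type1_and_power(flags_by_replication, dif_items, all_items)
```

Here one clean item is flagged once across 4 × 8 opportunities (Type I error rate 0.03125), and the DIF items are detected 6 times out of 8 (power 0.75).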
The results of the first simulation show that the RB-S method outperformed the RB method in locating clean anchors and in yielding a well-controlled Type I error rate and a high power rate.
The second simulation compared the RB-S method with three traditional methods (AOI, AOI-S, and PA).
Simulation study 2 Situations:
Method: RB-S, AOI, AOI-S, and PA
Impact: 0 and 1
Sample size: R500/F500
Test length: 20 items
Percentage of DIF items: 10%, 20%, 30%, and 40%
DIF patterns: constant and balanced.
The results show that the RB-S and PA methods yielded power rates lower than those of the AOI and AOI-S methods.
Comments:
The method is useful and attractive: it can provide more accurate information about the items to test developers and researchers interested in this area. Empirical studies would make the work even stronger, because we do not know what DIF will look like in real data.
I have been thinking about the logic of DIF for some time. Since the assumption underlying DIF is that the different groups should have the same ability, does this mean that in an empirical study the first step is to make sure the ability is the same before we conduct the DIF analysis? Or, if the ability is different, is there no need to go further? However, to estimate ability we need to choose the right model, and we need the item parameters to do so. If some of the items contain DIF, this will affect the model we choose, so it seems that we fall into a circular loop.