Xiaoxue's reading and review

Use of multilevel logistic regression to identify the causes of DIF

Posted by KUANG XIAOXUE

Use of multilevel logistic regression to identify the causes of differential item functioning

Nekane Balluerka, Arantxa Gorostiaga, Juana Gómez-Benito and María Dolores Hidalgo

Many methods have been developed for detecting differential item functioning (DIF); however, little progress has been made in explaining why DIF occurs.

Three approaches are currently used to address this problem. The first explains DIF through the principle of multidimensionality (Ackerman, 1992). From this perspective, an item presents DIF because some of its characteristics are not relevant to the trait or latent ability of interest. The second applies the structural equation models known as multiple indicators, multiple causes (MIMIC), proposed by Muthén (1989) as a way not only of detecting DIF but also of explaining it.

The third approach is to use multilevel models.

The paper introduces two-level models that allow item characteristics to be incorporated progressively, so as to explain the variation in item responses that is due to DIF.

The two-level model differs from the common logistic regression model in its level-2 component.

In the level-2 models (item level) the regression coefficients from the level-1 models, which include the coefficient that represents each item’s DIF, are treated as random variables whose variation could be predicted by certain characteristics of the items.

One of the most important features distinguishing this approach from traditional procedures for detecting DIF is that it formulates DIF as a random parameter, which, in addition to optimising its estimation, makes it possible to obtain information about its causes.

The formulation of the level-1 model:

logit[P(Yij = 1)] = b0j + b1j*Hi + b2j*Gi

Where

Yij is the score obtained by subject i on item j (1 = correct response, 0 = incorrect response);

Hi indicates the ability level of subject i on the attribute or variable measured by the test;

Gi is a dummy variable that indicates whether a person belongs to the group of interest, or focal group (Gi = 1), or to the group with which it is compared, i.e. the reference group (Gi = 0);

b0j reflects the item difficulty in the reference group, expressed on the log-odds scale;

b1j stands for the item discrimination, i.e. the ability of the item to discriminate between subjects with high and low scores on the attribute measured by the test;

b2j denotes the deviation in the item difficulty in the focal group with respect to the reference group, in other words, the parameter of uniform DIF.
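To make the role of b2j concrete, here is a minimal Python sketch of the level-1 model for a single item. The function name p_correct and all coefficient values are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import math

def p_correct(b0j, b1j, b2j, h_i, g_i):
    """P(Yij = 1) under the level-1 model: logit = b0j + b1j*Hi + b2j*Gi."""
    logit = b0j + b1j * h_i + b2j * g_i
    return 1.0 / (1.0 + math.exp(-logit))

def log_odds(p):
    """Convert a probability back to the log-odds (logit) scale."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients for one item (illustrative values only).
b0j, b1j, b2j = 0.0, 1.0, -0.5

# Two subjects with the same ability (Hi = 0), one from each group.
p_ref = p_correct(b0j, b1j, b2j, h_i=0.0, g_i=0)    # reference group
p_focal = p_correct(b0j, b1j, b2j, h_i=0.0, g_i=1)  # focal group

# At equal ability the log-odds gap between the groups equals b2j,
# which is why b2j is read as the uniform DIF parameter.
dif = log_odds(p_focal) - log_odds(p_ref)
print(p_ref, p_focal, dif)
```

With a negative b2j, a focal-group subject is less likely to answer correctly than a reference-group subject of the same ability, and the gap on the log-odds scale is the same at every ability level.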

The formulation of the level-2 model:

b0j = g00 + u0j

b1j = g10 + u1j

b2j = g20 + g21*I1 + g22*I2 + … + g2n*In + u2j

The gQ0 terms (Q = 0, 1, 2) are the means of the level-1 regression coefficients

g00 is the mean of the item difficulty values in the reference group

g10 is the mean of the item discrimination values

g20 is the mean of the deviation in item difficulty between the focal and reference groups, i.e. the overall DIF parameter

The uQj terms (Q = 0, 1, 2) are random variables that represent unexplained variability

u0j is the variability shown by items in terms of level of difficulty

u1j is the variability among items as regards the level of discrimination

u2j denotes the variability among items in the DIF index or the unexplained variation in DIF for item j after taking into consideration its characteristics

I1, …, In are dummy or interval variables that reflect item characteristics. g2n is the coefficient associated with the nth item characteristic, which predicts the variation in DIF.
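Substituting the level-2 equations into the level-1 equation gives a single combined equation, which may make it easier to see where the item characteristics enter. This is a sketch derived from the two formulations above, not an equation quoted from the paper; writing the item characteristics as I_{kj} (indexed by item j) is my assumption about the notation.

```latex
\operatorname{logit}\bigl[P(Y_{ij}=1)\bigr]
  = \underbrace{g_{00} + g_{10} H_i + g_{20} G_i
      + \sum_{k=1}^{n} g_{2k}\, I_{kj}\, G_i}_{\text{fixed part}}
  + \underbrace{u_{0j} + u_{1j} H_i + u_{2j} G_i}_{\text{random part}}
```

Each item characteristic appears only as a product with the group indicator G_i, so g_{2k} describes how much the k-th characteristic shifts the DIF of an item, and u_{2j} is the DIF that remains unexplained.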

A simulation study was conducted to show how well multilevel logistic regression performs when analysing the causes of DIF.

The HLM 6 software was used in this study.
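For readers who want to experiment, the following Python sketch generates item responses from the two-level model described above. It is not the simulation design used in the paper: the function name simulate, the single binary item characteristic I1, the parameter values, the standard deviations and the assumption that ability is standard normal and independent of group are all hypothetical. The sketch only generates data; fitting the multilevel model would still require suitable software.

```python
import math
import random

def simulate(n_items=20, n_subjects=500, g00=0.0, g10=1.0, g20=0.0,
             g21=-0.8, sd_u=(0.5, 0.2, 0.1), seed=0):
    """Generate responses from the two-level model (all values hypothetical)."""
    rng = random.Random(seed)

    # Level 2: draw each item's coefficients around the gQ0 means.
    items = []
    for j in range(n_items):
        i1 = j % 2  # hypothetical binary item characteristic I1
        items.append({
            "I1": i1,
            "b0": g00 + rng.gauss(0.0, sd_u[0]),             # b0j = g00 + u0j
            "b1": g10 + rng.gauss(0.0, sd_u[1]),             # b1j = g10 + u1j
            "b2": g20 + g21 * i1 + rng.gauss(0.0, sd_u[2]),  # b2j = g20 + g21*I1 + u2j
        })

    # Subjects: ability Hi ~ N(0, 1) (assumed), group Gi alternates 0/1.
    subjects = [(rng.gauss(0.0, 1.0), i % 2) for i in range(n_subjects)]

    # Level 1: draw each binary response Yij from its logistic probability.
    responses = []
    for h, g in subjects:
        row = []
        for item in items:
            logit = item["b0"] + item["b1"] * h + item["b2"] * g
            p = 1.0 / (1.0 + math.exp(-logit))
            row.append(1 if rng.random() < p else 0)
        responses.append(row)
    return items, subjects, responses

items, subjects, responses = simulate()

# Average generated DIF for items with and without the characteristic.
mean_b2_with = sum(it["b2"] for it in items if it["I1"] == 1) / 10
mean_b2_without = sum(it["b2"] for it in items if it["I1"] == 0) / 10
print(mean_b2_with - mean_b2_without)  # should be close to g21
```

Because g21 is negative here, items that have the characteristic are generated with more DIF against the focal group, which is the kind of pattern a level-2 predictor is meant to pick up.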

Comments:

The approach is a promising one. The authors present tables with explanations, which help us to understand the method and to use it in our own research.

As the authors say, the method can be extended to more levels and more complex models, which appeals to me. My concern is how complicated the model becomes once it is extended. The instrument would also need to be improved, which will limit its use to some extent. Nevertheless, it is a really promising method.