Modeling Achievement Trajectories When Attrition Is Informative
JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS published online 12
Betsy J. Feldman and Sophia Rabe-Hesketh
Background: in longitudinal studies, missed assessments and attrition are inevitable, and inappropriate handling of missing data can lead to biased results. The paper introduces growth models, three missing-data mechanisms, and approaches for dealing with missing data.
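For reference, the three mechanisms are usually stated as follows. This is the standard textbook formulation, not the paper's own notation: Y_obs and Y_mis denote the observed and missing parts of the outcome and R the missingness indicator, all of which are symbols assumed here.

```latex
% Standard definitions; symbols are assumptions, not taken from the paper.
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R \mid Y_{\mathrm{obs}}) \\
\text{NMAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) \text{ depends on } Y_{\mathrm{mis}}
                    \text{ (or on latent variables) even given } Y_{\mathrm{obs}}
\end{align*}
```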
Two studies investigate the potential consequences of incorrectly handling missing data.
The first, an empirical study, compared the effects of different treatments of missing data when modeling trajectories of reading
achievement in the National Education Longitudinal Study of 1988 (NELS:88) data set. The second was a simulation study in which data were generated according to a process by which dropout depends on the random coefficients. Mplus 5.21 was used for analysis.
The simulation conditions crossed three factors:
two sample sizes (300 and 1,000),
two percentages of missing data (10% and 40% dropout by the last time point),
and two levels of dependence (weak and strong) of the drop-out process on the intercept and slope.
In the drop-out model, the coefficients for the intercept and slope were set to -0.1 and -0.2 for weak dependence and -0.5 and -1.4 for strong dependence.
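The design above can be sketched in code. The following is a minimal illustration, not the authors' Mplus setup. The sample sizes, dropout percentages, and dependence coefficients come from the summary. Everything else is an assumption made for illustration: the growth-model parameter values, the number of waves, the logistic link for the dropout hazard, and the names `design_cells`, `simulate_cell`, and `base_logit`. In particular, `base_logit` would have to be calibrated to reach the target 10% or 40% dropout by the last time point, which this sketch does not do.

```python
import itertools
import numpy as np

# Design factors taken from the summary.
SAMPLE_SIZES = (300, 1000)
DROPOUT_RATES = (0.10, 0.40)
DEPENDENCE = {"weak": (-0.1, -0.2), "strong": (-0.5, -1.4)}


def design_cells():
    """Enumerate the 2 x 2 x 2 fully crossed simulation conditions."""
    return [
        {"n": n, "dropout": p, "dependence": d, "coefs": DEPENDENCE[d]}
        for n, p, d in itertools.product(SAMPLE_SIZES, DROPOUT_RATES, DEPENDENCE)
    ]


def simulate_cell(n, coefs, base_logit=-1.0, n_waves=4, seed=0):
    """Generate linear-growth data with dropout driven by random coefficients.

    ASSUMPTIONS (not from the paper): standardized, uncorrelated random
    intercept and slope deviations; the growth parameter values below;
    a constant logistic dropout hazard at each wave after the first.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((n, 2))           # random intercept, slope deviations
    times = np.arange(n_waves)
    # Hypothetical growth model: mean intercept 50, mean slope 3.
    y = (50 + 10 * u[:, [0]]) + (3 + u[:, [1]]) * times
    y = y + rng.normal(0, 5, size=(n, n_waves))

    # Dropout hazard depends on the latent coefficients, i.e. NMAR.
    logit = base_logit + coefs[0] * u[:, 0] + coefs[1] * u[:, 1]
    hazard = 1.0 / (1.0 + np.exp(-logit))
    drop_now = rng.random((n, n_waves - 1)) < hazard[:, None]
    dropped = np.cumsum(drop_now, axis=1) > 0  # dropout is permanent (monotone)

    observed = np.column_stack([np.ones(n, dtype=bool), ~dropped])
    y_obs = np.where(observed, y, np.nan)
    return y_obs, u


cells = design_cells()
y_obs, u = simulate_cell(n=1000, coefs=DEPENDENCE["strong"])
```

Because both coefficients are negative, students with lower intercepts and slopes are more likely to drop out under this sketch, so the completers are not representative of the full sample. That is the kind of selection an analysis assuming MAR cannot correct for.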
Simulation results showed that incorrectly assuming MAR leads to greater bias for the growth-factor variance–covariance matrix than for the growth factor means, the former being severe with as little as 10% missing data and the latter with 40% missing data when departure from MAR is strong.
Comments:
Further conditions could be examined by varying the percentage of missing data and the other design factors.
Before modeling empirical data, one should first identify the missing-data mechanism, so as to avoid the bias that results from handling the data with an inappropriate model. Two problems arise. The first is how one can be sure that the assumed missing-data process is correct.
The second problem is that strong assumptions underlie all NMAR models. The authors note that, in addition to the assumption that the latent variables are normally distributed and that, conditional on the latent variables, the outcome and drop-out processes (and their indicators) are independent, consistent estimation requires that the model for dropout is correct. In practice, it is hard to include all relevant covariates.