The authors concluded that the best performance was obtained with an unstructured (UN) covariance matrix, for two reasons: (1) no convergence problems were encountered with the UN specification; (2) Type I error rates were generally accurate when the sample size was sufficient (200 or even 100).
The reasons are not convincing to me:
(1) If convergence problems are important, what about convergence under ARMA(1,1), AR(1), or VC? The authors did not report those results. I suspect these specifications may have had no convergence problems at all.
(2) Yes, this is true. But I would argue: (a) when the sample size is sufficient, ARMA(1,1) and AR(1) can perform as well as, or even better than, UN in estimating the fixed effects; (b) when the sample size is small, the Type I error rate was inflated under UN but still met the liberal definition of robustness under ARMA(1,1), AR(1), and even VC.
(3) When the sample size is sufficient, the estimates of the variance components under ARMA(1,1), which is the correct specification, are quite good. But I did not find the corresponding results for UN. So why did the authors believe that UN performs better than ARMA(1,1)?
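To make the comparison concrete, here is a minimal sketch of the four covariance structures and how many parameters each one needs. It assumes the parameterization commonly used in mixed-model software (for example, the ARMA(1,1) correlation gamma * rho^(lag-1)); the function names and the number of occasions t are my own illustrative choices, not taken from the paper.

```python
import numpy as np

def vc_cov(t, sigma2):
    """VC: independent errors with one common variance (1 parameter)."""
    return sigma2 * np.eye(t)

def ar1_cov(t, sigma2, rho):
    """AR(1): correlation rho**|i-j| between occasions i and j (2 parameters)."""
    lags = np.abs(np.subtract.outer(np.arange(t), np.arange(t)))
    return sigma2 * rho ** lags

def arma11_cov(t, sigma2, gamma, rho):
    """ARMA(1,1): off-diagonal correlation gamma * rho**(|i-j|-1) (3 parameters)."""
    lags = np.abs(np.subtract.outer(np.arange(t), np.arange(t)))
    corr = np.where(lags == 0, 1.0, gamma * rho ** np.maximum(lags - 1, 0))
    return sigma2 * corr

def n_params(structure, t):
    """Number of covariance parameters for t measurement occasions."""
    counts = {"VC": 1, "AR(1)": 2, "ARMA(1,1)": 3, "UN": t * (t + 1) // 2}
    return counts[structure]

# Hypothetical example: t = 4 occasions (an assumed value, not from the paper).
for name in ("VC", "AR(1)", "ARMA(1,1)", "UN"):
    print(name, n_params(name, 4))
```

UN is the only structure whose parameter count grows with the number of occasions, which is one plausible reason it would need a larger sample than the more parsimonious structures.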
I think the best performance should come from the correct specification with a large sample. Using AIC to select the correct model is also a good choice, but again, a large sample size is necessary.
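As a rough sketch of how AIC-based selection works, the snippet below uses the standard definition AIC = -2 log L + 2k and picks the structure with the smallest value. The log-likelihoods and parameter counts are made-up numbers for illustration only; they are not results from the paper, and the helper names are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Standard AIC: smaller is better; each extra parameter costs 2 points."""
    return -2.0 * log_likelihood + 2.0 * n_params

def select_by_aic(fits):
    """fits maps a structure name to (log_likelihood, n_params)."""
    scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Made-up values for illustration only; NOT results from the paper.
hypothetical_fits = {
    "VC": (-1250.0, 1),
    "AR(1)": (-1210.0, 2),
    "ARMA(1,1)": (-1200.0, 3),
    "UN": (-1196.0, 10),
}
best, scores = select_by_aic(hypothetical_fits)
print(best, scores)
```

In this invented example UN has the highest likelihood but loses to ARMA(1,1) once its extra parameters are penalized. With a small sample the likelihoods are noisy, so the ranking is less reliable, which is why the large-sample caveat matters.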
One conclusion from the present paper impressed me: in growth curve modeling, variance components should be applied with caution.
One question remains:
It seems that a small sample may cause problems in multilevel modeling. But a sample as small as 30 is not practical for multilevel analysis. A small sample may also be insufficient to detect school effectiveness. Even with a sample as large as 200, as in the present study, the school effect may not be significant if the differences between schools are trivial. Therefore, an additional factor representing the proportion of school variance in the total variance should be manipulated.
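The proportion I have in mind is the usual intraclass correlation: school variance divided by the sum of school and within-school variance. A minimal sketch, with hypothetical function names and assumed example levels (not values from the paper), shows how a simulation could set the school variance to hit a chosen proportion:

```python
def school_variance_proportion(school_var, within_var):
    """Proportion of total variance that lies between schools."""
    total = school_var + within_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return school_var / total

def school_variance_for(target_proportion, within_var):
    """School variance needed to reach a target proportion, given within-school variance."""
    if not 0 <= target_proportion < 1:
        raise ValueError("target proportion must be in [0, 1)")
    return target_proportion * within_var / (1.0 - target_proportion)

# Assumed illustrative levels of the manipulated factor; within-school variance fixed at 1.
for p in (0.05, 0.10, 0.20):
    print(p, school_variance_for(p, 1.0))
```

Crossing such levels with sample size would show whether a sample of 200 is still adequate when schools differ only slightly.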