Chen Wei's readings and review

What I've learned

by LIU CHEN WEI -
Number of replies: 14
to be continued...surprise
In reply to LIU CHEN WEI

Generalizability theory and classical test theory

by LIU CHEN WEI -

The paper introduces two measurement theories, generalizability (G) theory and classical test theory (CTT), and in a later section briefly compares them with item response theory (IRT) on several aspects. The focus is on the important features of each theory and on how to describe the similarities and differences between them. The broadly conceived idea of reliability is then reconsidered within these theories, especially within G theory, in order to quantify and distinguish the sources of inconsistency in observed scores.

First, the author identifies four vital features of the CTT decomposition X = T + E: (1) T and E are both unobserved variables, (2) T is definitely not a platonic true score, (3) E is not a model-fit error or a residual in the traditional statistical sense, and (4) "true" and "error" are not realities to be discovered; they are investigator-specific constructions to be studied.

Next, the reliability coefficients in CTT are explicitly explained. It is noted that estimates of ρ(X, X′) will differ overtly depending on the data-collection design (e.g., a "parallel forms" difference versus a "test-retest" occasion difference). These distinctions cannot be well separated within CTT, but they can within G theory. The author then explains that reliability coefficients play an important role in psychometrics because of their applications in (1) the correction for attenuation and (2) the standard error of measurement (SEM).
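These two applications reduce to one-line formulas. A minimal sketch in Python (the numeric values are made up for illustration): the SEM is σ(X)·sqrt(1 − ρ(X,X′)), and the correction for attenuation divides an observed correlation by the square root of the product of the two measures' reliabilities.

```python
import math

def sem(sd_x, reliability):
    """Standard error of measurement: sigma(E) = sigma(X) * sqrt(1 - rho)."""
    return sd_x * math.sqrt(1.0 - reliability)

def disattenuate(r_xy, rel_x, rel_y):
    """Correction for attenuation: rho(Tx, Ty) = r(X, Y) / sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical values for illustration:
sem_iq = sem(15.0, 0.91)                 # scale SD 15, reliability .91
r_true = disattenuate(0.42, 0.80, 0.70)  # observed r = .42 corrected upward
```

Note how both quantities depend only on the reliability estimate, which is why the choice of design behind ρ(X, X′) matters so much.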

The author clarifies that Cronbach's α was not invented by Cronbach; he merely popularized it. Cronbach's α is a lower limit to reliability only under a set of stringent assumptions; in real-world situations it is therefore more likely to behave as an upper limit to reliability.
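For reference, the usual computational form is α = k/(k−1) · (1 − Σσ²ᵢ / σ²_X). A small self-contained sketch with toy (made-up) data:

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a persons x items score matrix (list of rows):
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(scores[0])
    item_vars = [pvariance(col) for col in zip(*scores)]   # per-item variances
    total_var = pvariance([sum(row) for row in scores])    # total-score variance
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)

# Toy data, made up: 5 persons x 3 items
data = [[3, 2, 3], [4, 4, 5], [2, 2, 1], [5, 4, 4], [3, 3, 2]]
alpha = cronbach_alpha(data)
```

Population variances are used throughout for consistency; with sample variances the ratio, and hence α, is unchanged.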

Then the author briefly introduces Lord's SEM, which serves as a kind of bridge between CTT and G theory: it uses a random-sampling model to estimate σ(E*) rather than relying on parallelism and a within-person design. In the same spirit, G theory replaces strict parallelism with randomly parallel forms.

In univariate G theory, X = μ + E1 + E2 + … + Eh, so the varieties of error are separated, and the investigator must decide which sources of error to include. Here μ is the expected value of observed scores over replications of the measurement procedure. The machinery looks like an ANOVA, but G theory focuses on variance components and their estimation, and the investigator determines whether each effect in the equation is random or fixed. In this way a generalizability coefficient can be defined (Equation 12) that is analogous to the reliability coefficient in CTT. When CTT and G theory are used to analyze the same data, CTT is likely to underestimate error variance because only a single undifferentiated error term is considered. A nested design usually leads to smaller error variances, and the author gives examples of this situation. Furthermore, G theory can be extended to the multivariate case.
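The variance-component logic for the simplest one-facet crossed design (persons × items, one observation per cell) can be sketched from the standard expected-mean-square equations. This is a generic textbook p × i G study, not the paper's own example; the toy scores are made up, and their person-by-item interaction is zero by construction, so the G coefficient comes out at its ceiling of 1.

```python
def g_study_pxi(scores):
    """Variance components for a persons x items (p x i) crossed design with one
    observation per cell, from the usual ANOVA expected mean squares.
    Returns (var_p, var_i, var_res, g_coefficient)."""
    n_p, n_i = len(scores), len(scores[0])
    grand = sum(sum(r) for r in scores) / (n_p * n_i)
    p_means = [sum(r) / n_i for r in scores]
    i_means = [sum(scores[p][i] for p in range(n_p)) / n_p for i in range(n_i)]
    ss_p = n_i * sum((m - grand) ** 2 for m in p_means)
    ss_i = n_p * sum((m - grand) ** 2 for m in i_means)
    ss_tot = sum((scores[p][i] - grand) ** 2
                 for p in range(n_p) for i in range(n_i))
    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_res = (ss_tot - ss_p - ss_i) / ((n_p - 1) * (n_i - 1))
    var_res = ms_res                  # sigma^2(pi,e): interaction confounded with error
    var_p = (ms_p - ms_res) / n_i     # sigma^2(p)
    var_i = (ms_i - ms_res) / n_p     # sigma^2(i)
    g_coef = var_p / (var_p + var_res / n_i)   # relative G coefficient
    return var_p, var_i, var_res, g_coef

# Made-up, purely additive scores (person effect + item effect, no interaction)
scores = [[1, 2, 3], [2, 3, 4], [4, 5, 6], [5, 6, 7]]
var_p, var_i, var_res, g_coef = g_study_pxi(scores)
```

The residual component σ²(pi,e) is exactly the interaction-with-error confounding the review mentions: it is what CTT lumps into a single error term.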

Finally, the author compares CTT, G theory, and IRT. The main difference is that IRT focuses on distinguishing among items, whereas CTT and G theory focus on test or form scores. An overall comparison is outlined in Table 1.

Gains:

1. The paper says that IRT item parameters are commonly treated as fixed. However, as we know, the concept of random items has been developed recently; similarly, Bayesian priors are another way to turn fixed item parameters into random variables.

2. We should choose the measurement theory among the three that suits our data. If we are interested in item responses, IRT is a good choice; if we are interested in test scores, CTT or G theory should be used.

3. G theory handles multiple-facet designs well; on the IRT side, some facet-type IRT models have been developed as well.

4. Incorporating G theory into an IRT model may be a good way to integrate multiple facets.

In reply to LIU CHEN WEI

Generalizability in Item Response Modeling

by LIU CHEN WEI -

Brief summary:

Generalizability theory (GT), a method for analyzing large-scale sources of error, and IRT, which focuses on the item response function, are applied in tandem in an approach called GIRM. GT separates several "facets" (sources of measurement error) such as items, raters, and test forms, and views them as sampled from a universe (distribution); note that all effects are assumed independent in this study. The less of the observed-score variance is due to sampling instances of the item facet and its interactions, the more confident we can be in generalizing the measurement procedure. With GT, we can disentangle the contributions of distinct sources of measurement error and predict reliability as a function of sample size. In the IRT part, the person and item parameters are all assumed to be randomly drawn from populations of interest, so MCMC is a good choice for producing posterior distributions for the item parameters and the variance components.

In the first simulation study, both GIRM and GT were applied to data simulated under GT. The results of the two approaches are similar, except that the variance of the person-by-item interaction and the error term cannot be separated in GT. It was shown that the generalizability coefficients in GIRM are likely to be biased downward because the assumption of uncorrelated random effects is violated within the Markov chain; an approach to partially adjust for this underestimation was also proposed, and a more effective solution is left to future work. The second simulation examined the sensitivity of parameter estimation to the prior distributions. The sensitivity to misspecification of the IRT model was considered next, showing that the GIRM approach is more robust than GT. Finally, empirical data were analyzed with the GIRM approach.

To sum up, the GIRM approach directly provides both GT and IRT parameter estimates, requires no modification of the formulas in the presence of missing data, and can separate the person-by-item interaction effect. We can therefore view the data simultaneously from the broader (variance components) and the more specific (item response function) perspectives.

Questions:

1. Once item parameters are assumed to be random effects, how much information can we gain from this approach, especially when the number of items is small (e.g., 10 or fewer)?

2. If the design of the instrument is not balanced, is the GIRM approach still robust?

Further ideas:

1. The approach could be extended to the polytomous case, but the link between IRT parameters and GT variance components has not yet been derived there.

2. GT may extend to the multi-facet case, but the model becomes much more complicated.

3. In the same way, an unfolding model could be combined with GT if large measurement error is suspected.

In reply to LIU CHEN WEI

SELECTION OF ITEM RESPONSE MODEL BY GENETIC ALGORITHM

by LIU CHEN WEI -

Brief summary:

In general, a single model is applied to a whole set of items under the assumption that every item conforms to the same model. The author argues that the response process may differ across items even when the response format is identical, so it is possible to apply a different model to each item. A simple genetic algorithm (GA) is employed as an exploratory method to identify, for all items simultaneously, which model is most appropriate for each. A GA is a search heuristic that mimics the process of natural evolution.

The basic ingredients of a GA are the chromosome, the initial population, fitness, selection, crossover, and mutation. A chromosome is a vector containing a model indicator for each item, and there are G possible chromosomes in total; an initial population of size G was generated for the first step. The most important part is the fitness criterion: AIC, consistent AIC, and Schwarz's BIC were chosen as information criteria. There are several selection methods in GAs, such as roulette-wheel selection and tournament selection; the aim of selection is that individuals survive with high probability if and only if they have high fitness. Once promising individuals have been selected by fitness, crossover and mutation are carried out, mimicking natural evolution, and the process is iterated until a stopping criterion is met.
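The GA loop described above can be sketched generically. Everything below is a hypothetical stand-in: the fitness function simply counts matches to a made-up "true" model assignment, whereas the paper would refit the IRT model for each candidate chromosome and score it with AIC/CAIC/BIC.

```python
import random

random.seed(1)

N_ITEMS, MODELS = 8, (0, 1)        # 0 = "GRM", 1 = "GPCM" (stand-in labels)
TRUE = [0, 1, 0, 0, 1, 1, 0, 1]    # hypothetical true model per item

def fitness(chrom):
    # Stand-in for an information criterion: number of correctly identified
    # items. A real GA would refit the model mix and use AIC/CAIC/BIC here.
    return sum(c == t for c, t in zip(chrom, TRUE))

def tournament(pop, k=2):
    # Tournament selection: the fitter of k randomly drawn individuals wins
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    cut = random.randrange(1, N_ITEMS)   # one-point crossover
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.05):
    return [random.choice(MODELS) if random.random() < rate else c for c in chrom]

pop = [[random.choice(MODELS) for _ in range(N_ITEMS)] for _ in range(30)]
for _ in range(60):                      # fixed iteration budget
    pop = [mutate(crossover(tournament(pop), tournament(pop))) for _ in pop]
best = max(pop, key=fitness)
```

Tournament selection is used here only because it needs no fitness scaling; roulette-wheel selection, which the paper also mentions, would work the same way.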

In the simulation study, only the GRM and GPCM were employed to generate item responses, and the EM algorithm was used for parameter estimation. The hit rate per gene (matching the true gene) and the hit rate per locus (the overall rate) were reported. As expected, larger sample sizes and larger populations of individuals lead to higher hit rates. Item-parameter estimation was also examined, and the parameters were recovered well; hit rates for model recovery (selecting the right model for each item) were reported as well. Finally, a real data set was analyzed with the same setup as the simulation study.

Opinion and Questions:

1. Computing the fitness is computationally intensive for IRT models because of the iterations within the GA.

2. How many candidate models should be considered is arbitrary.

3. The GRM and GPCM are both dominance models and are similar to each other. Once an item is identified as GRM (or GPCM), I think all we have learned is that its item response function is closer to the GRM; nothing more. The author may simply be showing how the GA works in a simple case.

Further ideas:

1. A more advanced GA could be used instead of the simple GA.

2. A more general model, such as a testlet GPCM, could be employed as a candidate model when local independence may not hold.

3. Among unfolding models, some (e.g., the GHCM and GGUM) assume that a person may still respond "disagree" to a statement, with random error, even when his or her location is identical to the item's location; by contrast, PARELLA and Coombs' unfolding model assume a person will certainly answer "agree" when the person's location coincides with the item's location. Thus some items may conform to the GHCM and others to PARELLA, and the method proposed in this paper may be useful in that case.

4. The GA might also be used as an exploratory method to identify anchor items in DIF studies.

In reply to LIU CHEN WEI

The Influence of Dimensionality on Parameter Estimation Accuracy in the Generalized Graded Unfolding Model

by LIU CHEN WEI -

Brief summary:

The aim of the study is to investigate the effect of an extra dimension on item- and person-parameter estimation in the unidimensional GGUM, that is, to evaluate the degree to which GGUM parameters are robust to violations of the dimensionality assumption. The study considers a simple condition (only two dimensions) in a simulation study. The expectation is that a weakly correlated second dimension will bias positive parameter estimates upward and negative ones downward, especially at the extreme ends of the continuum.

Three factors were manipulated in the simulation: the number of items determined by the second attribute, the correlation between the two attributes, and the sample size; an ANOVA was then conducted on these three variables. The simulated parameters were similar to those in the Roberts (2000) study and were generated using SPSS, and the simulated responses were calibrated with the software GGUM2004. The percentage of items determined by the second attribute was 25% or 50%; sample size was 500, 1000, or 2000; and the interfactor correlation was .03, .3, .6, or .9. First, the number of instances of singularities was summarized in Table 3; they occurred especially in the correlation-.03 condition, and unreasonably large estimates were treated as missing values in the subsequent analyses. For the estimation of theta, as expected, RMSE was larger when 50% of the items were determined by the second attribute and when the correlation between attributes was lower. The error of the estimates for theta two was generally larger than for theta one, and RMSE for the item parameters was generally larger when the correlation was lower.

Noticeably flat item response functions and/or unacceptable parameter estimates may be indicative of multidimensionality in a scale (though not conclusively).
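For context, the dichotomous GGUM response function is single-peaked around the item location δ, which is what makes a flat or degenerate curve diagnostic. A sketch of the function as usually written (with C = 1, so M = 2C + 1 = 3, and τ₀ = 0 by convention); the parameter values below are made up:

```python
import math

def ggum_p_agree(theta, alpha, delta, tau1):
    """P(Z = 1 | theta) for a dichotomous GGUM item (C = 1, M = 2C + 1 = 3).
    Each category's numerator sums two exponentials, for z and M - z, which
    makes the Agree curve single-peaked around delta."""
    d = theta - delta
    num0 = 1.0 + math.exp(alpha * 3.0 * d)           # z = 0: exp(0) + exp(a*3d)
    num1 = (math.exp(alpha * (1.0 * d - tau1))
            + math.exp(alpha * (2.0 * d - tau1)))    # z = 1, with tau_0 = 0
    return num1 / (num0 + num1)

# Agreement peaks when the person sits at the item location (theta == delta)
p_at = ggum_p_agree(0.0, 1.2, 0.0, -1.0)
p_far = ggum_p_agree(3.0, 1.2, 0.0, -1.0)
```

The curve is symmetric about δ, so persons one unit above and one unit below the item location have the same agreement probability.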

Questions:

1. Hoffman's (1959) model was used to generate the multidimensional responses. The paper is quite old, and I am not sure whether it is a compensatory model or not.

2. GGUM2004 uses MML and EAP for item- and person-parameter estimation and assumes that the person distribution is normal. Further study should consider the condition where the person distribution is not normal.

Further ideas:

1. Only the between-item condition was considered in this study; the within-item condition for parameter estimation remains unclear.

2. It would be desirable to set up an objective criterion for judging which items may measure the additional dimension.

In reply to LIU CHEN WEI

Assessing Personality Traits Through Response Latencies Using Item Response Theory

by LIU CHEN WEI -

Brief summary:

The author modeled response latencies with a linear regression function that includes the time demand of an item, the general response speed of a person, and a latent response function (e.g., a two-parameter logistic model). The model was used to inspect the inverted-U effect, whereby faster response latencies occur when adjectives are rated as extremely unlike or extremely like the respondent, compared with neutral items. An information index for response latencies was established to show that employing response latencies can increase the information about the latent trait of interest.
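The paper's exact latency equation is not reproduced here; the following is a hypothetical sketch of the inverted-U idea only, using the Bernoulli response uncertainty p(1 − p) under a 2PL as the distance term, so that predicted latency peaks where the response probability is near .5. All parameter values are invented.

```python
import math

def p_2pl(theta, a, b):
    """Two-parameter logistic response probability."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def expected_latency(theta, a, b, time_demand, speed, weight):
    """Hypothetical inverted-U latency sketch (not the paper's exact equation):
    latency is longest when the response is most uncertain, i.e. when P is
    close to .5. Uses the uncertainty term p*(1-p), maximal at p = .5."""
    p = p_2pl(theta, a, b)
    return time_demand - speed + weight * p * (1.0 - p)

# Latency peaks for a "neutral" person (theta == b) and shrinks at the extremes
lat_neutral = expected_latency(0.0, 1.5, 0.0, 2.0, 0.3, 4.0)
lat_extreme = expected_latency(3.0, 1.5, 0.0, 2.0, 0.3, 4.0)
```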

An empirical data set of 171 respondents was analyzed with the model using a three-stage estimation method. First, a Rasch model was fitted to the data to obtain estimates of the latent trait (treated as fixed in the next stage, so the response probability is fixed as well). The second step was a linear regression analysis examining the significance of the item-specific regression weight for each item; items with non-significant weights were discarded. The final step was a multilevel model separating individual differences in response speed from an unsystematic part due to pure error. The results provided partial evidence for the latency effect in a personality questionnaire. Future work should consider additional sources of error in this model, such as item length or emotionally evocative content.

Questions:

1. The sample size is very small, and fewer and fewer respondents were used at each stage of the analysis, so the goodness of the data-model fit is quite suspect.

2. No simulation study was conducted to investigate how well the parameters are recovered.

Further ideas:

1. A one-stage estimation method should be developed in place of the multi-stage method, so that the sources of error can be separated simultaneously (e.g., in WinBUGS).

2. Personality scales have recently been analyzed with unfolding models, so it is natural to develop a response-latency model under an unfolding framework.

3. I think controversial issues, such as abortion or the death penalty, may show a strong inverted-U effect.

In reply to LIU CHEN WEI

A Comparison of the LR and DFIT Frameworks of Differential Functioning Applied to the Generalized Graded Unfolding Model

by LIU CHEN WEI -

The author adapted DIF-detection techniques to an unfolding model. Two traditional parametric methods were used: the likelihood ratio (LR) test and differential functioning of items and tests (DFIT). LR is a model-comparison method, while DFIT compares the groups' respective expected scores. Only simulations were conducted in this study, to see whether the two DIF-detection methods are useful with an unfolding model. The relative sample sizes of the reference and focal groups, the number of items containing DIF, and the source of DIF (which item parameters exhibit DIF) were manipulated in the simulation; the other item-parameter settings and the latent-trait distributions were similar to previous studies.

The results indicated that LR performed much better than DFIT in both detection rates and false-positive rates. As sample size increases, detection rates improve, but false-positive rates do not.
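The LR side of the comparison is just a nested-model chi-square test: fit a compact model with the studied item's parameters constrained equal across groups, fit an augmented model with them free, and compare −2ΔlogL to a chi-square critical value. A generic sketch (the log-likelihood values are made up; 7.815 is the .05 chi-square critical value for df = 3, e.g., three GGUM item parameters freed across groups):

```python
def lr_dif_test(loglik_constrained, loglik_free, crit):
    """Likelihood-ratio DIF test: G2 = -2 * (logL_constrained - logL_free),
    flagged when G2 exceeds the chi-square critical value whose df equals
    the number of item parameters freed across groups."""
    g2 = -2.0 * (loglik_constrained - loglik_free)
    return g2, g2 > crit

# Made-up log-likelihoods for the compact and augmented models
g2, flagged = lr_dif_test(-5123.4, -5117.1, crit=7.815)
```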

Qs:

1. The focal and reference groups were both generated from a standard normal distribution, which may be unrealistic. This is perhaps because the software GGUM2004 does not allow users to place constraints freely on parameter estimation.

2. No anchor items were assumed in this study and no purification methods were used.

3. No empirical data were demonstrated in this study. We are in the same boat!

4. DFIT requires linking the focal group's item and person parameters onto the scale of the reference group to facilitate comparison. Doesn't that require common items?

5. As far as I can see, the author simply compared the LR and DFIT methods in an unfolding model, arguing that DFIT is not appropriate for an unfolding mechanism because DFIT is tied to sum scores (i.e., a monotone mechanism). The results ultimately supported this point, so LR seems effective for unfolding models.

6. It is argued that raw-score methods such as the MH method are not suitable for unfolding models. It seems the only way is to distinguish persons on the positive and negative sides via the latent trait?

7. It is strange to me that the detection rates are much lower whenever there is DIF in the slope parameter; the authors gave no explanation.

8. We could apply anchor-item and purification methods in unfolding models for empirical applications.

In reply to LIU CHEN WEI

explanatory secondary dimension modeling of latent differential item functioning

by LIU CHEN WEI -

The authors developed several mixture one- or two-dimensional IRT models for detecting DIF items. The main idea is to attribute the source of DIF to a secondary dimension distinct from the primary dimension (i.e., ability). The nuisance dimension is assumed to exist only in the DIF class, not in the non-DIF class; that is, the nuisance dimension can be interpreted as a random effect, or as other skills that help solve the problem but are unrelated to ability, though it may also arise from local dependence. The proposed model can be used in either an exploratory or a confirmatory way: if we have no idea about the respondents' latent class membership, the model is exploratory; on the other hand, it can be confirmatory if we know a grouping variable for each person (e.g., gender).

Seven different mixture, multidimensional, or mixed IRT models were fitted to two real data sets, with AIC and BIC used for model comparison. The results showed that the proposed model (with the second dimension used in the DIF class only) achieved the better model-data fit. As for explaining the source of DIF, the first example showed that speededness can be regarded as a source of DIF; in the second example, the arithmetic operations required by the items accounted for the source of DIF.

Where should the nuisance dimension arise in real data analysis? I think it may depend on the respondents' solution strategies or prior knowledge; with that, we can build a reasonable explanation for the source of DIF. In the worst case, it may be a hard task to decide.

In reply to LIU CHEN WEI

Analyzing ipsative data in psychological research

by LIU CHEN WEI -

The paper introduced and differentiated three kinds of ipsative data analyzed by factor-analytic methods. The first is called additive ipsative data (AID), used to reduce response-set biases such as acquiescence and social desirability: it transforms Likert-scale data into ipsatized data by centering each individual's raw scores, so that the sum of scores is the same for every person; however, methods for analyzing such data are limited. The second type is called multiplicative or compositional ipsative data (MID): when a respondent is asked to indicate his or her own expenditures, only percentages are measured, so the percentages sum to one for everyone. This may reduce bias when items are sensitive or personal, and it also avoids response-set bias, though such bias is hard to uncover in practice. The third, ordinal ipsative data (OID), includes rankings of objects and can effectively control both additive and multiplicative response-set biases. The author suggested that choosing ipsative data for analysis is preferable when normative measures are contaminated or unavailable.
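The additive and compositional transformations are simple row operations; a minimal sketch with made-up Likert responses:

```python
def additive_ipsatize(scores):
    """AID: center each person's raw scores at the person mean, so every
    row sums to zero after transformation."""
    out = []
    for row in scores:
        m = sum(row) / len(row)
        out.append([x - m for x in row])
    return out

def compositional_ipsatize(scores):
    """MID: express each person's scores as proportions of the row total,
    so every row sums to one."""
    return [[x / sum(row) for x in row] for row in scores]

raw = [[4, 2, 3, 1], [5, 5, 4, 2]]   # made-up Likert responses
aid = additive_ipsatize(raw)          # each row now sums to 0
mid = compositional_ipsatize(raw)     # each row now sums to 1
```

The constant row sums are exactly what makes the covariance matrix of ipsative scores singular, which is why one variable may have to be dropped before a standard CFA.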

A factor-analytic method (PCA) was employed and assessed on the three kinds of data in a simulation study. For all kinds of ipsative data, the solutions failed to recover the factor structure and led to bipolar factors.

Then, given the structure of additive ipsative data (AID), the raw score y was linked to a confirmatory factor-analytic model. The AID model was reparameterized via a centering-type design matrix (Chan and Bentler, 1993), and the new AID model can then be readily analyzed by standard CFA; note that one variable may need to be deleted to avoid a singular covariance matrix. The results showed that the AID model can be recovered well. The AID model has also been extended to partially additive ipsative data, where normative and ipsative measures are combined (Chan and Bentler, 1996).

For the recovery of the OID model, the OID model was likewise transformed via a design matrix. The loading matrix and the error covariance matrix were changed and carry the necessary constraints among elements within each matrix. The results showed that the parameter estimates can be recovered well via PML with a two-stage algorithm.

Real data from the Chinese Personality Assessment Inventory (CPAI) were demonstrated: two factors and 1585 adult participants were selected for analysis, and the data were converted into AID. Results showed that the AID model generally fits better than the normative model.

Comments:

1. Only the recovery of the general factor structure was considered, not the individuals' latent traits.

2. No negatively keyed items were considered in these models.

3. Models combining partially ipsative and normative data should be developed with a focus on estimating the latent traits. Their cited paper (Chan and Bentler, 1996) may be a starting point, together with forced-choice IRT models.

In reply to LIU CHEN WEI

item response modeling of paired comparison and ranking data

by LIU CHEN WEI -

This article demonstrates paired comparison under the framework of Thurstonian IRT and its application to real data. Only a unidimensional latent trait was modeled. The design of the model is similar to the forced-choice multidimensional IRT model (Brown & Maydeu-Olivares, 2011); the difference is that an error term is modeled for intransitive responses, i.e., inconsistent responses when ranking objects. The identification of the Thurstonian factor model and the Thurstonian IRT model was discussed as well: different choices of constraints preserve the same estimates of the intercept and slope parameters. A limited-information method was used for estimation, carried out in Mplus.

In the simulation study, only mixed-keyed items were considered. It shows that a sample size of 1000 observations is accurate enough when 6 items are used. An interesting result is that the intercepts and slopes are estimated accurately in all conditions, even when the residual variances of the item pairs are poorly estimated, and ignoring the local dependencies has only a small effect on latent-trait estimation.

A remark is that varying discriminations across item pairs produce large differences between latent-trait estimates, especially with mixed-keyed items; this is why the latent trait can be located on the underlying continuum.

Comments:

1. The use of the slope parameter seems effective for estimating a person's location, but the requirement may be unrealistic: we have no strong reason to believe that the slope parameters within an item pair differ greatly. Without differing slopes, one person with trait scores of 2 and 1 and another with trait scores of 1 and 0 will yield the same probability of selecting either object. And it is known that the slope parameter may partly reflect randomness.

2. Adding rating-scale items may be helpful, but it would first be necessary to show that the rating-scale and ipsative metrics are the same, and acquiescence cannot be avoided in rating scales in practice.

3. Beyond the above, it is worth trying to add rating-scale items to an ipsative scale for the estimation of the latent trait.

In reply to LIU CHEN WEI

Hyperbolic Cosine Latent Trait Models for Unfolding Direct Responses and Pairwise Preferences

by LIU CHEN WEI -

The hyperbolic cosine model (HCM) is a dichotomously scored IRT unfolding model; it can also be used to construct a model for pairwise preferences. The HCM was derived from the general form of the Rasch model with three categories, say 0, 1, and 2: categories 0 and 2 are both treated as Disagree responses (not distinguishable from each other), and category 1 reflects an Agree response. The parameter estimation method is JML, carried out in two stages: first a single "latitude of acceptance" parameter was estimated, and then the unit parameter of each item was estimated freely but constrained to have a mean equal to the latitude-of-acceptance estimate from the previous stage. The test of fit is formalized as a Pearson chi-square statistic. Finally, an empirical data set was analyzed with the HCM.

When a pair comparison is governed by an unfolding response process, it is often termed a pairwise preference. The model for pairwise preference assumes a latent trait for the person and a location parameter for each of the two items. The four possible response patterns are (0,0), (0,1), (1,0), and (1,1), and the probability of each is a product of independent HCM probabilities. The problem is that the person's preference is indistinguishable when the response is (0,0) or (1,1); the model assumes that the person then considers the pair again until the response is (0,1) or (1,0), so the (0,0) and (1,1) responses are neglected. The model gives probability .5 when the trait is at the midpoint of δi and δj, or when δi = δj. Strong stochastic transitivity holds in this model. Parameter estimation is again by JML, and the empirical data were reanalyzed using the HCM for pairwise preference.
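Both pieces can be sketched directly. The Agree probability below follows the Andrich-Luo hyperbolic cosine form, and the pairwise-preference probability renormalizes two independent HCM responses over the distinguishable patterns (1,0) and (0,1). Assuming an equal unit parameter ρ for both items is a simplification made here, which is what makes the preference reduce to a ratio of hyperbolic cosines.

```python
import math

def hcm_agree(theta, delta, rho):
    """HCM probability of an Agree response:
    P(x = 1) = exp(rho) / (exp(rho) + 2*cosh(theta - delta))."""
    return math.exp(rho) / (math.exp(rho) + 2.0 * math.cosh(theta - delta))

def prefer_i_over_j(theta, delta_i, delta_j, rho=1.0):
    """Pairwise preference built from two independent HCM responses,
    renormalized over the distinguishable patterns (1,0) and (0,1)."""
    pi, pj = hcm_agree(theta, delta_i, rho), hcm_agree(theta, delta_j, rho)
    num = pi * (1.0 - pj)
    return num / (num + pj * (1.0 - pi))

# Single-peaked: agreement is maximal at theta == delta
p_peak = hcm_agree(0.0, 0.0, 1.0)        # exp(1) / (exp(1) + 2)
# At the midpoint of the two item locations the preference is .5
p_mid = prefer_i_over_j(0.0, -1.0, 1.0)
```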

The data were analyzed both as direct responses and as pairwise preferences; the resulting scale values show a high correlation between the latent traits and provide similar results.

Q:

1. The probabilities of (0,0) and (1,1) are formulated in different forms, meaning they will not be equal for two persons with the same trait (although the two functions were not actually used). Note that the function for (1,1) does not involve the latent trait.

2. If we could observe the reconsideration behavior when the person's initial response is (0,0) or (1,1), it might be possible to solve the corresponding problem in ipsative data: after reconsideration, the final probability of (1,0) might be a combination of the (1,0)-and-(0,0) or (1,0)-and-(1,1) paths.

In reply to LIU CHEN WEI

The Extra-Factor Phenomenon Revisited: Unidimensional Unfolding as Quadratic Factor Analysis

by LIU CHEN WEI -

The paper demonstrates why data generated from a unidimensional unfolding model, when analyzed by linear factor analysis, lead to an extra, spurious factor. Through manipulation of the linear and quadratic factor models, it shows that the unidimensional unfolding model is mathematically equivalent to a quadratic factor model. So how can we distinguish whether items conform to a two-dimensional linear factor model or a unidimensional unfolding model? Several conditions are considered: the factor-score distribution, the covariance and correlation matrices, the factor loadings, and the partial correlations. It turns out that the two can be distinguished by the factor-score distribution when the unfolding model is fallible (i.e., contains random error). The simulation study shows that the parameters are estimated well when the residual variance is small and the sample size is not small, and a scatterplot of the Bartlett factor-score variates shows a parabolic curve.

Comments:

1. I need to review some multivariate-analysis textbooks; the mathematics here is complex and a bit difficult.

2. The paper only addresses the simplest, unidimensional unfolding model. What we learn is that unfolding data should not be analyzed by linear factor analysis.

3. We have to find criteria that help us judge whether an item is unfolding or not if IRT unfolding models are to be used.

4. When multidimensional unfolding data are analyzed, we should first use exploratory techniques to identify the structure. I am still looking for possible methods in the context of multidimensional scaling.

In reply to LIU CHEN WEI

Nonlinear Principal Components Analysis: Introduction and Application

by LIU CHEN WEI -

The article explains the use of nonlinear principal components analysis (NPCA) for nominal- and ordinal-scale items as well as numeric variables. The term "nonlinear" means that variables at each analysis level (nominal or otherwise) are optimally scaled during iterations of the traditional linear PCA; that is, the category values are re-assigned numeric values by optimal scaling. The maximum proportion of variance accounted for (VAF) is found when the first p eigenvalues are computed from the correlation matrix of the optimal scores. Whatever the analysis level of each variable, the category quantifications are updated during the iterations until the VAF is maximized. The transformations of the category scores can be step functions or smooth functions. When nominal variables are used and the transformation is irregular, or the original categories cannot be put in any meaningful order, multiple nominal quantification is required.

The difference between PCA and NPCA is that the optimal scaling is estimated and the PCA conducted in alternation. They are similar in objective, method, results, and interpretation; the two exceptions are the optimal scaling of the categories and the fact that the components derived from NPCA are not nested. Non-nestedness means the NPCA solution is maximized for the first p components jointly, whereas in PCA the maximization is consecutive and simultaneous across all components. The number of components is traditionally chosen from the scree plot, but parallel analysis is an alternative.

Finally, real data (including a nominal variable) were analyzed by both PCA and NPCA. The VAF from NPCA is larger than that from PCA, and two components were retained based on the scree plot.

Comments:

1. It seems NPCA only processes the categories of the variables optimally. However, the cumulative property of a variable's categories does not hold under an unfolding process, so persons with different latent traits cannot be distinguished by the optimal-scaling method. I am searching for a more appropriate method (other than MDS) for distinguishing items from different dimensions in the between-item condition.

In reply to LIU CHEN WEI

spectral methods for dimensionality reduction

by LIU CHEN WEI -

A few representative spectral methods for dimensionality reduction have been developed over the decades. Their aim is to reduce high-dimensional data to a low-dimensional structure in order to discover the data's essential traits. The classic methods, principal components analysis (PCA) and metric multidimensional scaling (MDS), are linear dimensionality-reduction techniques: PCA works from the covariance matrix of the data, while MDS preserves the inner products between (all pairs of) input patterns, but their ability to reduce the data and the results obtained are similar. Isomap tries to preserve the pairwise distances between input patterns as measured along the submanifold from which they were sampled, and can be regarded as a variant of MDS. Maximum variance unfolding (MVU) tries to preserve the distances and angles between nearby input patterns while maximizing the variance of the outputs. Locally linear embedding (LLE) constructs a graph whose edges indicate (local) nearest-neighbor relations; these small linear "patches" lie on a low-dimensional submanifold, and a cost function is minimized for the target dimensionality. Laplacian eigenmaps try to preserve proximity relations. Kernel PCA maps the data into a higher-dimensional feature space and decomposes the kernel matrix (the data's inner products in that space); the question of how to choose an appropriate kernel function depends on the purpose. The methods above can all be regarded as instances of kernel PCA.

1. When analyzing unidimensional data generated from a PCM or an unfolding model, Isomap and MVU seem able to judge the unidimensional structure of the data effectively, but they may not be effective with multidimensional data.

2. Multidimensional data may be mapped into too low a dimension, and the original structure of the data may not be preserved, because these methods try to preserve local rather than global structure.

3. Discriminating the structure of multidimensional data, especially under an unfolding model, is somewhat difficult. We need to know its nonlinear structure and use a method that preserves that structure while mapping it onto an appropriate manifold. I am still looking.

In reply to LIU CHEN WEI

Parallel analysis with unidimensional binary data

by LIU CHEN WEI -

The aim of the paper was to investigate whether the factor can be correctly identified by applying parallel analysis (PA) to unidimensional, dichotomous (2-point) IRT items. In the simulation study there were four independent variables: the number of items (8 or 20), the slope parameters, the item locations, and the form of the correlation matrix. Both phi and tetrachoric correlations were used because all indicators were dichotomous. For each condition, 1000 random data sets were generated for the PA reference distribution, and the experiment was replicated 500 times. Note that no non-Gramian matrices were obtained when the sample size was 500 or 1000.

Overall, the factor loadings had the greatest impact on PA. Sample size affected PA based on phi correlations, and item location affected PA based on tetrachoric correlations. The results also showed that larger sample sizes, higher factor loadings, and more balanced proportions in the two response categories lead to better PA performance, and that the 95th- and 99th-percentile criteria yielded better results than the mean criterion.
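The PA recipe itself is short: compare the observed eigenvalues against a percentile of eigenvalues from random data of the same shape. The sketch below is deliberately minimal, in pure Python, and checks only the first eigenvalue via power iteration (a full PA compares every eigenvalue); the demo data are made up, with one common factor driving six indicators.

```python
import random

def largest_eigenvalue(mat, iters=200):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration."""
    n = len(mat)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(abs(x) for x in w)
        v = [x / lam for x in w]
    return lam

def correlation_matrix(data):
    """Pearson correlation matrix (phi, when the data are binary) of a
    cases x variables data list."""
    n, k = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]
    sds = [(sum((row[j] - means[j]) ** 2 for row in data) / n) ** 0.5
           for j in range(k)]
    return [[sum((row[a] - means[a]) * (row[b] - means[b]) for row in data)
             / (n * sds[a] * sds[b]) for b in range(k)] for a in range(k)]

def parallel_analysis_first(data, n_random=50, percentile=0.95, seed=7):
    """Minimal PA sketch for the FIRST eigenvalue only: retain a factor when
    the observed first eigenvalue exceeds the chosen percentile of first
    eigenvalues from random data of the same shape."""
    rng = random.Random(seed)
    n, k = len(data), len(data[0])
    observed = largest_eigenvalue(correlation_matrix(data))
    rand_eigs = sorted(
        largest_eigenvalue(correlation_matrix(
            [[rng.random() for _ in range(k)] for _ in range(n)]))
        for _ in range(n_random))
    threshold = rand_eigs[int(percentile * n_random) - 1]
    return observed, threshold, observed > threshold

# Demo: one common factor drives six made-up continuous indicators
gen = random.Random(0)
data = [[t + 0.3 * gen.gauss(0.0, 1.0) for _ in range(6)]
        for t in (gen.gauss(0.0, 1.0) for _ in range(200))]
observed, threshold, retain = parallel_analysis_first(data)
```

A tetrachoric version, which the paper also examines, would only replace `correlation_matrix`; the comparison logic stays the same.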

1. It is based on factor analysis, so it is a model-based approach. And note that the data analyzed were generated from the true model.

2. It is restricted to dominance models, for which linear factor analysis is a good choice.

3. An appropriate method for analyzing dominance and unfolding models together has to be developed before PA can be used there.