Can Subject Matter Experts’ Ratings of Statement Extremity be used to Streamline the Development of Unidimensional Pairwise Preference Scales?
Noncognitive testing is widely used in organizational settings. However, item pools are sometimes small. In such cases, a pairwise preference format can enlarge the pool by pairing different statements. This paper focuses on three efficiency gains: a) using subject matter experts' (SME) ratings in place of marginal maximum likelihood (MML) estimation does not badly bias parameter recovery and sidesteps the estimation problems that arise with short tests; b) pairwise preference scales enlarge the item pool, for instance, 20 and 50 statements can be recombined into 190 and 1225 items, respectively; and c) under the ZG model each statement is characterized by a single location parameter, which can be calibrated with a much smaller sample because there are so few parameters. Two studies examined the correspondence between SME and MML location estimates for unidimensional pairwise preference personality scales under non-adaptive and adaptive testing. The results showed that the MML method had somewhat better RMSE than the SME method across all correlation settings. Moreover, the correlations between the respective trait scores were above .9, indicating that decisions made using either set of scores, SME or MML, would be essentially the same.
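The counts in point b) follow from pairing every statement with every other statement once, i.e. n(n-1)/2 unordered pairs. A minimal sketch of that arithmetic is given here; the function name `pairwise_item_count` is a hypothetical helper for illustration, not something taken from the paper.

```python
from math import comb

def pairwise_item_count(n_statements: int) -> int:
    """Number of unordered statement pairs, n * (n - 1) / 2."""
    return comb(n_statements, 2)

# The two pool sizes mentioned in the summary.
for n in (20, 50):
    print(n, "statements ->", pairwise_item_count(n), "pairwise items")
```

This counts every possible pair; it assumes no pairs are excluded, for example for having statements that are too similar in location.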
Questions:
1) In Table 5, why does the SME method show negative bias (underestimation), whereas the results for the MML method are all overestimated?
2) In Table 2, the experts rate the location of each statement in the SME method. Although the correlation between the average SME rating and the MML parameter estimate may be high, the two values for a given item look quite different. I suspect the reason is that ratings are made in steps of 1, with no decimals. Would allowing the experts to rate with decimals reduce the difference?
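To make both questions concrete, the sketch below computes signed bias and RMSE, and compares whole-number ratings with one-decimal ratings. The numbers are hypothetical values invented for illustration, not data from the paper, and the sketch assumes the ratings and the reference values are on the same metric, which the paper's tables may not satisfy.

```python
from math import sqrt

def bias(estimates, targets):
    """Mean signed error: negative means underestimation on average."""
    return sum(e - t for e, t in zip(estimates, targets)) / len(targets)

def rmse(estimates, targets):
    """Root mean squared error between estimates and targets."""
    return sqrt(sum((e - t) ** 2 for e, t in zip(estimates, targets)) / len(targets))

# Hypothetical reference locations (illustration only, not from the paper).
reference = [1.23, 2.61, 3.44, 4.68, 5.79]

# The same values as an expert might record them at two granularities.
whole_number = [round(x) for x in reference]
one_decimal = [round(x, 1) for x in reference]

rmse_whole = rmse(whole_number, reference)
rmse_tenth = rmse(one_decimal, reference)
print("RMSE, whole-number ratings:", rmse_whole)
print("RMSE, one-decimal ratings: ", rmse_tenth)
print("Bias, whole-number ratings:", bias(whole_number, reference))
```

Under these assumptions, finer rating steps shrink the rounding component of the discrepancy. A difference in scale between the rating metric and the MML metric would not be reduced this way, and it would also leave the correlation high while the raw numbers look different.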