The Angoff procedure is one of the most widely used standard-setting methods, and it has several derivatives. In the original method, recruited panelists judge, for each item, the probability that an imagined minimally competent examinee would answer it correctly (or, more precisely, the expected item score). Each panelist's judgments are then summed to give that panelist's cut-score on the number-correct scale, and the mean or median across panelists is taken as the final group cut-score.
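The computation described above can be sketched in a few lines. This is only an illustration: the ratings matrix and the function names are hypothetical and are not taken from the article.

```python
from statistics import mean, median

# Hypothetical Angoff ratings (not from the article):
# one row per panelist, one column per item.
ratings = [
    [0.60, 0.75, 0.40, 0.90],
    [0.55, 0.80, 0.35, 0.85],
    [0.70, 0.70, 0.50, 0.95],
]

def panelist_cut_scores(ratings):
    """Sum each panelist's item judgments to get a number-correct cut-score."""
    return [sum(row) for row in ratings]

def group_cut_score(ratings, use_median=False):
    """Aggregate the panelist cut-scores by mean (default) or median."""
    cuts = panelist_cut_scores(ratings)
    return median(cuts) if use_median else mean(cuts)

if __name__ == "__main__":
    print(panelist_cut_scores(ratings))
    print(group_cut_score(ratings))
    print(group_cut_score(ratings, use_median=True))
```

With an odd number of panelists the median is simply the middle panelist's summed judgment, so the two aggregation rules can differ whenever the panelist cut-scores are skewed.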
The article addresses an easily overlooked question: how precise should panelists' judgments be? Through a series of simulations, three rounding conditions were investigated: rounding judgments to the nearest whole number, to the nearest 0.05, and to two decimal places. The results suggest that rounding to the nearest whole number can produce large biases in the cut-score estimates.
Page 232 states, “Finally, the Newton-Raphson procedure was used to compute the estimated cut-score in the theta metric for each panelist……” Does this mean that the correspondence between the summed judgments and theta was established through the expected score curve?
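To make the question concrete, one reading of that sentence is that each panelist's summed judgment is treated as a target expected score, and Newton-Raphson is used to find the theta at which the expected score curve equals that target. The sketch below illustrates this reading only. It assumes a 2PL model and hypothetical item parameters, neither of which is confirmed by the article, and the function names are invented for illustration.

```python
import math

# Hypothetical 2PL item parameters (discrimination a, difficulty b).
# The article's actual model and parameters are not assumed to match.
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 0.5), (1.5, 1.0)]

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def expected_score(theta, items):
    """Expected number-correct score at theta (sum of item probabilities)."""
    return sum(p_correct(theta, a, b) for a, b in items)

def theta_cut(target, items, theta=0.0, tol=1e-8, max_iter=100):
    """Solve expected_score(theta) = target by Newton-Raphson.

    The target must lie strictly between 0 and the number of items.
    """
    for _ in range(max_iter):
        probs = [p_correct(theta, a, b) for a, b in items]
        residual = sum(probs) - target
        # Derivative of the expected score curve under the 2PL.
        slope = sum(a * p * (1.0 - p) for (a, _), p in zip(items, probs))
        step = residual / slope
        theta -= step
        if abs(step) < tol:
            return theta
    raise RuntimeError("Newton-Raphson did not converge")

if __name__ == "__main__":
    cut = theta_cut(2.65, items)  # 2.65 is a hypothetical summed judgment
    print(cut, expected_score(cut, items))
```

If this is what the authors did, it would help to say so explicitly, since the mapping then depends on the item parameters and on how the curve behaves near its floor and ceiling.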