It is widely believed that human rating performance is influenced by an array of
factors. Among these are rater-related variables such as experience, language
background, perceptions, and attitudes. One of the important
rater-related factors is the way the raters interact with the rating scales. In particular,
how raters perceive the components of the scales in order to plan their scoring seems
important. To this end, the present study investigated raters’ perceptions of
rating scales and their subsequent rating behaviors with two rating scales, one holistic
and one analytic. Nine highly experienced raters were asked to verbalize their thoughts
while rating student essays using the IELTS holistic scale and the analytic ESL Composition
Profile. Analysis of the think-aloud protocols yielded four themes. The
findings showed that when rating holistically, the raters either referred to the holistic
scale components to validate their ratings (validation) or read the scale before evaluating
in order to rate more reliably (dominancy). In analytic rating, on the other hand,
the raters either read the scale before evaluating in order to commit the components and
their criteria to memory and evaluate the text more accurately (dominancy) or regularly
moved between the text and the scale components to assign a score (oscillation).
Furthermore, a Wilcoxon signed-rank test showed that the raters assigned
significantly different scores to the texts when using the holistic scale than when
using the analytic scale. On the whole, the results revealed that the way the raters perceived the scale
components affected their judgement of the texts. The study also provides several
implications for rater training programs and EFL writing assessment.