Abstract:
Alderson (2005) suggests that diagnostic tests should identify strengths and weaknesses in learners'
use of language, focus on specific elements rather than global abilities, and provide detailed feedback
to stakeholders. However, rating scales used in performance assessment have been repeatedly
criticized for being imprecise, for using impressionistic terminology (Fulcher, 2003; Upshur &
Turner, 1999; Mickan, 2003), and for often resulting in holistic assessments (Weigle, 2002).
The aim of this study was to develop a theoretically based, empirically developed rating scale
and to evaluate whether such a scale functions more reliably and validly in a diagnostic writing
context than a pre-existing scale with less specific descriptors, of the kind typically used in proficiency
tests. The existing scale is used in the Diagnostic English Language Needs Assessment (DELNA)
administered to first-year students at the University of Auckland. The study was undertaken in two
phases. During Phase 1, 601 writing scripts were subjected to a detailed analysis using discourse
analytic measures. The results of this analysis were used as the basis for the development of the new
rating scale. Phase 2 involved the validation of this empirically developed scale. For this purpose, ten trained
raters applied both sets of descriptors to the rating of 100 DELNA writing scripts. A quantitative
comparison of rater behavior was undertaken using FACETS, a many-facet Rasch measurement
program. Questionnaires and interviews were also used to elicit the raters' perceptions of the
efficacy of the two scales.
The results indicate that rater reliability and candidate discrimination were generally higher, and that
raters were better able to distinguish between different aspects of writing ability, when the more
detailed, empirically developed descriptors were used. The interviews and questionnaires showed
that most raters preferred the empirically developed descriptors because these provided more
guidance during the rating process. The findings are discussed in terms of their implications for rater
training and rating scale development, as well as for score reporting in the context of diagnostic
assessment.