Evaluating the Psychometric Qualities of Higher Education Instructor-constructed Multiple-Choice Tests: Impact on Test Quality and Consequences for Student Grades


dc.contributor.advisor Brown, G en
dc.contributor.author Abdulnabi, Hasan en
dc.date.accessioned 2014-04-07T04:26:30Z en
dc.date.issued 2013 en
dc.identifier.uri http://hdl.handle.net/2292/21951 en
dc.description Full text is available to authenticated members of The University of Auckland only. en
dc.description.abstract Multiple-choice questions (MCQs) are commonly used in higher education assessment because they can be scored easily and objectively while covering a large amount of instructional content in a short time. However, studies evaluating the quality of MCQs used in higher education assessments have found many flawed items. Such items can yield misleading feedback regarding student performance and contaminate important decisions surrounding pass-fail outcomes, graduation, employment, and teaching quality. Thus, MCQs in higher education need to be evaluated to ensure that high-quality inferences are made. Psychometric theory uses statistical models, based on either Classical Test Theory (CTT) or Item Response Theory (IRT), to determine the quality of items. The current study evaluated the quality of 100 instructor-written MCQs used in an undergraduate midterm test (50 items) and final exam (50 items), together making up 50% of the course grade, using the responses of 380 students enrolled in one first-year undergraduate general education course. Item difficulty, discrimination, and chance properties were estimated using four statistical models (i.e., CTT, IRT-1PL, IRT-2PL, and IRT-3PL). For each model and test, the effect on individual student assessment (grades) and course grades was evaluated, and distractor analysis helped identify psychometrically deficient items. Because an MCQ is a type of selected-response item that can be answered correctly by blindly guessing the correct option, it is essential to consider an item's susceptibility to guessing when evaluating its quality. Indeed, the IRT-3PL model is the only model used in this study that accounts for a pseudo-guessing parameter in addition to the difficulty and discrimination parameters. Moreover, the 3PL model retained the most items and had the lowest relative standard error of measurement (SEM), indicating that its scores most accurately represented students' true ability. These results identified the 3PL model as the most effective model for analysing MCQ tests and ensuring the quality of grading decisions. Hence, it is strongly recommended that higher education institutions mandate, and provide support for, this type of analysis of all MCQ testing before student grading decisions are made. en
dc.publisher ResearchSpace@Auckland en
dc.relation.ispartof Masters Thesis - University of Auckland en
dc.rights Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher. en
dc.rights Restricted Item. Available to authenticated members of The University of Auckland. en
dc.rights.uri https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm en
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/nz/ en
dc.title Evaluating the Psychometric Qualities of Higher Education Instructor-constructed Multiple-Choice Tests: Impact on Test Quality and Consequences for Student Grades en
dc.type Thesis en
thesis.degree.grantor The University of Auckland en
thesis.degree.level Masters en
dc.rights.holder Copyright: The Author en
pubs.author-url http://hdl.handle.net/2292/21951 en
pubs.elements-id 431393 en
pubs.record-created-at-source-date 2014-04-07 en
dc.identifier.wikidata Q112899285

