dc.contributor.advisor | Brown, G | en
dc.contributor.author | Abdulnabi, Hasan | en
dc.date.accessioned | 2014-04-07T04:26:30Z | en
dc.date.issued | 2013 | en
dc.identifier.uri | http://hdl.handle.net/2292/21951 | en
dc.description | Full text is available to authenticated members of The University of Auckland only. | en
dc.description.abstract |
Multiple-choice questions (MCQs) are commonly used in higher education assessment tasks because they can be scored easily and objectively while covering more instructional content in a short time. However, studies that have evaluated the quality of MCQs used in higher education assessments have found many flawed items. These items can yield misleading feedback regarding student performance and contaminate important decisions surrounding potential pass-fail results, graduation, employment, and teaching quality. Thus, MCQs in higher education need to be evaluated to ensure that high-quality inferences are made. Psychometric theory uses statistical models, based on either Classical Test Theory (CTT) or Item Response Theory (IRT), to determine the quality of items. The current study evaluated the quality of 100 instructor-written MCQs used in an undergraduate midterm test (50 items) and final exam (50 items), together making up 50% of the course grade, using the responses of 380 students enrolled in one 1st-year undergraduate general education course. Item difficulty, discrimination, and chance properties were determined using four statistical models (i.e., CTT, IRT-1PL, IRT-2PL, & IRT-3PL). For each model and test, the effect on individual student assessment (grades) and course grades was evaluated. Distractor analysis helped identify items that were psychometrically deficient. Since an MCQ is a type of selected-response item that could potentially be answered correctly by blindly guessing the correct option, it is essential to consider the item's level of guessing when evaluating MCQ quality. Indeed, the IRT-3PL model is the only model used in this study that accounts for the pseudo-guessing parameter in addition to the difficulty and discrimination parameters.
In addition, the 3PL model retained most of the items and had the lowest relative standard error of measurement (SEM), indicating that its scores most accurately represented students' true ability. These results identified the 3PL model as the most effective model for analysing MCQ tests and ensuring the quality of grading decisions. Hence, it is strongly recommended that higher education institutions mandate and support this type of analysis for all MCQ testing before student grading decisions are made. |
en |
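The IRT-3PL model named in the abstract can be illustrated with a short sketch. The formula below is the standard three-parameter logistic item response function (difficulty, discrimination, and a pseudo-guessing lower asymptote); the parameter values are hypothetical and not taken from the study's data:

```python
import math

def p_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the IRT 3PL model:
    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    theta: examinee ability, a: discrimination, b: difficulty,
    c: pseudo-guessing lower asymptote (the floor a blind guesser hits).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Illustrative item: fairly easy (b = -1.0), discriminating (a = 1.5),
# with a 0.25 guessing floor, as for a four-option MCQ.
print(round(p_3pl(theta=0.0, a=1.5, b=-1.0, c=0.25), 3))  # -> 0.863
print(round(p_3pl(theta=-10.0, a=1.5, b=-1.0, c=0.25), 3))  # -> 0.25 (floor)
```

Even an examinee far below the item's difficulty answers correctly about 25% of the time, which is exactly the guessing behaviour the abstract argues the 1PL and 2PL models cannot capture.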
dc.publisher | ResearchSpace@Auckland | en
dc.relation.ispartof | Masters Thesis - University of Auckland | en
dc.rights | Items in ResearchSpace are protected by copyright, with all rights reserved, unless otherwise indicated. Previously published items are made available in accordance with the copyright policy of the publisher. | en
dc.rights | Restricted Item. Available to authenticated members of The University of Auckland. | en
dc.rights.uri | https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm | en
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/3.0/nz/ | en
dc.title | Evaluating the Psychometric Qualities of Higher Education Instructor-constructed Multiple-Choice Tests: Impact on Test Quality and Consequences for Student Grades | en
dc.type | Thesis | en
thesis.degree.grantor | The University of Auckland | en
thesis.degree.level | Masters | en
dc.rights.holder | Copyright: The Author | en
pubs.author-url | http://hdl.handle.net/2292/21951 | en
pubs.elements-id | 431393 | en
pubs.record-created-at-source-date | 2014-04-07 | en
dc.identifier.wikidata | Q112899285 |