File(s) under permanent embargo
A collaborative comparison of objective structured clinical examination (OSCE) standard setting methods at Australian medical schools
journal contribution
posted on 2017-01-01, 00:00 authored by B S Malau-Aduli, P-A Teague, Karen D'Souza, C Heal, R Turner, D L Garne, C van der Vleuten
BACKGROUND: A key issue underpinning the usefulness of the OSCE assessment to medical education is standard setting, but the majority of standard-setting methods remain challenging for performance assessment because they produce varying passing marks. Several studies have compared standard-setting methods; however, most of these studies are limited by their experimental scope, or use data on examinee performance at a single OSCE station or from a single medical school. This collaborative study between 10 Australian medical schools investigated the effect of standard-setting methods on OSCE cut scores and failure rates.
METHODS: This research used 5256 examinee scores from seven shared OSCE stations to calculate cut scores and failure rates using two different compromise standard-setting methods, namely the Borderline Regression and Cohen's methods.
RESULTS: The results of this study indicate that Cohen's method yields similar outcomes to the Borderline Regression method, particularly for large examinee cohort sizes. However, with lower examinee numbers on a station, the Borderline Regression method resulted in higher cut scores and larger differences in failure rates between the two methods.
CONCLUSION: Cohen's method yields similar outcomes to the Borderline Regression method, and its application for benchmarking purposes and in resource-limited settings is justifiable, particularly with large examinee numbers.
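
The abstract does not give the formulae used for the two methods, so the following is only a minimal sketch under stated assumptions, not the study's actual procedure. It assumes the commonly described forms: the Borderline Regression cut score is the station score predicted by a least-squares line at the "borderline" examiner global rating, and Cohen's cut score is 60% of the score at the 95th percentile of the cohort (without any correction for guessing). All function names, the rating scale, and the example data are hypothetical.

```python
from statistics import mean


def borderline_regression_cut(ratings, scores, borderline_rating):
    """Station score predicted at the borderline global rating (simple least squares)."""
    r_bar, s_bar = mean(ratings), mean(scores)
    sxx = sum((r - r_bar) ** 2 for r in ratings)
    sxy = sum((r - r_bar) * (s - s_bar) for r, s in zip(ratings, scores))
    slope = sxy / sxx
    intercept = s_bar - slope * r_bar
    return intercept + slope * borderline_rating


def percentile(values, p):
    """p-th percentile using linear interpolation between ordered values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def cohen_cut(scores, pct=95, fraction=0.60):
    """Assumed form of Cohen's method: a fixed fraction of a high-percentile score."""
    return fraction * percentile(scores, pct)


def failure_rate(scores, cut):
    """Proportion of examinees scoring below the cut score."""
    return sum(s < cut for s in scores) / len(scores)


# Hypothetical station data: global ratings on a 1-5 scale (2 = borderline)
# and station scores as percentages.
ratings = [1, 2, 3, 4, 5]
scores = [40, 50, 60, 70, 80]

brm_cut = borderline_regression_cut(ratings, scores, borderline_rating=2)
coh_cut = cohen_cut(scores)
print(brm_cut, failure_rate(scores, brm_cut))
print(coh_cut, failure_rate(scores, coh_cut))
```

With only five hypothetical examinees the two cut scores can differ noticeably, which is in the spirit of the finding that agreement between the methods is better for large cohorts. Real use would apply these functions per station to the full examinee score set.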