Two-tier multiple-choice (TTMC) items are used to assess students’ knowledge of a scientific concept in tier 1 and their reasoning about that concept in tier 2. But are the knowledge and reasoning involved in these tiers really distinguishable? Are the tiers equally challenging for students? The answers to these questions influence how we use and interpret TTMC instruments. We apply the Rasch measurement model to TTMC items to see whether the items are distinguishable by trait (represented by the tier), by content sub-topic within the instrument, or by both content and tier. Two TTMC data sets are analyzed: data from Singapore and Korea on the Light Propagation Diagnostic Instrument (LPDI), and data from the United States on the Classroom Test of Scientific Reasoning (CTSR). Findings for the LPDI show that tier-2 reasoning items are more difficult than tier-1 knowledge items across content sub-topics. Findings for the CTSR do not show a consistent pattern by tier or by content sub-topic. We conclude that TTMC items cannot be assumed to have a consistent pattern of difficulty by tier, and that assessment developers and users need to consider how the tiers operate when administering TTMC items and interpreting results. Researchers must check the tiers’ difficulties empirically during validation and use. Although findings from the data in Asian contexts were more consistent, further study is needed to rule out differences between the LPDI and CTSR instruments as the source of this contrast.
Bibliographical note: Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
- science education
- two-tier items
- Rasch measurement models
- scientific reasoning