Wednesday, September 23, 2009

Reaching for Precision in the Imperfect World of Multiple-Choice Items – It Begins with Item Specification Right from the Start

Multiple-choice questions are the most widely used approach to assessing student mastery of standards. In fact, they represent the largest proportion of item types on state-wide tests measuring standards mastery, on national tests measuring educational progress, on college entrance exams, and on benchmark and formative assessments used as part of local school district assessment programs. The information about student learning gleaned from this broad array of assessments is used by a diversity of educational decision-makers to accomplish a wide range of educational goals.

These goals may include: 1) re-teaching and enrichment intervention for a student, group, or class; 2) school- or district-wide proactive educational planning to help promote student mastery of standards throughout the school year; and 3) comprehensive strategic planning initiatives that may substantially alter the kinds of programs, policies, and practices implemented within a school district or throughout an entire state.

Given the extent to which assessment data can and does drive educational decision-making, it is imperative that the construction, review, certification, and empirical validation of multiple-choice items included on these assessments be carried out with great precision.

A typical multiple-choice item has three parts: a stem that presents a problem, the correct answer, and several distractors (plausible but incorrect answer choices). These items can be constructed to assess a variety of learning outcomes, from simple recall of facts to highly complex cognitive skills.
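For readers who find a concrete model helpful, the three-part structure described above can be sketched as a simple data record. The sketch below (in Python) is purely illustrative; the names are our own and are not drawn from any ATI system:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class MultipleChoiceItem:
        stem: str                 # the problem presented to the student
        correct_answer: str       # the single keyed (correct) alternative
        distractors: List[str] = field(default_factory=list)  # plausible but incorrect alternatives

        def alternatives(self) -> List[str]:
            # every answer choice the student would see
            return [self.correct_answer] + self.distractors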

Regardless of the learning outcome being assessed, precision in the item writing process begins with designing the item so that its focus is on the specification (i.e., the construct) being measured. In the ideal world of assessment, students who correctly answer an item built in this fashion are assumed to do so because they have mastered the principle or construct being assessed. Of course, as we all know, the real world of assessment is an imperfect one in which measurement is not always precise. Measurement error and guessing, for example, are a permanent part of that real world.

That being said, we can take considerable steps right from the start to foster a high level of precision in item development, even in this imperfect world of assessment. For example, when new items are to be added to the ATI item banks for use in Galileo K-12, the first step is to review the standard that is to be assessed. The standard is broken down into the skills that compose it. These skills are the starting point for developing an online list of item specifications defining the characteristics of the particular class of items to be written.

Item specifications indicate the defining characteristics of the item class, the rationale for the class, and the required characteristics of each item component. In the case of multiple-choice items, the required characteristics of the stem and the alternatives are specified. Specifications address such factors as the cognitive complexity intended for items in the class, the appropriateness of vocabulary, requirements related to readability levels, and the alignment of the item with standards. Creating specifications to guide the item development process is recognized as a critical part of documenting that assessments are reliable and valid indicators of the abilities they are intended to measure (Haladyna, 2004). Their structure and specificity also make it easier to adapt assessments as district needs and state/federal requirements change.
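To make the idea concrete, an item specification might be represented as a structured record along the lines of the sketch below. The field names and the sample values are illustrative assumptions only and do not reflect ATI's actual specification format:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class ItemSpecification:
        standard: str                        # standard the item class is aligned to
        skill: str                           # skill within the standard being measured
        rationale: str                       # why this class of items is included
        cognitive_complexity: str            # e.g., "recall", "application", "analysis"
        max_readability_grade: int           # readability ceiling for the target grade level
        stem_requirements: List[str]         # required characteristics of the stem
        alternative_requirements: List[str]  # required characteristics of the alternatives

    # A hypothetical specification for a single skill within a standard:
    spec = ItemSpecification(
        standard="Grade 5 mathematics: number sense",
        skill="Compare fractions with unlike denominators",
        rationale="Targets conceptual understanding rather than procedural recall",
        cognitive_complexity="application",
        max_readability_grade=5,
        stem_requirements=[
            "Poses a single, clearly stated problem",
            "Uses grade-appropriate vocabulary",
        ],
        alternative_requirements=[
            "Exactly one unambiguously correct answer",
            "Distractors reflect common misconceptions",
        ],
    )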

Extensive information about the ATI item specification process, as well as the multi-stage item construction, review, and certification procedures used by ATI, can be obtained by contacting us directly.

We would, of course, like to hear from you as well. For example, what kinds of challenges have you faced in developing items within your district or for your classroom? And what kinds of solutions or procedures have you implemented to enhance the precision with which locally developed test items are written, reviewed, and empirically validated?

Reference: Haladyna, T. M. (2004). Developing and Validating Multiple-Choice Test Items (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
