|
Volume 9 |
April 2008 |
Issue#3 |
|
Examining Panelist Data from a Bilingual Standard
Setting Study.
Elaine M. Rodeck,
Tzu-Yun Chin, Susan L. Davis,
Barbara S. Plake,
Buros Center for Testing,
University of Nebraska-Lincoln
Abstract
This study examined
the relationships between the evaluations obtained from standard
setting panelists and changes in ratings between different
rounds of a standard setting study that involved setting
standards on different language versions of an exam. We
investigated panelists’ evaluations to determine if their
perceptions of the standard setting were related to adjustments
they made in their recommended cut scores across rounds of the
process. The standard setting was conducted for a high school
mathematics test composed of multiple-choice and constructed
response items. The test was designed for a population of
students who speak and receive primary instruction in either
English or French. Results indicated that panelists’ evaluations of their own
ratings and their comfort with the process were related to how
their ratings changed across sequential rounds of the process.
Differences in the degree to which
the evaluations influenced the standard setting judgments were
observed across the English and French panelists, with the
French group reporting increasing comfort across rounds in
contrast to the English group that had relatively higher comfort
at the beginning of the process.
The results illustrate how standard
setting evaluation data can provide insight into factors that
affect panelists’ ratings.
|
| Volume 9 |
February 2008 |
Issue#2 |
A Non-Technical Approach for Illustrating Item Response Theory.
Chong Ho Yu, Angel Jannasch-Pennell, & Samuel DiGangi,
Arizona State University
Abstract
Since the introduction of the No Child Left Behind Act, assessment has
become a predominant theme in the US K-12 system. However, making
assessment results understandable and usable for K-12 teachers
has been a challenge. While test technology offered by various
vendors has been widely implemented, training technology for
test development seems to be underdeveloped. The objective of
this presentation is to illustrate a well-designed interactive
tutorial for understanding the complex concepts of Item Response
Theory (IRT). The approach of this tutorial is to dissociate IRT
from Classical Test Theory (CTT) because it is the belief of the
authors that the mis-analogy between IRT and CTT could lead to
misconceptions. Initial user feedback is collected as input for
further refining the program. |
| Volume 9 |
February 2008 |
Issue#1 |
Matching the Judgmental Task with Standard Setting Panelist Expertise:
The Item-Descriptor (ID) Matching Method.
Steve Ferrara, CTB/McGraw-Hill;
Marianne Perie, National Center for the Improvement of Educational Assessment;
and Eugene Johnson, Independent Consultant
Abstract
Psychometricians
continue to introduce new approaches to setting cut scores for
educational assessments in an attempt to improve on current methods.
In this paper we describe the Item-Descriptor (ID) Matching method,
a method based on IRT item mapping. In ID Matching, test content
area experts match items (i.e., their judgments about the knowledge
and skills required to respond to an item) to the knowledge and
skills described in performance level descriptors that are used
for reporting test results. We argue that the cognitive-judgmental
task of matching item response requirements to performance level
descriptors is aligned closely with the experience and expertise
of standard setting panelists, who are typically classroom teachers
and other content area experts. Unlike other popular standard
setting methods, ID Matching does not require panelists to make
error-prone probability judgments, predict student performance,
or imagine examinees who are just barely in a performance level.
We describe applications of ID Matching in two educational testing
programs and provide evidence of the effectiveness of this method.
The entire process is described in the first section of the paper.
Subsequent sections describe applications of ID Matching for two
operational testing programs. |
|