Volume 10 
August 2009 
Issue #3

Computer-Based Signing Accommodations: Comparing a Recorded Human with an Avatar by Michael Russell, Maureen Kavanaugh, Lynch School of Education, Boston College, Jessica Masters, Jennifer Higgins, Technology and Assessment Study Collaborative, Boston College, Thomas Hoffmann, Nimble Assessment Systems


Many students who are deaf or hard-of-hearing are eligible for a signing accommodation for state and other standardized tests. The signing accommodation, however, presents several challenges for testing programs that attempt to administer tests under standardized conditions. One potential solution for many of these challenges is the use of computer-based test delivery that integrates recordings of signed presentation of test content into the test. In addition to standardizing conditions, computer-based delivery holds potential to decrease the cost of developing recordings of signed presentation by using avatars rather than humans. However, because avatars are relatively new and are not as expressive or lifelike as humans, they may not be as effective as humans in presenting content in a clear and interpretable manner. The study presented here employed a randomized trial to compare the effect that a computer-based provision of the signed accommodation using a recorded human versus a signing avatar had on students' attitudes about taking a mathematics test and on their actual test performance. This study found that students generally reported that it was easy to take a mathematics test on computer, and that both the recorded human and the signing avatar tools were easy to use and to understand. Students also reported a strong preference for taking future tests on computer, and generally preferred using the recorded human and the avatar for future tests rather than a DVD. While students also reported that they preferred the recorded human rather than the signing avatar, this preference did not affect test performance. The use of the recorded human and the avatar did not affect either the amount of time required to complete the test items or students' performance on the test items. Implications for future research are discussed in light of these findings and the shortcomings of this study.

Volume 10

Special Issue

Issue #2, Article 1

For this article also download: Article 1, Table on page 6

No More Excuses: New Research on Assessing Students with Disabilities

by Stephen G. Sireci, University of Massachusetts Amherst


The articles in this special issue of the Journal of Applied Testing Technology represent significant steps forward in the area of evaluating the validity of methods for assessing the educational achievement of students with disabilities. The studies address some of the most difficult student groups to assess: students with learning disabilities, students with severe cognitive disabilities, deaf/hearing impaired students, students with disabilities who are also English Language Learners, and students who are likely to be inaccurately measured on statewide reading tests. The authors use a variety of research designs and statistical methods to provide evidence for evaluating the validity of assessments of these students. This article highlights the novel contributions of these studies and raises questions for readers to consider as they read each study.

Issue #2, Article 2

English Language Learners with Disabilities: Classification, Assessment, and Accommodation Issues by Jamal Abedi, CRESST/University of California, Davis, National Center for Research on Evaluation, Standards, and Student Testing


English language learners with disabilities (ELLWD) face many challenges in their academic career. Learning a new language and coping with their disabilities create obstacles in their academic progress. Variables affecting the accessibility of assessments for students with disabilities and ELL students may seriously hinder the academic performance of ELLWD students. Furthermore, classification and accommodation for these students require a more complex design than those for either ELLs or students with disabilities. Proper identification of these students is a challenge if their disability is masked by their limited English proficiency, or vice versa. Improper identification may lead to inappropriate instruction, assessment, and accommodation for these students. Linguistic and cultural biases may affect the validity of assessment for ELLWD students. In this paper, issues concerning accessibility of assessment, classification, and accommodations for ELLWD students are discussed and recommendations for more accessible assessments for these students are provided.

Issue #2, Article 3

Identifying Less Accurately Measured Students by Ross Moen, Kristi Liu, Martha Thurlow, Adam Lekwa, Sarah Scullin, and Kristin Hausmann, University of Minnesota


Some students are less accurately measured by typical reading tests than other students. By asking teachers to identify students whose performance on state reading tests would likely underestimate their reading skills, this study sought to learn about characteristics of less accurately measured students while also evaluating how well teachers can make such judgments. Twenty students identified by eight teachers participated in structured interviews and completed brief assessments matched to characteristics their teachers said impeded the students' test performance. From evidence provided by teachers, teacher and student interviews, and student assessments, researchers found information that confirmed teacher judgments for some students and information that failed to confirm or was at odds with teacher judgments for other students. Along with observations about student characteristics that affect assessment accuracy, recommendations from the study include suggestions for working with teachers who are asked to make judgments about test accuracy and procedures for confirming teacher judgments.

Issue #2, Article 4

Using Factor Analysis to Investigate the Impact of Accommodations on the Scores of Students with Disabilities on a Reading Comprehension Assessment by Linda Cook, Daniel Eignor, Jonathan Steinberg, Yasuyo Sawaki, Frederick Cline, Educational Testing Service


The purpose of this study was to investigate the impact of a read-aloud test change administered with the Gates-MacGinitie Reading Test (GMRT) on the underlying constructs measured by the Comprehension subtest. The study evaluated the factor structures for the Level 4 Comprehension subtest given to a sample of New Jersey fourth-grade students with and without reading-based learning disabilities. Both exploratory and confirmatory factor analyses were used to determine whether or not the GMRT Comprehension subtest measures the same underlying constructs when administered with and without a read-aloud test change. The results of the analyses indicated that factorial invariance held when the Comprehension subtest was administered to groups of students without disabilities who took the test under standard conditions and with a read-aloud test change, and for groups of students with reading-based learning disabilities who also took the test under standard conditions and with a read-aloud test change.

Issue #2, Article 5

Examining the Validity and Fairness of a State Standards-Based Assessment of English-Language Arts for Deaf or Hard of Hearing Students by Jonathan Steinberg, Frederick Cline, Guangming Ling, Linda Cook, Namrata Tognatta, Educational Testing Service


This study examines the appropriateness of a large-scale state standards-based English-Language Arts (ELA) assessment for students who are deaf or hard of hearing by comparing the internal test structures for these students to those for students without disabilities. The Grade 4 and 8 ELA assessments were analyzed via a series of parcel-level exploratory and confirmatory factor analyses, where both groups were further split based on English language learner (ELL) status. Differential item functioning (DIF) analyses were also conducted for these groups of students, and where sample sizes were sufficient, the groups were additionally split based on test accommodation status. Results showed similar factor structures across the groups of students studied and minimal DIF, which could be interpreted as lending support for aggregating scores for Adequate Yearly Progress (AYP) purposes from students who are deaf or hard of hearing.

Issue #2, Article 6

Differential Item Functioning Comparisons on a Performance-Based Alternate Assessment for Students with Severe Cognitive Impairments, Autism, and Orthopedic Impairments by Cara Cahalan Laitusis, Behroz Maneckshana, Lora Monfils, Educational Testing Service, and Lynn Ahlgrim-Delzell, University of North Carolina at Charlotte


The purpose of this study was to examine Differential Item Functioning (DIF) by disability groups on an on-demand performance assessment for students with severe cognitive impairments. Researchers examined the presence of DIF for two comparisons. One comparison involved students with severe cognitive impairments, who served as the reference group, and students with autism and severe cognitive impairments, who served as the focal group. The other comparison compared students with severe cognitive impairments (reference group) and students with both severe cognitive impairments and orthopedic impairments (focal group). Results indicated a moderate amount of DIF for the autism comparison and a negligible amount of DIF for the orthopedic impairment comparison. In addition, researchers coded all test items based on characteristics likely to favor one of the three groups. Although several of the hypothesized coding categories resulted in accurate prediction of DIF, the study was limited to items from one testing program for students in one state. More research is needed to see if these hypotheses can be replicated across testing programs and populations.

Issue #2, Article 7

Validity Evidence in Accommodations for English Language Learners and Students with Disabilities by Wayne Camara, The College Board


The five papers in this special issue of the Journal of Applied Testing Technology address fundamental issues of validity when tests are modified or accommodations are provided to English Language Learners (ELL) or students with disabilities. Three papers employed differential item functioning (DIF) and factor analysis and found that the underlying constructs measured by tests do not change among these groups of students. Despite this strong finding, consistent and large score differences are present across groups. Such consistent and large score differentials among these groups on cognitive ability tests would be ideally contrasted with findings from alternative measures (e.g., portfolios, performance assessments, and teachers' ratings). Two papers examine current methods used to identify and classify both ELLs and students with disabilities, while other papers examine the performance of students with specific disabilities (e.g., deafness, mental retardation). The impact of modifications and accommodations on score comparability is discussed in relation to professional standards and current validity theory.

Volume 10 
March 2009 
Issue #1

Improving the Quality of Innovative Item Types: Four Tasks for Design and Development by Cynthia G. Parshall, Ph.D., Measurement Consultant, and J. Christine Harmes, Ph.D., James Madison University


Many exam programs have begun to include innovative item types in their operational assessments. While innovative item types appear to have great promise for expanding measurement, there can also be genuine challenges to their successful implementation. In this paper we present a set of four activities that can be beneficially incorporated into the design and development of innovative item types. These tasks are: template design, item writing guidelines, item writer training, and usability studies. When these four tasks are fully incorporated in the test development process, the potential for improved measurement through innovative item types is much greater.