
Using Item Response Time in Scoring: Guidelines for Psychometric Quality and Fairness

This blog article, by Dylan Molenaar and Deborah Schnipke, is part of a series supporting the ITC/ATP Guidelines for Technology-Based Assessment.

Testing time can be a valuable factor in evaluating the abilities and skills of test takers.

Response times can be utilized as part of the scoring process in tests where timing is crucial for performance. However, incorporating response time into scoring introduces additional challenges compared to scoring based solely on the responses.

In the recent ITC/ATP Guidelines for Technology-Based Assessment, our chapter on test scoring provides some recommendations for using response times in scoring a test. It is important to consider psychometric quality and fairness to avoid penalizing test takers who may be negatively affected by factors that are not relevant to the test’s construct. In this blog article, we offer a preview of our guidelines using an example from a test of children’s arithmetic ability.

An example in the Netherlands

In the Netherlands, Math Garden is an online monitoring tool used in some primary schools to track children’s arithmetic ability. In Math Garden, children complete multiple-choice arithmetic items in a game-like environment. To motivate the children, response times are incorporated into scoring. Specifically, Math Garden employs a scoring rule called the “high speed – high gain” rule developed by Maris and Van der Maas (2012). In essence, correct answers earn points, while incorrect answers result in a loss of points. However, the number of points gained or lost depends on the response time for each item. For correct responses, fewer points are obtained if the response is made closer to the item time limit, whereas for incorrect responses, fewer points are deducted if the response is made closer to the deadline. This specific scoring rule aims to encourage fast but accurate responding, while discouraging hasty guessing.
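As a rough illustration of the rule described above, the sketch below scores a single response using the signed residual time form given by Maris and Van der Maas (2012): the score is the time remaining before the deadline, counted as a gain for a correct answer and as a loss for an incorrect one. The function name and the 20-second time limit are assumptions made for this example and are not taken from Math Garden itself.

```python
def signed_residual_time_score(correct: bool, response_time: float, time_limit: float) -> float:
    """Score one response as (2x - 1) * (d - t).

    x is 1 for a correct response and 0 for an incorrect one,
    d is the item time limit, and t is the response time (same units).
    """
    if not 0 <= response_time <= time_limit:
        raise ValueError("response_time must lie between 0 and time_limit")
    remaining = time_limit - response_time
    # Correct answers gain the remaining time; incorrect answers lose it.
    return remaining if correct else -remaining


# Hypothetical time limit of 20 seconds, chosen only for illustration.
TIME_LIMIT = 20.0

fast_correct = signed_residual_time_score(True, 2.0, TIME_LIMIT)    # large gain
slow_correct = signed_residual_time_score(True, 18.0, TIME_LIMIT)   # small gain
fast_wrong = signed_residual_time_score(False, 2.0, TIME_LIMIT)     # large loss
slow_wrong = signed_residual_time_score(False, 18.0, TIME_LIMIT)    # small loss
```

Under these assumptions, a fast wrong answer costs far more than a slow wrong answer, which is how the rule discourages hasty guessing while still rewarding fast, accurate responses.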

Challenges and considerations

Clear communication. The main challenge in a testing situation like this is to give test takers, and those interpreting the test results, clear information on how response times are used in scoring. In the Math Garden example, it is important for the children to understand that a quick guess will be penalized more than an informed (slow) guess. Since scoring rules involving response times are often more complex than rules based solely on response accuracy (e.g., right/wrong), the researcher or test developer needs to make an effort to communicate the scoring rule effectively. In Math Garden, the scoring rule is conveyed intuitively by displaying a set of coins. Each second, a coin disappears. After the response, the remaining coins are added to the test taker’s score if the response is correct or deducted if the response is incorrect.

Accurate response times. Another challenge addressed in our guidelines chapter is the accurate recording of response times when they are considered in scoring. In the Math Garden example, the test is not high-stakes; however, in high-stakes tests, it would pose a serious problem if response times were unevenly affected by factors such as internet lag. More generally, test validity is compromised when construct-irrelevant factors, such as motor disabilities, executive function challenges, testing in a non-dominant language, or other personal characteristics, have a negative impact on response times.

Fit to measurement model. Lastly, before incorporating response times into scoring, it is crucial to establish that the underlying measurement model fits acceptably, including across relevant groups (such as boys and girls in the Math Garden test). For example, in the case of the “high speed – high gain” scoring rule used in Math Garden, the underlying measurement model assumes that the responses and response times are independent for test takers with the same ability (local independence), and that this holds across groups. This assumption must hold to avoid confounding factors in inferences based on the scoring rule.

Conclusion

Using response time in scoring is an effective means of measuring processing speed when it is relevant to the construct being assessed. However, it is essential to adhere to guidelines for psychometric quality and fairness to avoid penalizing test takers due to factors that are irrelevant to the construct. By following these guidelines, we can ensure that response time in scoring is used in a fair and accurate manner.

The Guidelines may be downloaded at no charge from ATP’s website: www.testpublishers.org/atp-white-papers

References

Maris, G., & Van der Maas, H. (2012). Speed-accuracy response models: Scoring rules based on response time and accuracy. Psychometrika, 77(4), 615-633.
