The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education

Research Report

Paiheng Xu, Jing Liu, Nathan Jones, Julie Cohen, and Wei Ai

April 2024 | Annenberg Institute

This research report from Brown University’s Annenberg Institute investigates the potential and limitations of using language models to assess the quality of teaching. The authors note that manual assessment systems are costly and subjective, so they explore Natural Language Processing (NLP) techniques as an alternative that could provide more timely and frequent feedback to educators. The study analyzes in-person K–12 classroom settings and simulated performance tasks for pre-service teachers. It is also the first study to apply NLP to effective practices for students with special needs. Overall, the results suggest that pretrained language models (PLMs) perform similarly to human raters, but only for variables that require less inference.