Artificial intelligence can help instructors write better feedback on student essays and improve learning outcomes when AI is used as a behind-the-scenes assistant rather than a replacement for human graders, a new University of Michigan Engineering study suggests.
The researchers developed an AI-mediated system called FeedbackWriter, which offers university teaching assistants (TAs) suggestions aligned with assignment expectations while they read student essays, giving TAs the final say on what to use, edit or discard.
“Feedback is one of the most powerful mechanisms for learning, but it takes time and effort to provide personalized feedback to each student,” said Xu Wang, U-M assistant professor of computer science and engineering and corresponding author of the study presented today at the Association for Computing Machinery’s Conference on Human Factors in Computing Systems.
“Our goal was to understand whether AI could help people provide high-quality feedback at scale while keeping humans in control,” she said.
The work was supported by the National Science Foundation and conducted in collaboration with Mitchell Dudley, a teaching professor of economics, and Larissa Sano, lecturer and science writing specialist in the U-M Sweetland Center for Writing.
“This project shows how AI can support instructors by giving rubric-relevant suggestions that they can use to generate high-quality feedback for each student,” Dudley said. “This not only improves instructor workflow, but also the quality of the feedback given. This is especially valuable in large-enrollment courses, where writing-to-learn is powerful but quality feedback is hard to scale.”
AI that understands how graders think
Prior research has explored AI-generated feedback delivered directly to students, but studies have found AI to be unreliable, particularly on assignments that require knowledge of the field and conceptual accuracy. In interviews, experienced TAs described their workflow as going beyond simply spotting errors; they must connect comments to a grading rubric, find evidence in students’ essays and craft feedback that is actionable without giving away the answer.
To support that work, the research team first engaged in a deliberate process to extract a knowledge checklist, or rubric, that defines what good and bad solutions look like. FeedbackWriter then uses this rubric to guide the AI through a structured pipeline: It identifies passages relevant to each rubric item, decides whether the rubric item has been met and drafts suggested feedback.
The interface is designed to keep the human grader in charge. TAs can accept or reject AI judgments, edit the suggested feedback or write their own from scratch.
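The paper spells out the actual prompts, models and interface; as a rough illustration of the pattern described above, the loop might look like the following Python sketch, where call_llm, Suggestion and suggest_feedback are hypothetical names standing in for whatever FeedbackWriter really uses.

    from dataclasses import dataclass

    @dataclass
    class Suggestion:
        rubric_item: str      # the checklist item being assessed
        evidence: str         # essay passage the model judged relevant
        met: bool             # whether the item appears to be satisfied
        draft_feedback: str   # suggested comment for the TA to accept, edit or discard

    def call_llm(prompt: str) -> str:
        # Placeholder: a real system would call an actual language model here.
        raise NotImplementedError

    def suggest_feedback(essay: str, rubric: list[str]) -> list[Suggestion]:
        suggestions = []
        for item in rubric:
            # Step 1: find the essay passage most relevant to this rubric item.
            evidence = call_llm(
                f"Quote the passage most relevant to this rubric item: {item}\n\nEssay:\n{essay}"
            )
            # Step 2: judge whether the rubric item has been met.
            verdict = call_llm(
                f"Does the passage satisfy the rubric item '{item}'? Answer yes or no.\n\n{evidence}"
            )
            # Step 3: draft actionable feedback that hints at the issue without giving away the answer.
            draft = call_llm(
                f"Write a short hint about '{item}' for this passage, without revealing the answer.\n\n{evidence}"
            )
            suggestions.append(
                Suggestion(item, evidence, verdict.strip().lower().startswith("yes"), draft)
            )
        # Every suggestion is shown to the TA, who accepts, edits or rejects it before students see anything.
        return suggestions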
“We’re not trying to replace human instructors,” said Xinyi Lu, doctoral student in computer science and engineering and first author of the study. “We’re trying to build a collaboration, where AI helps with the parts that are difficult to do consistently at scale, and the TA provides final decision-making.”
Tested in a real university course
The researchers evaluated FeedbackWriter in a randomized controlled study in an introductory economics course with 354 students and 11 TAs, using two knowledge-intensive essay assignments. Students wrote a first draft, received either AI-mediated feedback through FeedbackWriter or human-only feedback, then revised and submitted a final draft. For the second assignment, the groups switched, so every student received both types of feedback across the two assignments.
When students received AI-mediated feedback, they produced significantly higher-quality revisions than when they received human-only feedback. The effect size was roughly equivalent to a student moving from the 50th to the 70th percentile.
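(To put that in standardized terms: if revision scores are roughly normally distributed, moving from the 50th to the 70th percentile corresponds to an effect of about Φ⁻¹(0.70) ≈ 0.52 standard deviations, a medium-sized effect by conventional benchmarks. That conversion is an illustration of scale, not a figure reported in the paper.)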
The team also evaluated feedback quality using criteria derived from learning sciences research, including whether AI-generated comments promoted independent learning by guiding students with hints instead of simply providing answers. AI-mediated feedback outperformed human-only feedback across all measures and covered more rubric items.
TAs generally found the AI suggestions accurate but still made corrections, agreeing with about 88% of the AI's judgments and editing the remaining 12%. In interviews, TAs said FeedbackWriter helped them be more systematic in applying the rubric and, unexpectedly, helped them better understand the rubric themselves by highlighting relevant examples.
A tool to reallocate time, not people
The researchers emphasized that AI-mediated feedback is designed to support human instruction, not replace it. The team is now exploring whether AI support on routine tasks can enable TAs to spend more time on direct student support, such as office hours and one-on-one help, while still keeping humans responsible for final evaluations.
“Our results show that when AI is designed to complement human expertise, and when people can verify and correct mistakes, it can help students learn more effectively,” Wang said.

