Artificial intelligence has become increasingly sophisticated and harder to detect, leaving educators with the tough challenge of differentiating between student-written work and AI-generated writing. To explore this issue in classrooms, we conducted a test to see if English teachers could distinguish between student-written rhetorical analysis essays and AI-generated writing.
We gave each teacher three rhetorical analysis essays, all based on the same 11th-grade-level prompt. Unbeknownst to the teachers, two of the essays were high-scoring student essays, and one was completely AI-generated. The essays were anonymous and mixed together, making it a blind test. The participating teachers, Ms. Kamal, Ms. Ferrer, and Ms. Hudson, were given the three essays along with the prompt and the AP Language rubric. They were asked to grade the essays, predict which ones were AI-generated, and share their comments and insights.
Overall, a majority of the teachers correctly identified the fully AI-generated essay, tipped off by the overly perfect mechanics of the third essay. Ms. Ferrer elaborated on this, stating that the third essay was the strongest one mechanically because of “sentence cohesion, the grammar, and the punctuation.” When paired with performance assessments, purely AI-generated content can be easy to spot. Ms. Hudson explained that tools like i-Ready can “help [her] understand what grade level a student is at, and if their writing doesn’t match the assessment, it’s going to raise alarm bells.”
Addressing the broader implications of AI in the classroom, Ms. Ferrer expressed her concern about a widening gap between students’ writing and comprehension abilities and the expected academic standard. She noted an increase in “students who cannot comprehend prompts and therefore struggle to plan a sufficient response,” as well as “students who cannot complete a 5-paragraph, high-quality essay in a 50-minute period.” Over the course of this year, these challenges have appeared to afflict a majority of students in her AP Language and Composition class. She attributed much of this decline to students’ growing reliance on AI across subjects and emphasized the overarching impacts. “It’s showing up in their struggles to articulate themselves. So, it’s just coming up as lower literacy overall.” Her own students have made the same observation, as reading comprehension and cognition are linked with verbal fluency.
Despite these concerns, both Ms. Kamal and Ms. Ferrer expressed a positive sentiment toward AI study tools. AI is most effective when students complete foundational work independently and use it only as a supplement. “If a student solely relies on AI, it’s not helpful.” Ms. Kamal added that students ultimately “bring to AI what kind of levels you already are,” meaning that already strong and diligent students tend to benefit far more from AI tools, while these same tools prove detrimental for weaker students who use them as a crutch.
Although a majority of the teachers were able to identify the AI-generated essay, Ms. Ferrer noted that she “didn’t think the AI helped them very much, because my score was essentially the same for all three.”
