May 6, 2013
Computers have certainly revolutionized the field of education, but can they really grade papers as well as teachers? A new paper from a teaching advocacy group says no -- despite research to the contrary.
The paper, a position statement from the National Council of Teachers of English (NCTE) called "Machine Scoring Fails the Test," criticizes the idea that machines can score writing assessments as well as humans can. According to the position statement, writing is a "highly complex ability" developed over years of practice, and grading it requires a human perspective. Machines fail to recognize such elements as humor, accuracy or clarity, preferring instead "different, cruder methods," such as average length of words or sentences, to score essays. The group says this process "denies students the chance to have anything but limited features recognized in their writing."
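To make the criticism concrete, here is a minimal illustrative sketch, not any vendor's actual algorithm, of the kind of "surface feature" scoring the statement describes: it measures only crude statistics such as average word and sentence length, and never meaning, humor, accuracy, or clarity. The weighting is entirely hypothetical.

```python
import re

def surface_features(essay: str) -> dict:
    """Extract crude surface statistics from an essay (no semantics at all)."""
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    words = essay.split()
    return {
        "word_count": len(words),
        # Strip trailing punctuation before measuring word length.
        "avg_word_len": sum(len(w.strip(".,!?;:")) for w in words) / max(len(words), 1),
        "avg_sentence_len": len(words) / max(len(sentences), 1),
    }

def naive_score(essay: str) -> float:
    """Toy 0-5 score computed from surface features alone (hypothetical weights)."""
    f = surface_features(essay)
    score = (0.002 * f["word_count"]
             + 0.3 * f["avg_word_len"]
             + 0.05 * f["avg_sentence_len"])
    return min(score, 5.0)
```

A scorer like this would rate a long essay of polysyllabic nonsense above a short, witty, accurate one, which is exactly the gaming concern the NCTE raises.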
"The very qualities that we associate most strongly with good writing are qualities that it's extremely different for a computer to recognize," Chris Anson, chair of NCTE's Conference on College Composition and Communication, told Inside Higher Ed. "The computers can't make inferences. They can't understand what they're reading; they can only look for specific features they've been programmed to look at, and those are mostly surface kind of features."
According to The Chronicle of Higher Education, the NCTE's executive committee unanimously passed the new position paper last month. Chief among the group's concerns, said Anson, is that machine scoring will encourage writing that is superficially competent but that lacks context or meaning. Students could "game" the system, submitting below-par writing that still meets the computer's criteria. Anson said he also worries that machine grading sends students the message that "writing is so unimportant that we're not willing to take the time to read it."
Inside Higher Ed reports that the University of Akron's Mark Shermis objects to the paper, noting that it fails to make the distinction between scoring used for "summative assessment" and that used to provide "feedback in the instruction of writing." Shermis was lead author of a 2012 study that suggested that computers can grade essays just as well as people, often meeting and sometimes exceeding the metrics "commonly used to evaluate human raters in a high-stakes testing environment." He said he also objects to NCTE's suggestion that computer grading is easy to fool.
"[O]ne has to be a good writer to construct the 'bad' essay that gets a good score," Shermis told Inside Higher Ed. "A Ph.D. from MIT can do it, but a typical 8th grader cannot."
The Chronicle of Higher Education reports that the NCTE's position paper is likely to be well received among critics of standardized testing, who often argue that such written assessments short-circuit the writing process. While most computer scoring will be used for elementary and secondary school assessments, grading of the written portion of the Collegiate Learning Assessment is already almost completely automated. Shermis contends that the technology will continue to be developed and its applications broadened.
"There's nothing we can do to stop it," Shermis told The Chronicle of Higher Education. "It's whether we can shape it or not."
Compiled by Aimee Hosler
"Does Not Compute," insidehighered.com, May 3, 2013, Zack Budryk
"English Teachers Reject Use of Robots to Grade Student Writing," chronicle.com, May 3, 2013, Dan Barrett
"Machine Scoring Fails the Test," NCTE Position Statement on Machine Scoring, ncte.org, April 2013