Go figure: one MIT prof thinks he can game one engine, and the New York Times and NPR run stories on his anecdotal claims while missing the evidence from many large scientific studies, including the study released this month.
Students coming up the writing learning curve don’t know how to put the right words together into good, coherent essays. Automated scoring systems help them do that. Highly educated people can take a good essay and make it worse by changing a few words. However, that doesn’t hide the fact that the person still knows the material. It is much harder for a student to take a poor essay and make it look good. The best way to do that is to learn the material. The best way to learn to write is to practice, and automated essay scoring helps teachers demand more writing.
Automated scoring engines are widely deployed in professional certification, SAT testing, and state testing programs, and are used in thousands of classrooms. The NYTimes article glosses over the fact that these systems are operational and that there are plenty of mechanisms for human oversight, including second human graders, backreads, and teachers reading the essays. The concerns raised may be the same concerns that arise when students write essays for human readers.
This year I’ve had the chance to speak with executives at nine assessment organizations on a weekly basis. Peter Foltz of Pearson Knowledge Technologies noted this week, “There are also ways to have the computer detect a number of problems of people trying to game the system. We have mechanisms that detect plagiarism, unusually creative, off-topic, larding large words, unusual wording. So many of these issues are already addressed in existing systems.”
Pearson will score about 10 million constructed responses this year and currently has about 24,000 teachers using its software in classrooms. Foltz said, “We don’t have teachers complaining about students gaming the system. The teachers are still reading the student essays, but they don’t have to monitor every draft. Instead, we hear about students learning to write and think.” Here are a few example uses of automated essay scoring:
- Starting in 2010, South Dakota replaced its 45-minute paper-and-pencil summative writing assessment with WriteToLearn, used as a formative assessment. All students in 5th, 7th and 10th grade are required to write using the software at least three times during the school year. Teachers decide which prompts are assigned to the students and how the writing is incorporated into their lesson plans. A study conducted last year and presented at the National Council on Measurement in Education conference in 2011 analyzed data from 255,000 student submissions. Students wrote an average of 3.5 drafts on each prompt. With five revisions, student scores increased by almost one point (on a six-point scale). Teachers receive immediate reports on individual students as well as the class and can use the results to work with students on areas needing improvement. With the old paper summative test, it took 3-6 weeks for teachers to receive any feedback. (Foltz, P. W., Lochbaum, K. E., & Rosenstein, M. B. (2011). Analysis of student writing for a large scale implementation of formative assessment.)
- East Palo Alto Tennis and Tutoring (EPATT) used WriteToLearn in an afterschool tutoring program for struggling students. One focus was teaching summarization skills: students read a text and then wrote a summary that was automatically scored for the quality of its content. Kesha Weekes, EPATT academic director, said that previously students would focus on the mechanics of writing (looking for mistakes in spelling, punctuation and subject-verb agreement), but they didn’t know how to focus on writing to show that they comprehend the content. “WriteToLearn shows them where they’ve omitted an area of content, so they’re taking a more critical look at their writing.” “They would never have done that before.” (Case study at: http://www.writetolearn.net/CaseStudies/EPATT-WriteToLearn-082007.pdf)
- WriteToLearn is being used at Cedar Shoals High School in Athens, Georgia as a way of helping ELL students master the English language. Students get feedback on six writing traits (ideas, organization, conventions, sentence fluency, word choice and voice) and can focus on several writing concepts at one time, as opposed to learning one concept now and then another later. For ELL writers, who come to writing in English with highly varied skills, it provides a more personalized way of learning where their strengths lie and how to improve in the other areas. (Case study at: http://www.writetolearn.net/CaseStudies/WTL-CedarShoalsCaseStudy-061209.pdf)
The Hewlett Foundation-funded Automated Student Assessment Prize (ASAP) was designed to create evidence that supports the use of automated essay scoring, so that states can affordably incorporate more writing on their annual tests and avoid tests limited to multiple-choice items. There will be additional field trials, but ASAP has provided sufficient evidence to support widespread use of automated scoring. States will be encouraged to be thoughtful about ways to combine expert grading and automated scoring to support high-stakes decision making.
A foundational benefit of the shift to personal digital learning that will occur in this decade worldwide is formative assessment that runs in the background all day long providing periodic structured feedback to every student. It will enable customized learning pathways. It will empower better teaching. It will boost motivation, persistence, and the quality of student work products. It will extend achievement and degree completion. That’s the real story.
Disclosures: ASAP is a project of Open Education Solutions, where Tom is CEO. Pearson is a limited partner in Learn Capital, where Tom is a partner.