News Release “Automated Essay Scoring Systems Demonstrated to be as Effective as Human Graders in Big Trial”
Vancouver, BC – In a direct comparison, human graders and software designed to score student essays achieved virtually identical levels of accuracy, with the software in some cases proving to be more reliable, a groundbreaking study has found.
The study, which was underwritten by the William and Flora Hewlett Foundation and conducted by experts in educational measurement and assessment, will be released here on Monday, April 16th, at the annual conference of the National Council on Measurement in Education. An advance copy of the study is available today at http://bit.ly/HJWwdP.
“The demonstration showed conclusively that automated essay scoring systems are fast, accurate, and cost effective,” said Tom Vander Ark, CEO of Open Education Solutions, which provides consulting services related to digital learning, and co-director of the study.
That’s important because writing essays is one important way for students to learn critical reasoning, but teachers don’t assign them often enough because grading them is both expensive and time consuming. Automated scoring of essays holds the promise of lowering the cost and time of having students write so they can do it more often.
Education experts believe that critical reasoning and writing are part of a suite of skills that students need to be competitive in the 21st century. Others are working collaboratively, communicating effectively and learning how to learn, as well as mastering core academic content. The Hewlett Foundation calls this suite of skills Deeper Learning and is making grants to encourage its adoption at schools throughout the country.
“Better tests support better learning,” says Barbara Chow, Education Program Director at the Hewlett Foundation. “This demonstration of rapid and accurate automated essay scoring will encourage states to include more writing in their state assessments. And, the more we can use essays to assess what students have learned, the greater the likelihood they’ll master important academic content, critical thinking, and effective communication.”
For more than 20 years, companies that provide automated essay scoring software have claimed that their systems can perform as effectively as other available methods of essay scoring, and do so more affordably and faster. The study was the first comprehensive multi-vendor trial to test those claims. The study challenged nine companies that constitute more than ninety-seven percent of the current market of commercial providers of automated essay scoring to compare capabilities. More than 16,000 essays were released from six participating state departments of education, with each set of essays varying in length, type, and grading protocols. The essays were already hand scored according to state standards. The challenge was for companies to approximate established scores by using software.
At a time when the U.S. Department of Education is funding states to design and develop new forms of high-stakes testing, the study introduces important data. Many states are limited to multiple-choice formats, because more sophisticated measures of academic performance cost too much to grade and take too long to process. Forty-five states are already actively overhauling testing standards, and many are considering the use of machine scoring systems.
The study grows from a contest called the Automated Student Assessment Prize, or ASAP, which the Hewlett Foundation is sponsoring to evaluate the current state of automated testing and to encourage further developments in the field.
In addition to looking at commercial vendors, the contest is offering $100,000 in cash prizes in a competition open to anyone to develop new automated essay scoring techniques. The open competition is underway now and scheduled to close on April 30th. The pool of $100,000 will be awarded to the best performers. Details of the public competition are available at www.kaggle.com/c/ASAP-AES. The open competition website includes the prize rules, an active leaderboard with regularly updated results, and discussion threads between competitors.
The goal of ASAP is to offer a series of impartial competitions in which a fair, open and transparent participation process will allow key participants in the world of education and testing to understand the value of automated student assessment technologies.
ASAP is being conducted with the support of the Partnership for Assessment of Readiness for College and Careers and Smarter Balanced Assessment Consortium, two multi-state consortia funded by the U.S. Department of Education to develop next-generation assessments. ASAP is aligned with the aspirations of the Common Core State Standards and seeks to accelerate assessment innovation to help more students graduate from college and to become career ready.
Jaison Morgan, CEO of The Common Pool, a consulting business that specializes in developing effective incentive models for solving problems, and co-director of the study, said the prize and studies will raise broader awareness of the current capabilities of automated scoring of essays.
“By offering a private demonstration of current capabilities, we can reveal to our state partners what is already commercially available,” Morgan said. “But, by complementing it with a public competition, we will attract new participants to the field and investment from new players. We believe that the public competition will trigger major breakthroughs.”
ASAP is preparing to introduce a second study, in which private providers and public competitors will be challenged to reveal the capabilities of automated scoring systems for grading short-answer questions. The second study will be conducted this summer. There are another three ASAP studies in development.
Dr. Mark Shermis, Dean of the University of Akron College of Education and a noted expert on automated scoring, is the ASAP principal investigator and lead Academic Advisor. He is author of Classroom Assessment in Action and co-editor of Automated Essay Scoring: A Cross-Disciplinary Approach.
ASAP was designed by The Common Pool, LLC, and is managed by Open Education Solutions, Inc.
Kaggle helps companies, governments, researchers, and other organizations identify solutions to some of the world’s hardest problems by posting them as competitions to a community of more than 33,000 PhD-level data scientists located around the world.
For more see:
- Less grading, more teaching, deeper learning
- Deeper learning, not lighter journalism
- Getting Ready for Online Assessment
- Hewlett Sponsored Assessment Prize Draws Amazing Talent
- How Intelligent Scoring Will Help Create an Intelligent System
- Education leaders urge assessment innovation, not super test