ETS Advice on Automated Scoring

ETS assessment scientist Randy Bennett recently released advice to states on automated scoring. Race to the Top funding of state assessment consortia triggered Dr. Bennett’s paper:

The Race to the Top assessment consortia have indicated an interest in using “automated scoring” to more efficiently grade student answers to constructed-response literacy and mathematics tasks. Automated scoring refers to a large collection of grading approaches that differ dramatically depending upon the constructed-response task being posed and the expected answer. Even within a content domain, automated scoring approaches may differ significantly, such that a single characterization of the “state of the art” for that domain would be misleading. This white paper identifies potential uses and challenges around automated scoring to help the consortia make better-informed planning and implementation decisions. If the benefits sought from constructed-response tasks are to be realized, automated scoring must be implemented and used thoughtfully; otherwise, automated scoring may have negative effects strikingly similar to those commonly attributed to multiple-choice tests.

One important statement in the paper frames current capabilities: “A machine does not read, understand, or grade a student’s essay in the same way as we like to believe that a human rater would. It simply predicts the human rater’s score.” That is true of current scoring engines, but there is clearly potential for intelligent scoring engines and innovative items that measure knowledge and skill directly rather than simply predicting human scores. An equally important development, driven by a dramatic increase in assessment output and advances in data mining, will be new strategies for comparing big data sets.

Dr. Bennett’s paper, Automated Scoring of Constructed-Response Literacy and Mathematics Items, provides seven recommendations:
1. Design a computer-based assessment as an integrated system in which automated scoring is one in a series of interrelated parts
2. Encourage vendors to base the development of automated scoring approaches on construct understanding
3. Strengthen operational human scoring
4. Where automated systems are modeled on human scoring, or where agreement with human scores is used as a primary validity criterion, fund studies to better understand the bases upon which humans assign scores
5. Stipulate as a contractual requirement the disclosure by vendors of those automated scoring approaches being considered for operational use
6. Require a broad base of validity evidence similar to that needed to evaluate score meaning for any assessment
7. Unless the validity evidence assembled in Number 6 justifies the sole use of automated scoring, keep well-supervised human raters in the loop
This is sound near-term advice on the use of automated scoring in state testing systems. But for a more forward-leaning view of the opportunity, see my interview with Cisco’s John Behrens, who believes there is no excuse for sucky items. Automated scoring will make it possible for states to administer better tests less expensively. The real advance will come when assessment moves into the background.

For more, see Pearson’s Next Generation Assessment site and resources associated with their five recommended steps for state testing directors and policy makers.

