Dr. Mark Shermis is Dean of Education at The University of Akron. He is the author of Classroom Assessment in Action and an expert on automated essay scoring. Shermis is the academic advisor to the Hewlett Foundation-funded Automated Student Assessment Prize (ASAP), a project managed by OpenEd. ASAP started with a February demonstration of current vendor capabilities. Dr. Shermis reported that the nine vendors scored thousands of student essays from eight data sets with high levels of agreement with expert graders.
This morning Dr. Shermis addressed a question from a reader of a recent NYTimes story (covered here) who expressed concern that online assessment programs may be “geared towards students who are middle-class, white, and less attention is paid to the interests, attitudes, and communication/language styles of people who might comfortably describe themselves using any of the following labels: Low-income, Black, Homosexual, Feminist, etc.”
Thank you for your letter of concern. First, you should know that automated essay scoring software will ultimately make it easier to administer more writing assignments, raising the literacy bar for all school children, not just the privileged students you describe in your opening paragraph. The technology is relatively inexpensive (about $15 per student per year when purchased through a school district), though it does require access to a computer. We are sensitive to the costs associated with computer access, but most schools that take advantage of the technology administer writing assignments in a school media lab.

Typically, the scoring systems are based on models derived from human raters. This can be a benefit or a bane. If the human rater scoring is biased against any of the groups you mention, that bias will be reliably reflected in the models that are developed. However, it is possible to identify the factors that may contribute to human rater bias and adjust for them in the machine scoring models. For example, Michigan’s high-stakes MEAP assessment instructs human raters to ignore expressions of non-standard English (e.g., Black English), but no matter how well the raters are trained, they will inevitably assign lower scores to essays that use non-standard English. It is theoretically possible to adjust the machine scoring models to be more consistent with the test’s instructions (that is, to ignore the dialect).

Finally, the writing instructional systems that use automated essay scoring typically have features designed to serve a heterogeneous student population. For example, Hispanic students can receive feedback in Spanish even though they may be writing their essays in English.
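The “models derived from human raters” point above can be sketched in a few lines of code. This is a toy illustration, not any ASAP vendor’s actual algorithm: it fits a one-feature linear model that learns to reproduce (invented) human scores from essay length, where real systems use many linguistic features. The point it demonstrates is the one in the letter: whatever pattern sits in the human scores, bias included, is exactly what the model learns.

```python
# Toy sketch of automated essay scoring (hypothetical data and feature):
# a model is trained to reproduce scores that human raters assigned to a
# sample of essays, then applied to new essays.

def word_count(essay):
    return len(essay.split())

# Hypothetical training set: (essay text, score assigned by human raters)
training = [
    ("Dogs are nice.", 1.0),
    ("Dogs are nice because they are loyal and friendly companions.", 2.0),
    ("Dogs make excellent companions. They are loyal, friendly, and "
     "protective, and caring for one teaches children responsibility.", 3.0),
]

xs = [word_count(essay) for essay, _ in training]
ys = [score for _, score in training]

# Simple least-squares fit: predicted score = intercept + slope * word_count
n = len(xs)
x_mean, y_mean = sum(xs) / n, sum(ys) / n
slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
         / sum((x - x_mean) ** 2 for x in xs))
intercept = y_mean - slope * x_mean

def predict(essay):
    """Score a new essay with the model learned from the human raters."""
    return intercept + slope * word_count(essay)

# Any systematic bias in the human scores above would be learned and
# reliably reproduced by this model -- the "benefit or bane" in the letter.
```

Adjusting for rater bias, as described above, would amount to re-weighting or removing the features (here, only word count) that correlate with the unwanted rater behavior before fitting the model.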
For more, see these GettingSmart posts: