How Students Show What They Know
My wife appreciates psychodrama (today we’ll see another triumphant woman in a Nancy Meyers movie). I, on the other hand, enjoy a juicy psychometric drama. And we have a thriller in the making. A couple of agreements in the next few weeks will set the stage for the next decade of testing in America.
Here’s the plot outline:
- Moving goal posts: Common Core adoption will cause states to adjust student learning expectations
- High stakes: maturing state accountability systems use test scores to hold students and schools accountable
- Increased stakes: federal grant programs are encouraging links between test scores and teacher evaluation
- One camp of leaders interested in locking in high standards via traditional “national” tests
- Another camp of leaders interested in progressive systems incorporating performance assessment (kids demonstrate what they know)
- Exponential growth of online assessment, much of it embedded in games and learning software
- And about $4 billion at stake
Over the next two weeks, more than 30 states will wrap up their Race to the Top applications. They have been encouraged to focus on local formative and interim assessments (the kind that should help teachers improve instruction). Duncan would like to see the $350 million carved out for testing projects used for common summative assessments (i.e., end-of-year and end-of-course exams). But a group of state chiefs would like to use the big pot of cash to develop the next generation of assessment.
If this were a movie, it would include a couple of cutaways to other developed countries to illustrate the strange American reliance on bubble-sheet assessment. It might also show young people responding instantly to performance feedback while playing a virtual role-playing game.
I’m no psychometrician, but I’m confident that the $350 million for test development could be used to build several innovative next-gen assessment platforms that incorporate content-embedded assessment, online adaptive assessment, performance assessment, and more traditional summative assessment.
There are two endings to this movie. The likely ending is that agreements are made to reduce the number of variables in play, lock in higher common standards, and encourage widespread adoption of common traditional “national” tests. This will slow the race to the bottom (states making their tests easier to pass in order to boost passing rates) and set the stage for an ESEA bargain supported by a ‘college ready,’ business-friendly consortium that could split both political parties. In this ending, we’ll need to be cautious about the risk of using old psychometrics for three distinct purposes: student, teacher, and school accountability.
There are a couple of versions of the alternate ending, all a bit messier, including prizes and grants for innovative, mostly online testing systems that easily incorporate learning games, writing assignments, and science projects. One version includes a marketplace where assessment systems compete for customers (states, districts, and networks). A similar version involves several multi-state consortia collaborating on assessment systems: some innovative, some progressive, some traditional. This variegated version leads to a broader coalition of support for reauthorization but worries equity advocates and gap closers. How would the feds ensure that states keep the good-school promise?
We’ll soon know how this holiday drama will play out.
Zeev Wurman
As psychometric dramas go, this might have been an interesting script, except that we tend to like our dramas to be fact-based, and this one fails on that point.
"This will slow the race to the bottom" -- Fordham's 2007 "The Proficiency Illusion" clearly documented that there is no "race to the bottom" but, rather, a "walk to the middle." And for every state that pushed its standards down there are states that kept their standards high, like Calif. or Mass., or states that raised their standards, like South Carolina, or states that plan to raise their currently mediocre standards, like New York. In reality, the Common Core standards so far seem anything but high--they won't qualify students for admission to a 4-year college almost anywhere in the country. At best they will institutionalize mediocre goals across the whole nation--the classic danger of national standards and in direct conflict with American ideals of federalism and of states as the laboratories of democracy.
The belief that $350M will build us "several innovative next gen assessment platforms" is another departure from reality. Like the proverbial second marriage, it is the victory of hope over experience. Testing companies like ETS or the College Board have been trying to put together those "next gen" systems for years, and their failure to do so is not due to lack of funds. If anything, since NCLB there have been sufficient funds allocated for testing to pay for such systems many times over. The issue is the conflicting requirements of those assessments, which no psychometrician can reconcile in good faith, yet the politicians insist on them. Specifically, you can't have little or no "bubble sheets" (real or virtual) and at the same time reliably sample the assessed domain so it will hold up in court. You can't have all children assessed on their "grade-level" content (to avoid dumbing down the test for disadvantaged populations) and at the same time have adaptive assessment. And a few other wishful thoughts like that. As you graciously pointed out, you are no psychometrician. The $350M is a nice bundle for the beltway bandits, but at best it will settle the whole nation into a morass of mediocrity. At worst, we will not even have Massachusetts to show us what is possible. Reminds me of the engineering joke about NASA, when it changed its slogan to "faster, better, cheaper" -- "pick any two." One Columbia shuttle disaster later, this was not a joke anymore.
Replies
Tom Vander Ark
I support the idea of voluntary national standards because it's silly to have 50 different standards and assessment systems. It improves the opportunity to invest in learning platforms with multi-state adoption potential. And even if you dispute the 'race to the bottom,' national standards bring up the rear.
It sounds like we both agree that this is a political more than a technical issue. I agree that several firms have piloted forward-leaning assessment systems and that the lack of adoption is more a political failure than a market failure.
My main point is that there is a short window in which a political bargain could be struck to encourage development of next-gen assessment. The combination of $350 million in grants and the potential for multi-state adoption would be more than enough to support development of several new platforms (a family of aligned assessments on a common data warehouse).
Zeev Wurman
There is nothing particularly silly about having 50 sets of educational standards, any more than it is silly to have 50 sets of traffic laws, 50 state tax rates, 10,000 (or however many there currently are) sales tax rates, 50 sets of state laws, or, for that matter, 50 state legislatures. We happen to be called the "United States of America" rather than just "America" for a reason.
In principle it might make sense to have a good set of standards that states voluntarily join, or perhaps two or three competing sets of standards, like we have the SAT, ACT, and IB. In practice, what we seem to be getting is an exercise in mediocrity, and the states are adopting them not after a deliberative process but in a mindless rush, sight unseen, because of the economic crisis and the promise of a share in the $4 billion federal handout. Not much different from the way the idiocy of 55 mph was imposed nationwide, or the drinking age of 21.
As to your hope for 'next gen' assessment, I fear it will remain just that--a hope. The issue is not that the current assessments are bad. In fact, many are quite good. The issue is that any real assessment segregates students into winners and losers and allows us to pass judgment on the quality of schools and teachers. Nothing is more hateful to teacher unions and to most school administrators, and they will fight tooth and nail to undermine anything that comes their way, all the while loudly professing their deep interest in "assessment that is fair to students and teachers." They and their allies will simply use the cost of such 'next gen' assessment as an additional argument for abandoning it because "it wastes scarce money on meaningless assessment rather than spending it on children." Have you ever seen a test that FairTest liked? Do you truly believe that such a test can even exist? I wish I could join you in your hope. But I can't.
Replies
Tom Vander Ark
I appreciate your thoughtful response and also worry about the politics surrounding testing.
I'm most optimistic about content-embedded assessment that provides real-time feedback to students and teachers. As secondary schools blend online learning with onsite support, we'll have an ocean of data, some of which can produce scale scores and be included in progress models (e.g., a credit equals 12 completed modules plus a summative assessment).
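A minimal sketch of that kind of progress-model rule, assuming a hypothetical function name and a made-up passing threshold (neither is specified in the post):

```python
# Illustrative only: a toy version of the progress-model rule mentioned above
# ("a credit equals 12 completed modules plus a summative assessment").
# The function name and passing threshold are assumptions, not from the post.

MODULES_PER_CREDIT = 12       # from the example in the comment
SUMMATIVE_PASS_SCORE = 0.70   # assumed passing threshold, for illustration


def credits_earned(completed_modules: int, summative_score: float) -> int:
    """Return whole credits earned under this toy rule."""
    if summative_score < SUMMATIVE_PASS_SCORE:
        return 0  # no credit without a passing summative assessment
    return completed_modules // MODULES_PER_CREDIT


# Example: 26 completed modules and a 0.82 summative score -> 2 credits
print(credits_earned(26, 0.82))
```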
The next ESEA will shift from annual grade-level testing to growth models, creating an opening for the incorporation of more adaptive and content-embedded assessment.
Zeev Wurman
Content-embedded assessment is an excellent and irreplaceable tool for guiding classroom instruction. I am with you on having more of it and on improving what we have. But the hope that such formative assessment can ever serve as a summative tool for accountability purposes is just a groundless hope. All we know tells us it cannot be both. The inherent problem is that formative assessment cannot have accountability consequences without subverting its formative value. So you will end up with an assessment that may be helpful to teachers but that will also declare everyone a "winner in his or her own way." Teacher unions will be happy. Teachers will be happy. And the number of dead-end schools and high-school dropouts in disadvantaged communities will stay the same, or perhaps even increase.
Thank you for the exchange. Despite our obvious disagreement I found it very informative.