Pathological Explanations for Learning Differences: Untangling the Signal from the Noise to Measure Ability in the Age of AI

Hand Isaac Newton a tablet for a physics exam today, and he would bomb it, his brilliance obscured by an inability to navigate a simulation. We educators commit this malpractice against neurodivergent students daily: we diagnose their minds as deficient instead of diagnosing our broken tests.

Myth vs. Science

Pamela Cantor, M.D., notes that 20th-century education rests on an assessment architecture designed to sort students through a deficit lens, anchored by myths that genes dictate destiny, talent is scarce, and capability can be cleanly ranked on a bell curve. Yet contemporary learning sciences show that human potential is profoundly malleable, abundantly distributed, and dynamically shaped by the environments we provide. Every child is wired to learn and grow.

Why do our systems tell a different story? Flawed measurement. Forcing neurodivergent learners into rigid standardized tests too often measures barriers, not cognitive ability. This can breed “pathological explanations,” falsely blaming learning differences rather than recognizing the limitations of the tests themselves.

Untangling the Signal from the Noise

In psychometrics, a poorly designed test obscuring a brilliant mind creates Construct Irrelevant Variance. In Architecture of Ability, we contend that quality measurement captures the “Signal”—the focal ability—and filters out the “Noise” of barriers and time limits. To address noise, we must embrace three foundational propositions.

Proposition 1: Fairness does not mean identical.

For a century, the testing industry assumed fairness meant standardization: forcing every student to complete tasks under identical conditions. This makes little sense for neurodivergent learners. Indeed, unidentified “nonequivalent surface conditions provide nonequivalent evidence about learners” (Mislevy et al., 2013).

Consider the spelling bee. The “Signal” is orthographic knowledge, but the format rigidly demands speaking audibly in front of an audience. For a student with a severe speech impairment, the noise of that barrier drowns out their spelling ability: the contest stops measuring what they know and instead measures the construct-irrelevant hurdle of their impairment. Allowing that student to type on a keyboard alters surface delivery without diluting the spelling bee’s rigor. Intentionally varying task interaction strips away irrelevant barriers, enhancing validity and giving each learner an unclouded lens through which to demonstrate capability.

Proposition 2: Equivalent surface conditions may not provide equivalent evidence.

Standardized testing assumes that identical interfaces yield accurate data. However, equivalent surfaces do not generate equivalent evidence. Consider an eye exam versus an alphabet test. If an optometrist measures visual acuity, standardizing a twenty-foot distance to the eye chart is prudent; visual distance is part of the measured construct. But if an educator evaluates alphabet knowledge, forcing a visually impaired child to read the board from twenty feet away is measurement malpractice. The visual demand introduces massive construct-irrelevant variance and becomes the primary “alternative explanation” for failure: the exam stops measuring the alphabet and measures eyesight instead. Forcing every student to use the exact same interface introduces construct-irrelevant barriers.
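The signal-and-noise framing above can be sketched numerically. The following is a minimal, hypothetical simulation (not from the article; all numbers and variable names are invented): when an irrelevant barrier penalizes some students’ observed scores, the correlation between observed score and true ability—one rough proxy for validity—drops, and it recovers when the barrier is removed.

```python
import random
import statistics

random.seed(0)

N = 1000
ability = [random.gauss(0, 1) for _ in range(N)]   # the "Signal": the focal construct
barrier = [random.gauss(0, 1) for _ in range(N)]   # an irrelevant hurdle, e.g., interface fluency

# A test whose format penalizes students who face the barrier:
observed_with_noise = [a - 1.5 * max(0.0, -b) for a, b in zip(ability, barrier)]

# A test whose delivery is varied so the barrier drops out (only small random error remains):
observed_clean = [a + random.gauss(0, 0.2) for a in ability]

def corr(x, y):
    """Pearson correlation, computed by hand for portability."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(round(corr(ability, observed_with_noise), 2))  # validity degraded by the barrier
print(round(corr(ability, observed_clean), 2))       # validity largely preserved
```

The point of the sketch is not the particular numbers but the mechanism: the barrier term is pure construct-irrelevant variance, and any scoring rule that lets it leak into the observed score weakens the inference from score to ability.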

Proposition 3: Principled variation can provide equivalent evidence.

Assessment design and development should embrace a third insight: intentionally adjusting surface conditions (e.g., task delivery) in evidence-based ways for different learners can provide equivalent evidence. To see why this principled variation can be essential, consider a modern educational computer simulation designed to assess physics, such as Newton’s Playground. For a tech-savvy student, this digital environment offers an engaging medium to demonstrate mastery of Newtonian mechanics. But imagine Isaac Newton himself subjected to this assessment: he would almost certainly fail. Despite having authored the theorems, Newton lacks the “Additional KSAs” (Knowledge, Skills, and Abilities) required to operate a computer. His physics genius (the Signal) would be obscured by the format (the Noise), yielding a false negative. By providing principled alternatives for how a student accesses and interacts with a task, we ensure that we measure underlying brilliance rather than fluency with our chosen delivery mechanism.
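Principled variation can be thought of as a mapping from learner profile to delivery conditions, holding the measured construct fixed. Here is a deliberately simplified, hypothetical sketch (the profile fields and format names are invented for illustration, not drawn from any real assessment platform) echoing the spelling-bee and eye-chart examples:

```python
from dataclasses import dataclass

@dataclass
class LearnerProfile:
    """Hypothetical learner profile; fields are illustrative only."""
    speech_impairment: bool = False
    low_vision: bool = False
    reading_fluency_low: bool = False

def choose_delivery(profile: LearnerProfile) -> dict:
    """Vary surface conditions to remove construct-irrelevant barriers.

    The construct being measured never changes; only the delivery does.
    """
    delivery = {"response_mode": "spoken", "text_size": "standard", "audio_support": False}
    if profile.speech_impairment:
        delivery["response_mode"] = "typed"   # spelling knowledge, not speech, is the construct
    if profile.low_vision:
        delivery["text_size"] = "large"       # alphabet knowledge, not eyesight, is the construct
    if profile.reading_fluency_low:
        delivery["audio_support"] = True      # on-demand audio scaffolding
    return delivery

print(choose_delivery(LearnerProfile(speech_impairment=True)))
```

The design choice worth noticing: the rules vary only surface conditions named in advance as construct-irrelevant, which is what separates principled variation from ad hoc leniency.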

The AI Fork in the Road

For decades, construct-irrelevant “noise” was a tragic flaw hidden in Scantron bubbles. Today, AI gives us unprecedented computational power to break the mold. Because AI can dynamically adjust a task’s delivery in real time (e.g., simplifying syntax or offering on-demand audio scaffolding), we finally have a scalable tool for personalizing assessment and driving out construct irrelevance.

But AI is, at its core, an inference engine. It will simply turbocharge whatever inferences we ask it to make. We are at a critical fork in the road, facing two divergent futures.

Path B: The Scaling of Harm

If we blindly feed broken testing approaches into algorithms, AI will automate subpar measurement. It will power “pathological inferences,” sorting neurodivergent learners into failure faster than ever and cementing incomplete narratives with algorithmic authority.

Path A: Meeting the Promise

If we tune AI to strip away the noise and isolate the signal, learners, educators, and families receive the individualized feedback required to pursue continuous improvement and human thriving.

If a test requires overcoming a barrier unrelated to the subject, it is a poorly designed test. By changing our psychometric blueprints, we can shift from a deficit mindset that pathologizes learners to a design orientation that empowers them. We must stop asking, “What is wrong with this student?” and start demanding, “What about this assessment must be improved?”


Mislevy, R. J., Haertel, G., Cheng, B. H., Rutstein, D., Vendlinski, T., Murray, E., Rose, D., Gravel, J., & Mitman Colker, A. (2013). Conditional inferences related to focal and additional knowledge, skills, and abilities (Assessment for Students with Disabilities Technical Report 5). SRI International.


Eric Tucker

Eric Tucker is President of the Study Group. He has served as CEO of Equity by Design, Cofounder and Superintendent of Brooklyn Laboratory Charter Schools (including the Edmund W. Gordon Brooklyn LAB School), Cofounder of Educating All Learners Alliance, Executive Director of InnovateEDU, Director at the Federal Reserve Bank of New York, and Chief Academic Officer and Cofounder of the National Association for Urban Debate Leagues. He has taught in Providence and Chicago.
