What if Proficiency is not Proficient? The Case for Calibration

Key Points

  • Calibration is critical for ensuring consistent, equitable assessments in Competency-Based Education (CBE), helping schools maintain coherence and trust.

  • Involving students in calibration fosters learner agency, allowing them to recognize and achieve quality independently, a crucial skill for lifelong learning.

Here’s an uncomfortable question: If three teachers in your building assess the same student work using the same rubric, will they arrive at the same proficiency rating?

If you hesitate, you are not alone. And if you're implementing personalized competency-based assessment without calibration, that inconsistency undermines everything: systems building toward coherent, personalized learning end up undermining themselves through inconsistent judgment. We can build beautiful “learner-centered” tools and resources, but without a shared interpretation of quality, we are building on sand. 

When a parent questions differing scores for similar work, when a student transfers between classrooms, or when equity concerns surface, the structure defaults back to the inconsistency we were trying to solve.

Calibration isn’t the secret ingredient that makes CBE work. It’s the foundation and the maintenance schedule. And if we’re serious about learner agency, it’s ultimately what we support students in doing for themselves.

Grading Islands

Too often, schools treat calibration as an optional workshop. Calibration is not about “grading the same.” It’s about building shared meaning around what quality looks like. It’s about creating portability of learning evidence across classrooms. It’s about trust in our judgments, in our assessments, in our system. It’s what makes a diploma, and the grades behind it, mean something.

Without it, every classroom can become its own island. Students don’t know what they are aiming for, and the target shifts depending on who is holding the rubric. Teachers? They are left carrying the entire mental load of defining quality alone, classroom by classroom. Educational leaders come to see systemic data as unreliable. Without calibration, school-wide data is just a collection of individual opinions, making it nearly impossible to defend grading decisions to parents or identify true achievement gaps. CBE doesn’t fail loudly; it fails quietly, one inconsistent rating at a time.

Grading against evidence of learning, commonly referred to as Evidence-Based Grading (EBG), has long been a transformative practice for creating equitable learning environments. Standards-Based Grading (SBG) and EBG are used interchangeably in the education landscape, although we observe that SBG often focuses on the input (standards) while EBG focuses primarily on the output (evidence). Regardless, both require standards and evidence measured against those standards, and both are foundational for personalized competency-based learning systems. Focusing on learning evidence rather than points shifts how learning is reported, so grades truly reflect what students know and can do. This belief in practice, a foundational design principle, is at the core of CBE. Supporting this work, educators use professional judgment to interpret patterns, and students can reassess as they grow toward mastery. 

What does this look like in practice? Consider an orchestra. Musicians don’t tune their instruments in September and call it a year. They tune before every rehearsal and for every performance. Tuning isn’t a special event; it’s embedded in the practice. It’s how the music stays coherent. Calibration works as a continuous loop: apply a rubric, compare the results, refine the criteria, and adjust instruction.  

Peer Calibration

We don’t need a district-wide mandate to start calibrating. We just need one colleague and 15 minutes. 

  1. Hand-Off: Pick a piece of student work from a recent “middle of the pack” collection. Print or make a copy and block out the student’s name to ensure an unbiased review.
  2. Score it Blind: Without sharing your assessment, ask your colleague to score it using a shared rubric or scoring guide.
  3. Debrief: Compare your scores. If you disagree, this is the gold mine. Don’t try to win. Instead, ask “What did you see that I may have missed?” or “Is our rubric language too vague here?” This 15-minute conversation helps improve practice, refine the system, and ensure every student receives equitable, high-quality feedback.

Student Calibration

How Shared Judgment Becomes Learner Agency

The ultimate calibration boundary isn’t the Portrait of a Graduate; it’s the learner. If shared interpretation stops with educators, students are still on the outside looking in. But when learners develop their own discernment, the system becomes something they can carry with them.

When students are brought into this process, something shifts. They study examples and start recognizing quality across different levels. They practice with peer work first before turning that same lens on their own work. Eventually, they are revising independently, not waiting for someone to tell them what is good or proficient.

Self-assessment without shared understanding is an activity in guessing. But when students develop an eye and ear for proficiency, reflection becomes meaningful, and lifelong learning becomes possible. 

Consider the following.

  • Test the Language: If students can’t use the rubric to reach the same conclusion as their peers, the language needs to be more student-friendly.
  • Diagnostic Power: Discrepancies in student-led calibration provide instant formative feedback to the teacher on where the skill gap lies.
  • Deepen Agency: This practice teaches students how to review their own work objectively, turning “feedback” into a skill they own rather than a comment they receive.

AI Assist Calibration

While traditional calibration typically involves one or more educators comparing proficiency determinations, AI tools can play a creative supporting role. We prompted an AI tool with the following: 

Attached is an 11th-grade essay and an essay rubric. I am working to calibrate my assessment. Can you assess this paper thirty times using the essay rubric and give a data table of all thirty assessments and a summary table of your assessment calibration?

The outcomes provided useful information on assessment criteria: those with greater scoring variation (more subjective language) and those with less (more objective). The power of this analysis was to identify weaknesses in the criteria that might lead to greater difficulty in proficiency assessment.
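This kind of summary is straightforward to reproduce outside the AI tool. The sketch below (with made-up scores; the criterion names and the 1–4 scale are illustrative assumptions, not from our actual rubric) shows how repeated assessments of one essay could be summarized to flag the criteria with the most scoring variation:

```python
# Hypothetical example: summarize repeated AI scorings of one essay to see
# which rubric criteria produce the most variation. These scores are
# invented for illustration; in practice they would come from the AI
# tool's thirty assessments.
from statistics import mean, stdev

# criterion -> scores from repeated assessments (1-4 scale, assumed)
scores = {
    "Thesis & Focus":     [3, 3, 3, 3, 4, 3, 3, 3, 3, 3],
    "Evidence & Support": [3, 2, 3, 4, 2, 3, 4, 2, 3, 3],
    "Organization":       [4, 4, 4, 4, 4, 4, 3, 4, 4, 4],
    "Voice & Style":      [2, 4, 3, 2, 4, 3, 2, 4, 3, 2],  # wide spread
}

# mean and standard deviation per criterion
summary = {
    criterion: (round(mean(vals), 2), round(stdev(vals), 2))
    for criterion, vals in scores.items()
}

# Criteria with the largest standard deviation are the likeliest to
# contain subjective rubric language worth rewriting.
for criterion, (m, s) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
    print(f"{criterion:20s} mean={m:.2f} sd={s:.2f}")
```

In this invented data, “Voice & Style” would surface first: its scores swing between 2 and 4, exactly the signature of criterion language vague enough that even the same rater (human or AI) reads it differently each time.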

Given the importance of calibration and the extended time commitment required, AI-Assisted Calibration delivers the efficiency schools and districts need to complete this critical work.

What Leaders Can Do

If you’re implementing Evidence-Based Grading, calibration isn’t optional. It’s the infrastructure that makes Personalized Competency-Based Learning work. Leaders need to treat aligned judgment as essential, not a nice-to-have. Consider adding these questions to your conversations with teaching staff and administrators:

  1. The Reality Check: “On a scale of 1–10, how confident are we that a student’s grade in this building is independent of which teacher they were assigned? What evidence do we have to support that number?”
  2. Identifying the Silos: “Where in our current schedule do teachers have the ‘permission’ and the time to look at student work together rather than just discussing curriculum or logistics?”
  3. The Tool Audit: “Look at our most-used rubrics. Are the descriptions of ‘Proficiency’ specific enough to pass the ‘Blind Swap’ test, or are they filled with subjective words like mostly, often, or good?”
  4. Student Ownership: “If we walked into a classroom tomorrow and asked a student, ‘How do you know if this work is a Level 4?’, would they describe a specific quality of work, or would they say, ‘Because my teacher likes it’?”

Protect time for collective review of evidence. In larger CBE systems, this IS the work, not an add-on. Model curiosity over certainty; disagreement is productive and reveals where rubrics need refinement. Normalize this work as ongoing rather than episodic. Systemic calibration is strongest when it is scheduled, not squeezed in. Invest in clarity around your rubrics. Provide strong exemplars that support both teacher and student understanding. Signal that human judgment, collectively refined, is valued. 

Rebecca Midles

Rebecca Midles is the Chief Impact Officer at Getting Smart and an innovator in competency education and personalized learning, with over twenty years of experience as a teacher, administrator, board member, consultant, and parent.

Nate McClennen

Nate McClennen is CEO of Getting Smart. Previously, Nate served as Head of Innovation at the Teton Science Schools, a nationally renowned leader in place-based education, and is a member of the Board of Directors for the Rural Schools Collaborative. He is also the co-author of The Power of Place.

