Making Competency-based Assessment Work: Coherence & Rigor
Perversions of Rigor
The first section of this post frames ways that secondary schools have mislabeled rigor, ways that impact the arc of a single discipline like science over several years.
The second section explores what EduChange did to resurrect rigor via coherent systems design while building the competency-based assessment structures for our Integrated Science Program.
When I began to learn about the program of studies in secondary science in a small city in Massachusetts, a program I was hired to ‘reform,’ I saw various perversions of academic rigor that I would uncover again and again for the next two decades.
One perversion is the well-documented process of segregating students by tracking them into Honors, College Prep, Alternative and other designations.
The sorting process often begins in Grades 7-8, particularly in math classes, which then seems to dictate science course tracks.
This always amused me since we spend endless hours at EduChange incorporating applied math into our curriculum.
In truth, most of the so-called honors math students at any school I’ve worked with seem to be unable to walk down the hall to science class and remember how to do unit conversions or basic dimensional analysis.
Tracking is usually based on:
—grades, which are inherently inaccurate and inconsistent across teachers;
—teacher recommendations, which are inherently biased;
—and sometimes standardized entrance exams for high schools, which psychometrically include very large margins of error, particularly as it regards student development (new research forthcoming!).
Tracking is a well-known perversion of rigor and negates the growth model supported by competency-based assessment.
Schools cannot track students and employ competency-based assessment without creating conflicting messages, pathways and incentives within their assessment system.
The next perversion of rigor plagues entire secondary programs and involves early transitions to high/upper school.
There is a steep slope of academic difficulty in Grade 9 and sometimes Grade 8, often traversing too much ground across too many skill sets and concepts in too short a timeframe.
The wildly inaccurate message sent to students is that rigor is a function of volume: the total number of pages of reading assigned, the length of the required research paper, the number of problems to solve.
Harder teachers assign more work.
Schools that house Honors and College Prep students in the same classroom justify the weighted status by adding more work to the higher level.
Students willing to lose sleep, fight through the anxiety, and demonstrate compliance on every level are rewarded.
They learn to tolerate the volume and they carry the load throughout high school.
To whomever needs to hear this: doing more work per unit time may increase one’s efficiency, but has nothing to do with rigor.
Rigor is more aptly characterized by the negotiation of complexity; the analysis and evaluation of diverse viewpoints using evidence; problem-solving with empathy in the face of uncertainty; or designing for optimization.
In addition to lying to students about how well it prepares them for post-secondary endeavors, the volume-as-rigor approach fails to build lifelong learning dispositions by perverting the very process of learning itself.
Instead, the approach serves as the next sorting mechanism because students’ survival of this transition period proves they “can handle” AP or IB/DP courses later in high school.
I find there is truth to this, but not because these advanced courses are more authentically rigorous.
First-year undergraduate courses, particularly in STEM, share the goal of weeding out “unfit” students, and the process is more ruthless at the university level.
You can read a summary of how I worked with my Massachusetts colleagues to address rigor-as-volume and other programmatic issues back in the late nineties.
I wish I could say that those issues no longer exist, but our methods could easily apply to many schools today.
A third programmatic perversion of rigor occurs later in high school.
Authentic rigor plateaus after Grade 10, and students who successfully negotiated the rites of passage in Grades 8-10 are insufficiently challenged.
This phenomenon exists even in AP and advanced classes, where volume of work rather than developmentally-appropriate intellectual challenge continues to drive curricula.
Students were sorted for what purpose, then?
To do more of the same, just in different courses?
There is nothing more indicative of the factory model of schooling than this slog of low-grade task completion for it’s own sake.
And the perpetuation of homogeneity cannot possibly serve diverse populations.
I truly believe this plateau is partly responsible for the documented erasure of critical thinking gains by the end of high school.
Unfortunately, this upper-secondary plateau has persisted for so long, and is so veiled by the prestige of the exam courses society craves, that it becomes a lightning rod in conversation.
Speaking from personal experience, it is impossible to question the rigor of IB/DP, AP or favorite teacher-designed electives in Grades 11-12 without incurring the wrath of many adults.
Teachers, parents & administrators believe these courses represent the highest level of achievement at their school, and point only to the standardized exam as evidence of rigor.
Oddly, they justify the rigor of the course whether student scores are consistently high or low.
And the lack of an AP designation in a given subject at a given school effectively devalues the subject matter for older students.
The College Board, university admissions and tiger moms have marketed AP courses with unmatched efficacy: 359,120 students took AP exams in 1990 compared to 2,825,710 students in 2019, for a 687% gain.
However, my own review and observations of student work, teacher-designed tasks, chosen texts and observations of the day-to-day classroom experience in over 1100 classrooms across Grades 8-12 point to an absence of increasing intellectual challenge in the final two years of high school.
In addition to these longitudinal data, my article shares additional data on STEM readiness of USA high school students. This tragic missed opportunity is a prime motivator for EduChange’s assessment design work.
We’ll let you know when the next post is ready for you!
Resurrecting Rigor Through Coherence
We designed our way out of this inequitable, ineffective mess by building eight coherent systems that interface to complement each other so that a diverse range of students and teachers can grow authentically through rigorous, anxiety-free classroom experiences.
And, we designed all of it to help trapped schools break free from the ties that bind.
Now I’ll describe our designs for multi-year coherence, and how they support Vygotsky-esque incremental increases in rigor.
Coherence is not standardization, nor is it a free-for-all.
Coherent systems contain very distinct parts that remain stable, in our case the sequence of overall modules and our master rubrics.
Coherent systems also include flexible parts, which we have built into day-to-day classroom decision making, technology integration, disciplinary literacy, task rubrics and the diverse ways that students can achieve success via our assessment system.
Over the years, some teachers have disregarded the stable components, and of course the system falls apart for them.
Teachers are most familiar with the Pinterest approach to curriculum development because they are not provided with high-quality resources.
They must fend for themselves in—guess what?—too little time to design for coherence!
Feeling a real system evolve in your own classroom, and making modifications that jive rather than fight with it, is not intuitive for teachers but certainly can be learned.
We provide ongoing professional development to support individual teachers, most of which involves immersive, on-the-job experience rather than pull-out PD workshops.
EduChange employs six master rubrics, four of which are used in Level I and two supplemental rubrics that are added in Level II.
Every performance task is built using these rubrics and only these rubrics.
No specific content is used in master or task rubrics.
This summary image gives you a sense of how these four Level I realms apply to any content. It’s one reason why ~15% of our task and content designs may be updated annually.
There are just over 70 discrete competencies that may be recombined to build task rubrics.
It’s the right amount to ensure that students who are assessed in a balanced fashion can build learning profiles for each discrete competency over four years (i.e., many more do not fit into the science class time available in four years).
Multi-year coherence is rare in secondary programs, mainly due to one-year-one-subject conventions.
We have realized enormous learning benefits by opening up disciplinary architecture over 2-4 years.
Revisiting concepts, skills and dispositions in different contexts helps students appreciate their strengths, gain practice where they are less confident, and prepare to articulate specific, desirable competencies to future professors, employers or commanding officers. Here are the messages our assessment designs send to students.
It is critical to understand that all eight interfacing systems that deliver our program are designed in concert.
We do not design in a linear fashion, unlike most one-grant-one-module endeavors. Even if those same authors build more modules, their resulting sets almost never build sequentially or systematically to something bigger.
We believe coherent, growth-based learning systems should do just that.
Since students experience the program in a sequence as they progress through school, we gradually push them toward their ever-changing zone of proximal development.
We architect rigor in our performance tasks by gradually increasing:
—the complexity of the texts, problems, laboratory materials, or scenarios we use;
—the number of competencies students must demonstrate;
—the number of realms (across the master rubrics) that students must navigate;
—the time it takes to complete the task (accommodations notwithstanding);
—the prior knowledge and skills that students are expected to marshal; and
—the degree of student autonomy and subsequent responsibility for completion.
Students feel task rigor ratchet up alongside the coherence of the stable rubrics, which in turn gives them confidence to trust their growth.
If that is not the point of competency-based assessment, I’m not sure what is.
Ultimately, students take charge of their own developmentally-appropriate STEM project, conduct it while engaging in a series of peer review opportunities, and produce a complete scientific paper.
We have partnered with JEI, which provides a free, online journal for middle and high school students, replete with a team of Ph.D. candidates in different specialties who peer review the manuscript and offer feedback prior to publication.
Unlike a capstone, students may enter this process any time they believe they are ready to attempt it, and may do so more than once.
Next time I’ll discuss how the practice of biomimicry played a role in our assessment system designs…