A Few Thoughts about Teacher Evaluations without Testing Data

There’s a bill on MI Governor Gretchen Whitmer’s desk that would—among other good, constructive things—end the mandated practice of evaluating teachers using their students’ test data. Most recently, 40% of a MI public school educator’s annual evaluation was tied to standardized test scores. Getting rid of that (and the myriad ways that it was subverted, in practice) is definitely a policy upgrade. Hallelujah.

The next question is always: What will replace test data-based teacher evaluations? And that’s a vastly more interesting topic—how and why should teachers be evaluated? You need both the how and the why to think out of the checklist-of-skills box here.

Yes, teachers’ work must be monitored and assessed, to meet a minimal level of competence, as a critical, publicly funded service. But—are there ways to enhance teachers’ effectiveness embedded in an evaluation process? Is it possible to improve teachers’ practice in the process of checking on them? That’s the gold standard. But is it reality?

From a pretty good piece on the proposed changes in teacher evaluation in Michigan:

Districts would be able to use their own criteria for evaluating teachers, such as classroom observations, samples of student work, rubrics, and lesson plans. Critics in the legislature (and let’s identify them by name: Republicans) say this plan is merely a return to what used to be: the good ol’ days when teachers got away with substandard practice because there were no teeth in district evaluation processes. But there is something to be said for evaluation criteria tailored to a teacher’s context and teaching assignment. After a couple of years, most teachers could, for example, write their own goals, and a first grade teacher’s goals around student literacy might be very different from the HS art teacher’s portfolio of student work samples.

The bills would de-emphasize evaluations as a factor in districts’ decisions to fire or demote teachers or deny them tenure. Districts seem to have little difficulty firing or shaming teachers these days, for all kinds of stupid, politicized reasons. For teachers lucky enough to have tenure, it’s not the guarantee of a job for life, as the right wing might have you believe. If a district wants and needs to fire a teacher, it’s possible to do that without testing data—in fact, taking away the test score excuse puts the onus on a district to do the work of gathering evidence of a teacher’s unsuitability.

The bills would require districts to take action against teachers who don’t improve after repeated interventions.  As districts should, even in a time when qualified teachers are not thick on the ground.

Michigan’s law on test scores and evaluations grew out of a push for greater accountability in education that began in the 2000s. Some advocacy groups theorized that more rigorous reviews would generate detailed feedback that could be used to improve teachers’ performance. The language here is interesting—“some advocacy groups” and “detailed feedback.” Not much happens when an education advocacy group has a good idea (like using a non-threatening evaluation and mentoring process to build teacher capacity). The groups that get traction for their ideas do so through legislation. And only legislators bent on proving that lack of accountability is the problem would believe that standardized testing data represents detailed feedback on how well a teacher is doing in the actual classroom.

Critics are concerned that de-emphasizing student test scores could lower standards for teachers while students are still struggling to recover from pandemic learning loss and need high-quality instruction. This concept—“lowering standards for teachers”—is comical in a time when districts are thrilled to be re-hiring retirees, rather than relying on uncertified long-term subs. And shame on any media outlet for buying into the idea that students are suffering from learning loss (as measured by the same testing data that doesn’t identify outstanding teaching). Of course, students need high-quality instruction. They have always needed high-quality instruction. Let’s figure out a way to give them more of it.  

Proponents of returning to the old evaluation method say there is no evidence to suggest the current system benefits students, and that tying ratings to test scores contributes to burnout amid persistent teacher shortages. Bingo. Although it’s optimistic, one can hope that districts will take the opportunity to modify their ‘old’ evaluation systems, since tying ratings to test scores didn’t get our hard-working teachers to miraculously raise either one. Ten years of data analysis, no uptick in scores.

Let’s go back to the gold standard: an evaluation process that would improve teacher practice. It’s a huge—and evergreen—issue in building teacher capacity, and there is no shortage of, umm, advocacy groups that would like to take on that challenge (and then sell their thinking, materials and workshops to districts).

I’ve always been a fan of the National Board for Professional Teaching Standards’ assessment, but it’s complex, time-consuming and expensive, and needs to be renewed every five years. Moreover, it was created for experienced teachers—an opportunity to stretch and grow and be recognized for your exemplary practice. It’s not designed for novice teachers, and it’s especially not useful for teachers who are struggling.

It’s worth noting that tracking student test data to teacher performance is not only useless, but expensive. Freeing up districts to create their own, multi-phase, divergent forms of evaluation could be a cost savings (but a lot of different work). While buying an off-the-shelf evaluation system is the easiest option, a high-functioning district, with adequate staffing, might be able to do some good work, building a system that runs from an intentional induction phase, to using skilled veterans to partner with newer teachers.

When I think about teacher evaluation, I remember one of Deming’s 14 points for management: Drive out fear. If changing the law does nothing more than that, it will be a success.

The bill dumping test scores as a mandated factor in teacher evaluation is part of a wave of edu-changes including making charter school teacher salaries public, eliminating third grade retention for students testing below grade level in reading, ending the practice of giving schools letter grades and giving underwater districts debt relief.

All of these will matter in the fight to improve and strengthen public education. Just as much as teacher evaluation.

Teachers—or Teacher Unions? Or maybe—Neither.

You see it all the time, in the media.

How Teacher Unions became a Political Powerhouse

Republicans grill teachers’ union head on COVID classroom closures

How Teacher Unions Failed Students during the Pandemic

And this nasty little bit of hyperbole:

How the Teachers Union Broke Public Education

Those unreasonable, greedy, demanding teachers (umm, unions) insisting on masks and ventilation during a lethal global pandemic. Boldly asking for wage increases that bring them closer to other employees with college degrees and a desirable skillset.

But what about that delightful third grade teacher who let your shy daughter know that her drawings and poems were amazing, building her confidence? Or the HS Math teacher who wrote four letters for your son, getting him into Michigan Tech, his life’s dream?

Well—those are individual teachers. The good ones. Not the union. Which is evil. (Since sarcasm often doesn’t translate well in blogs, I am compelled to point out flaws in the “teachers aren’t unions” dichotomy.)

A few points:

  • “The union” is made up of teachers, not “bosses” or (insult alert!) “thugs.” Teachers. Local unions are led by local teachers, a large majority of whom are also full-time in the classroom.
  • Only 31 of the 51 jurisdictions (the 50 states plus D.C.) have collective bargaining privileges. While other states have chapters of professional associations, including but not limited to affiliates of the NEA and AFT, bargaining is limited or prohibited. Associations exist to protect teachers and provide things that teachers need, from insurance to professional development, things they would get under a collective bargaining agreement.
  • In states with stronger unions and collective bargaining privileges, the bargaining happens at the district level, often between employees of the district—colleagues. Which is as it should be—making joint decisions about best use of available resources, in the best interests of both the students and the adults who organize and deliver education. Of course, this process is messy and fraught, but tax-supported public goods and services are often messy. It’s called democracy.
  • Things that are good for teachers (a health-conscious environment, adequate materials and resources, an orderly school climate, a threat-free atmosphere, respect for teacher judgment) are also good for all kids.
  • Who to fire first in an economic downturn?  The temptation to fire the most expensive employees is always present, in any business. Experienced employees often cost more; there are reasons experienced folks are kept on—their ability to manage difficult customers or tolerate uncertainty. Sometimes, it’s a matter of honoring loyalty and accrued skills.

So the Mackinac Center is dead wrong when it writes:  Merit pay systems allow a school district to pay teachers according to their performance. The teacher who performs well and teaches students effectively is likely to be rewarded with higher pay. The teacher who consistently underperforms is dismissed.

Measuring teacher performance via test data is impossible. Tests and scores are deeply flawed. And one family’s genius teacher who saved Jason is another family’s weirdo with a ponytail.  There are teachers who underperform, even teachers who should be fired. And that decision should be made by the district that hired the teacher, not a grid comparing student testing data. Pitting teachers against one another for salary bonuses is a recipe for disgruntlement. And invites cheating.  Not to mention shutting down the already-shaky qualified teacher pipeline.

So why are politicians (OK, Republican politicians) claiming we need to break the back of the teachers’ unions? How can they praise individual teachers as essential workers but excoriate the associations that represent them? Isn’t that incoherent thinking?

I was struck by a post this week from Representative Brian Mast (R-FL), claiming: Unions worked hard to keep parents out of their children’s classrooms and have gone so far as to treat concerned parents as domestic terrorists for speaking up at school board meetings.

Mast pumps up the House Republicans’ Parents Rights bill:

Here are the five basic rights the House Republicans outlined:

  • Parents have the right to know what’s being taught in schools and to see reading material.
  • Parents have the right to be heard.
  • Parents have the right to see the school budget and spending.
  • Parents have the right to protect their child’s privacy.
  • Parents have the right to be updated on any violent activity at school.

So here’s the thing. Parents have always had the right to know what’s going on in their public schools, and have always been invited to attend school board meetings (unless the people THEY ELECTED are meeting in secret—in which case, it’s not a Congressional problem). They have always been able to share concerns about curriculum—from constructivist Math to Sex Education—and vote on school taxation initiatives. I only WISH that more parents were worried about protecting their child’s academic testing data—the scariest privacy issue in 2023.

School administrators and board members loathe being publicly called out or yelled at; they are forced to be responsive to parent commentary—it’s their job.

And very little of this (the rights of parents) has anything at all to do with local teacher unions, who function as a convenient scapegoat, a collective noun that allows those who would like to see public education destroyed to point fingers at someone, anyone, and call them a terrorist.

For shame.  

Would You Recognize a Good Lesson If You Saw It?

Here’s a scary headline: Michigan Democrats Look to Change Teacher Evaluation System.

Not so much the “Democrats” part—although I’d argue that not having a clue about evaluating teachers is common in both parties—but the implication that way fewer than 99% of public school teachers are doing acceptable work:

Consider: During the 2021-2022 school year, 99 percent of Michigan teachers were ranked either highly effective or effective on evaluations.

State Rep. Matt Koleszar, D-Plymouth, chair of the House Education Committee, told Bridge Michigan the state’s teacher evaluation system often leads to school administrators “checking a box” as they monitor teachers rather than using the process to help struggling teachers improve.

“I think when you have a better evaluation system and you’re supporting someone who needs that help and needs (those) resources, that ultimately is going to (filter down) to the student.”

I am decidedly NOT a fan of basing any percentage of a teacher’s evaluation on standardized test scores (it’s 40% in Michigan, under our current, Republican-developed system). And I am a true believer in the statement that teacher practice can be improved—and a good evaluation system (plus—key point—the time, trained personnel and resources to implement such a system) could help.

With so many moving parts, and the current handwringing (and bogus data) around low test scores in students emerging from a global pandemic, re-doing teacher evaluations which might be in place for decades seems precarious at the moment.

The questions, really, are: What are we looking for, in a teacher? What skills and qualities do good teachers exhibit—and are they measurable, with the tools we currently use? What outcomes are most critical for students—and what (easily measured) outcomes disappear quickly?

When the legislature can agree on answers to these questions—with input from the education community and invested parents, of course—let me know. Cynicism aside, how do we streamline teacher evaluation in ways that make it easy to capture and share expertise, help promising teachers build their practice, and excise the folks who shouldn’t be there?

There is, by the way, no shortage of ideas and research around teacher improvement; our international counterparts are already doing a better job of this. Anyone who’s looked at Japanese Lesson Study models, or meta-analyses on building effective learning environments, knows this. But viable teacher evaluation systems that also build capacity will not come about with a new written tool or protocol. It will take a new mindset.

Because I spent many years looking at videos of music teachers, while serving as a developer for the National Board’s music assessment, I also understand that there are limitations in evaluating teachers by observing their lessons.

For example: You have to know what the teacher’s learning goals were, going into the lesson, and have some context around who’s in the room. The core competency for nearly all teaching is knowing the students in front of you. You can’t build effective lessons without that knowledge. And that’s hard to evaluate.

I used to teach with a man who didn’t bother to learn the students’ names, because the classes were large—60 or more. His rationale was that learning names was time that could be better spent delivering content. He delivered a whole lot of content, all right, but never got great results, because there was no human relationship glue inspiring students to use that content.

Try to put that into an evaluation tool.

Dr. Mary Kennedy, one of my grad school professors, had a video library of teachers teaching. She would usually show two videos, and then ask us to compare and contrast—and roughly evaluate.  One pair of videos (and discussion) that I remember:

  • A man in a Hawaiian shirt, cargo shorts and flip-flops is facilitating a hands-on science experiment with a half-dozen groups of middle school students, clustered around lab tables. The room is noisy as students manipulate equipment and fill out lab reports, but the teacher is wearing a mic that picks up his comments and students’ questions as he moves from table to table. Several times, when students ask a direct question, he turns it back to them—What do YOU think? Why? Once, he claps his hands and asks the entire class to re-examine the stated purpose of the experiment. There is a beat of quiet, and then students are back to talking and writing. The video picks up students who appear to be off-task, as well, looking at the camera or talking to someone at another table.
  • A young woman is teaching a HS literature class. She is well-dressed and very articulate. The video begins with a Q & A exchange about the assigned reading, with a young man wearing a navy blazer and tie. The questions probe facts from the text—Who is the real victim in this chapter? Does this take place before or after the barn-raising scene and why is that important to the narrative? —and the young man has clearly done the reading, as his answers are all correct. The camera moves back and we see there are about eight teenaged boys in the class, all in blazers. She cold-calls the students, in turn, and they all answer her questions correctly. Other than the questions and short answers, the class is silent.

After watching the two videos, Dr. Kennedy asked: Which was the best lesson? Who was the best teacher? The class was vehemently divided, and remember, these were all graduate students in education. Imagine showing two similar videos to a legislator or one of the Moms 4 Liberty, then asking them to pick out the “best” teacher.

Ironically, the current quest to limit controversy and hot topics in public school classrooms makes it even more difficult to evaluate teacher practice. The best lessons—the ones that stick—are often messy and hard-won. And our best teachers—articulate, student-focused and creative—are being shut down by the very people designing their evaluation procedures.

We used to laugh at the inadequate teacher evaluation checklists—Is the teacher dressed neatly and well-groomed?—prevalent in the 1970s. But we haven’t solved the problem of how to evaluate all teachers fairly and productively. Yet.