How to help teachers improve? A new system of in-depth observation by trained evaluators and principals, soon to be required in schools across Washington, shows what can help.


After playing outside on a blustery December morning, the toddlers at Happy Tots Childcare in Renton plopped down for circle time.


Education Lab is a Seattle Times project that spotlights promising approaches to persistent challenges in public education. It is produced in partnership with the Solutions Journalism Network and is funded by a grant from the Bill & Melinda Gates Foundation.


What had they noticed in the yard, their teacher, Benaz Amin, asked.

“Wind?” she prompted. “Remember? It blew all the toys around.”

They smiled, but said nothing, so she moved on.

Watching from a corner of the room, evaluator Elaine Jackson made a mental note: Amin had missed an opportunity for the kind of conversation that builds learning.

Amin might have asked the children, for example, what the wind did to the trees or whether they had ever lost power at home.

Jackson is part of a growing effort to solve a tough problem: how to accurately and fairly critique teaching in ways that help teachers improve.

The art of teaching has long been considered something of a black box — a matter of personal style, intuition and philosophy that couldn’t be defined, much less reliably measured.

But now the lid is starting to come off.

Trained observers like Jackson — armed with elaborate guides that describe what good teaching looks like and how to rate it — are changing the way teachers are evaluated, not only in preschool but in K-12 classrooms.

These new, in-depth observations are replacing or supplementing the ways teachers have been judged in the past, most often with superficial visits by school principals.

The goal is to make teacher evaluations more objective than a principal’s opinion and more useful for self-improvement than a ranking based on student test scores.

The state has been rolling the new guides out slowly — both in a new quality rating system for early childhood education and in public-school districts.

By the fall of this year, all K-12 principals must, under state law, judge teachers using one of three guides, which are all based on teaching practices with a track record for effectiveness.

In the early education world, the state is using a fourth tool, called CLASS, which is backed by some of the deepest and highest quality research to date showing that it can improve both teaching and student achievement.

“We’ve now done four different experimental trials for this from preschool to high school and it’s worked every single time to improve the quality of interactions of the classroom and improve children’s learning,” said CLASS co-creator Robert Pianta, dean of the University of Virginia School of Education.

Focus on interactions

More than 20 years ago, Pianta and his colleagues set out to answer basic questions: What matters most in preschool and how can it be measured reliably?

Preschools traditionally had been judged by whether their electrical outlets had safety plugs and other stuff that was easy to count, like books on the shelves and blocks in the bins. Teaching quality? Not so much.

“Up until that point, there were observations done all the time, but they were observations of stuff in the room,” Pianta said. “There were not observations of what the teachers were doing. So the CLASS really stood that on its head.”

Elaine Jackson from the University of Washington observes an early childhood program using the CLASS, a research-based tool that measures interactions in early education settings. (Mike Siegel / The Seattle Times)

Pianta, a psychology professor and former seventh-grade teacher, focused on the way that children learn from their parents — their first and most important teachers.

He and his colleagues already knew from decades of research that strong parent-child bonds encourage children to venture from the safety of a warm lap to learn about their world.

Those moment-to-moment interactions between parents and children lay the foundations for skills that become even more important later, such as paying attention, persistence and self-control.


So they extended that idea to teacher-child interactions, categorizing them into three broad areas based on what research has shown to be key to children’s emotional, social and intellectual growth.

In the emotional-growth category, they listed interactions that make children feel welcome and safe, recognize their needs and respond to their interests.

In the second, they placed examples of good social instruction, in which teachers prevent or defuse misbehavior and orchestrate smooth transitions so kids aren’t standing around.

The third — which was hardest for most teachers — focused on instructional interactions that boost intellectual growth.

But evaluators do more than just count the number of open-ended questions a teacher asks or how often they connect concepts to something in a child’s everyday life.

“You can’t gauge it on the teachers’ behavior in isolation,” Pianta said. “You have to be able to watch the impact of the interaction on the child and then watch what the teacher does in response to the child.”

Once the CLASS researchers had their yardstick, they tested it in two large-scale studies and found that it was a good gauge of strong instruction.

They found that students entering kindergarten scored higher on tests of language and cognitive ability when their teachers had scored higher on CLASS.

Many other studies of CLASS — though not all — have found similar associations.

The team has since developed similar tools for elementary, middle and high schools — although not soon enough to be included among Washington’s approved guides.

One surprising finding: In the tens of thousands of classrooms that CLASS researchers have observed, they see the same pattern — teachers score better on supporting emotional growth and managing behavior than they do on providing instruction.

CLASS researchers, in other words, have concluded that most teaching is mediocre or worse, with teachers doing most of the talking and students filling out work sheets.

The good news is that those interactions can be improved.

One of the strongest CLASS studies showed that middle- and high-school teachers who were randomly assigned to a yearlong, CLASS-based training program appeared to teach better the following year.

The 2011 study, published in the journal Science, found that the students of the CLASS-trained teachers scored higher on math, English, science and history exams than the students of teachers who received the district’s typical in-service training.

The difference was about 10 percentage points.

“We do actually know what works and it’s not just your personal style,” said Kellie Morrill, an administrator at Educare School of Greater Seattle, a highly rated program that has used CLASS for years.

One of Educare’s most veteran teachers, Judy Somerville, has seen decades of Ivory Tower fads come and go, with teachers rolling their eyes and muttering “this too shall pass.”

“CLASS isn’t one of those things,” Somerville said. “It gives us real concrete information about where to put our interest and our focus.”

Kellie Morrill, campus director for Educare of Greater Seattle, explains how the program incorporates CLASS and observation into teacher training and professional development initiatives. (Lauren Frohne / The Seattle Times)

Not a cure-all

In practice, this is how a CLASS evaluation works:

Observers like Jackson are trained in how to identify strong instruction so they can rate it accurately and consistently.

Then they go into classrooms, where they typically watch a teacher for one or two hours. They observe for 20 minutes at a time, then take about 10 minutes to rate the interactions they’ve noted before starting a new cycle.

They focus on everything teachers do and say, whether it’s asking kids questions about the block tower they’re building or simply giving an upset child a hug.

Jackson remembers one teacher who leaned up against a window sill with her arms crossed while the children played.

“My score sheet was almost blank because there just weren’t any words, any interactions,” Jackson said.

At Happy Tots in Renton, on the other hand, Jackson noted approvingly how Amin, the teacher, talked with her charges, three boys from ages 2 to 3, while they slurped Kix cereal.

Amin leaned in while one boy, Liam, whispered that he saw a big tree outside the window.

“What else?” she asked.

“A car,” Liam whispered.

“A car. Oooh. What kind of car do you see outside?” she asked.

That’s the kind of serve-and-return exchange that typically receives high marks.

Yet for all its strengths, CLASS is not a cure-all. Sometimes observers miss something, for example, or catch teachers on a bad day.

University of Washington professor Gail Joseph, who works closely with Pianta’s group and helped develop the state’s quality rating system for preschools and child-care centers, said she and her colleagues are still figuring out how best to use CLASS in Washington state.

Joseph and her colleagues are in the middle of a statewide study that is looking at CLASS in a variety of settings, including classrooms with children learning English, where CLASS evaluations may work better later in the year than in the beginning.

And University of Florida researchers say that CLASS doesn’t measure the kind of highly structured teaching that some students with disabilities need.

A large teacher-evaluation study funded by the Bill & Melinda Gates Foundation concluded that observational guides like CLASS work best when teachers are observed multiple times by multiple observers, which is costly.

Dan Goldhaber, a professor at UW Bothell who specializes in evaluating teacher performance, worries that the K-12 guides Washington school districts are using won’t be useful because they may not pick up differences in teaching quality.

Early results from the eight pilot districts in Washington state trying out the new guides showed that no teacher was rated unsatisfactory.

“Unless your judgment is that all teachers are equally successful, you’ve got to have your measure of teacher performance show differentiation,” Goldhaber said.

Struggling teachers were deliberately excluded from the pilot program, which researchers say skews the results.

But the initial ratings may also reflect principals’ inexperience using the new tools, said Stephen Fink, executive director of the UW Center for Educational Leadership, which developed one of the three systems that K-12 public schools must use.

“People say that teaching isn’t rocket science and I would argue that it’s actually more complicated than rocket science,” Fink said.

Funding at stake?

The stakes for using the tools correctly are high.

The federal government uses CLASS to rate Head Start preschools and programs, and the ones that score very low risk losing their contracts.

Washington state is considering restricting state subsidies for preschool and child care to programs that earn at least 3 out of 5 stars from the Early Achievers quality rating system, in which CLASS scores weigh heavily. So is the city of Seattle.


None of the three other systems that public-school districts are using has been tested as widely as CLASS. Districts mostly use one developed by former teacher and economist Charlotte Danielson and the one developed by Fink’s group at the UW.

But next school year, the U.S. Department of Education will put the UW’s tool to the test in 100 elementary schools across the country, with principals in half the schools randomly assigned to receive coaching from the UW’s Center for Educational Leadership and the other half serving as a control group.

Researchers want to know if the training causes improvements in student achievement.

Meanwhile, the Anacortes School District, which has worked closely with UW to test the UW evaluation system, has some anecdotal evidence that students are doing better, said Cindy Simonsen, director of learning and instruction.

The biggest immediate benefit, she said, is that principals and teachers are better able to talk about teaching using the guide as a common reference.

Principals always have been able to fire bad teachers, she said, but until now haven’t had much to say to a teacher who was satisfactory but wanted to do better.

“We have a lot of teachers who want to be above average,” Simonsen said.

And Amin at Happy Tots Childcare?

She says participating in the Early Achievers program, which provides free coaching to teachers preparing for the in-person CLASS evaluation, improved her teaching.

Last week, she received her rating — 3 out of 5 stars, which the state considers excellent, with room to grow.

Correction: The original version of this story, published Feb. 22, has been corrected. It incorrectly implied that University of Washington researcher Gail Joseph had reached a conclusion about the suitability of the CLASS evaluation tool to rate interactions between teachers and children learning English as a second language. Joseph and her colleagues are in the middle of a statewide study that is looking at CLASS in a variety of settings, including classrooms with children learning English.