Why VAMs are Unreliable Measures for Evaluating Teachers

“Teachers who are in schools where there is a strong academic climate, where the peer culture is supporting academics, where the parents are supporting academics, where the teachers work together and learn from each other have a vastly easier time of it than teachers whose circumstances are the converse. And we can’t rely simply on some regression equation to iron all that out.” ~ Edward Haertel, Ph.D.


The following is a transcript of a video interview with Edward Haertel, Ph.D., of Stanford University, one of America’s leading psychometricians. Prof. Haertel’s observations were shared by Diane Ravitch on her blog in 2013. The transcript is followed by links to lawsuits and further information on the widespread misuse of teacher evaluations.

“The first and most obvious flaw with value-added models for use as a tool for teacher evaluation, by which I mean applications where we’re making consequential decisions about individuals based on data derived from their students’ test scores, is the unreliability of these measures.

There is just way, way too much noise, and this is a data problem, it’s not an analysis problem. It’s not that some more sophisticated model or some different choice of covariates or some better way of scaling or something is going to fix it.

There simply is a fundamental limitation because teacher effects, by which I mean the differences between highly effective and ineffective teachers, simply don’t account for that much of the variation in students’ year-to-year test score gains. So you have to separate a weak signal from a lot of noise.

That’s the first problem. The next problem is that the models, in doing this, have to account for biases, systematic differences, not just random error, but the systematic error that tends to show up in the same direction, year after year, classroom after classroom, for a given teacher, either positive or negative.

Teachers who are in schools where there is a strong academic climate, where the peer culture is supporting academics, where the parents are supporting academics, where the teachers work together and learn from each other have a vastly easier time of it than teachers whose circumstances are the converse. And we can’t rely simply on some regression equation to iron all that out.

Thinking about the various things that go into a student’s test score, what determines how much students grow from one year to the next, what are the sources of variation in those gains: obviously the teacher is one important factor, and stronger teachers will bring about and encourage bigger gains than weaker teachers.

But in addition we have all of the out-of-school supports that some students enjoy, relative to others. We have the peer culture within the school, we have the academic learning climate. We have all the learning to learn that’s going on, to differing degrees for different students, in years past. So it’s not just one teacher this year who is responsible, it’s the whole instructional history and the whole out-of-school context, as well as the school context, that’s supporting the school learning during this time.

So, if it was just a matter of prior year differences, just a matter of the starting points where students began, that alone would be a challenge, because last year’s test scores only imperfectly measure that. But it’s far more than that, it’s all the things like students having different rates of learning, different trajectories this year because of those differences in support.” ~Prof. Edward Haertel, October 2013~
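The "weak signal, lot of noise" point can be illustrated with a minimal sketch. It is not Prof. Haertel's model or any district's actual VAM. It assumes each student's gain is simply a teacher effect plus independent random noise, and it asks how much of the variation in class-average gains would then reflect the teacher. The function names, the class size of 25, and the variance shares (1%, 5%, 10%) are hypothetical values chosen for illustration only. The sketch also covers only random noise; the systematic biases described above (school climate, peer culture, out-of-school support) would make matters worse and are not modeled here.

```python
import random


def expected_reliability(teacher_var, noise_var, class_size):
    """Share of variance in class-mean gains due to the teacher effect.

    Assumes gain = teacher effect + independent student-level noise
    (an illustrative simplification, not an actual VAM).
    """
    return teacher_var / (teacher_var + noise_var / class_size)


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_year_to_year_correlation(teacher_var, noise_var, class_size,
                                      n_teachers=2000, seed=0):
    """Correlate each simulated teacher's class-mean gain across two years.

    The same teacher effect is used in both years; only the students
    (the noise) change. All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    teacher_sd = teacher_var ** 0.5
    noise_sd = noise_var ** 0.5
    year1, year2 = [], []
    for _ in range(n_teachers):
        effect = rng.gauss(0.0, teacher_sd)
        for scores in (year1, year2):
            gains = [effect + rng.gauss(0.0, noise_sd)
                     for _ in range(class_size)]
            scores.append(sum(gains) / class_size)
    return pearson(year1, year2)


# Hypothetical shares of gain variance attributed to teachers.
for share in (0.01, 0.05, 0.10):
    r = expected_reliability(share, 1.0 - share, class_size=25)
    print(f"teacher share {share:.0%}: expected reliability {r:.2f}")
```

Under these assumptions, the smaller the teacher's share of the variation in gains, the less a single year's class average says about that teacher, and no choice of covariates changes that ratio. The simulated year-to-year correlation should land close to the value given by `expected_reliability` for the same inputs.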

Links & Resources:

* American Statistical Association: VAMs are Invalid & Unreliable Measures
* Teacher Evaluation Should Not Rest on Student Test Scores (FairTest Fact Sheets)
* Cuomo Plan Wrongly Pushes Test Obsession
* Diane Ravitch – Why VAM Should Not Be Used to Grade Teachers
* ASA Statement on Using Value-Added Models for Educational Assessment (full paper, PDF, American Statistical Association)
* Houston Federation of Teachers File Lawsuit Challenging Constitutionality of Value-Added Measure for Evaluations
* Rochester Teacher’s Lawsuit – State Failed to Account for Impact of Poverty in Evaluations
* Cuomo Wants Test Scores to Account for 50% of Teacher Evaluation (Diane Ravitch, Jan. 2015)
* Florida Teacher Lawsuit Could Spread to Other States
* Reliability and Validity of Inferences About Teachers Based on Student Test Scores by Edward H. Haertel (link to full report)
* Using Student Test Scores to Compare Teachers (Video of Lecture, Stanford University Graduate School of Education, October 2013)
* The Witch Hunt Against Teachers
* Facing Teacher Evaluation Deadline Charter Schools Just Say No

“Edward H. Haertel is one of the nation’s premier psychometricians. He is Jacks Family Professor of Education Emeritus at Stanford University. I had the pleasure of serving with him on the National Assessment Governing Board, after I joined the board in 1997. He is wise, thoughtful, and deliberate. He understands the appropriate use and misuse of standardized testing. He was invited by the Educational Testing Service to deliver the 14th William H. Angoff Memorial Lecture, which was presented at ETS on March 21, 2013 and at the National Press Club on March 22, 2013. This lecture should be read by every educator and policymaker in the United States.” ~Diane Ravitch, Nov. 2013

“In the real world of schooling, students are sorted by background and achievement through patterns of residential segregation, and they may also be grouped or tracked within schools. Ignoring this fact is likely to result in penalizing teachers of low-performing students and favoring teachers of high-performing students, just because the teachers of low-performing students cannot go as fast… Simply put, the net result of these peer effects is that VAM will not simply reward or penalize teachers according to how well or poorly they teach. They will also reward or penalize teachers according to which students they teach and which schools they teach in.” ~Edward Haertel, 14th William H. Angoff Memorial Lecture, March 2013

“Due to a faulty, incomprehensible and secret formula, good teachers like the ones filing this suit are being labeled failures and our entire education system is being reduced to a numbers game. Testing isn’t aligned with the purposes of public education. It doesn’t measure big-picture learning, critical thinking, resilience, creativity or curiosity, yet those are the qualities that great teaching brings out in a student. The fixation on testing has literally drained the joy out of learning. We’ve always been leery of value-added models, and we have enough evidence to make clear that not only has VAM not worked, it’s been really destructive and it in no way helps improve teaching and learning.” ~Randi Weingarten, American Federation of Teachers President, speaking about the Houston lawsuit


About Christopher Chase

Co-creator and Admin of the Facebook pages "Tao & Zen" "Art of Learning" & "Creative Systems Thinking." Majored in Studio Art at SUNY, Oneonta. Graduated in 1993 from the Child & Adolescent Development program at Stanford University's School of Education. Since 1994, have been teaching at Seinan Gakuin University, in Fukuoka, Japan.

