Value-Added: A Tale of Two Methods

We've seen in our Growth segment that value-added measures the difference between actual and expected growth.  Take a look if you haven't yet!


The error in a value-added score depends on the accuracy of the growth expectation.  There are two high-level approaches to computing expected growth: set an independent benchmark and adjust for influencing factors, or estimate from prior performance.


A gain-score method starts with an independent benchmark and explicitly adjusts for factors such as poverty.  The SAS EVAAS method used by the ODE is a mixed method that assumes, in part, that a student will grow much as he or she did in previous years.

The primary goal is to reduce the amount of error due to various un-modeled causes.  In other words, don't blame the teacher for factors outside of his or her control. 


There's a second, overlooked goal.  Can you guess what it is?  Turns out the prior-performance method is inherently unfair with respect to this goal.

Thought Exercise

Let's investigate these two approaches using a case study: Running! 


Running has enjoyed a surge of popularity in the past decade.  Ads for sneakers, fancy pants, and gourmet energy bars are everywhere.  Neighbors jog by on a regular basis and 5ks are more popular than new crayons in kindergarten.   


And yet, I admit:  I am not a runner.  I love '90s fitness:  step aerobics and rollerblading.  Yeah, remember those?  Good times!  So, my running performance is... slow.  My neighbor Jill, on the other hand, is out there jogging like a champion, rocking to tunes and tracking everything with her smartwatch.

Could I be her?  Let's find out with a value-added experiment.  Imagine we both hire personal trainers and want to judge their effectiveness.  We train each week and measure our mile time at the end of the week.  Here are the results.


Well, I didn't improve much.  Let's face it: I really didn't try.  In comparison, Jill improved substantially.

Now let's calculate the value-added scores of our trainers using estimates based on prior performance, similar to the ODE's current method.  (A small sketch after the grading scale shows the arithmetic.)  We'll use:

expected growth = sum of previous growth / number of observations

value-added = actual growth - expected growth


Let's assign grades of

A: +5 and higher,

B: < +5 to 0,

C: < 0 to -5,

D: < -5 to -10,

F:  < -10.
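
Here's the whole calculation as a minimal Python sketch, assuming growth is measured in seconds of mile-time improvement per week (as in the 15 sec/week figure later in this post).  The function names are mine, not anything from the EVAAS software:

def expected_growth(prior_growth):
    # Prior-performance expectation: the average of all previous growth.
    return sum(prior_growth) / len(prior_growth)

def value_added(actual_growth, expected):
    # Value-added: actual growth minus expected growth.
    return actual_growth - expected

def grade(va):
    # Letter grade from the cutoffs above (sec/week of improvement).
    if va >= 5:
        return "A"
    if va >= 0:
        return "B"
    if va >= -5:
        return "C"
    if va >= -10:
        return "D"
    return "F"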


Here are the results for my trainer:


And here are the results for Jill's trainer:

Whoa!  My trainer aced value-added, and Jill's looks like a flop?  How can that be when my growth was awful?  This reveals that second, overlooked goal:

Value-added must give teachers enough credit.  The prior-performance method doesn't.
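
Continuing the sketch above, the effect is easy to reproduce.  These weekly numbers are hypothetical stand-ins for the charts, not the actual data:

# Hypothetical weekly mile-time improvements, in sec/week.
my_growth   = [0, 1, 0, 6]      # barely trying
jill_growth = [30, 25, 20, 10]  # big gains that naturally taper off

# Score the last week against the average of the earlier weeks.
my_va   = value_added(my_growth[-1],   expected_growth(my_growth[:-1]))
jill_va = value_added(jill_growth[-1], expected_growth(jill_growth[:-1]))

print(grade(my_va))    # "A": +6 against an expectation of about 0.3
print(grade(jill_va))  # "F": +10 against an expectation of 25

Jill's large early gains set her expectation so high that a solid week of improvement grades as a failure, while my near-zero history makes any scrap of progress look heroic.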


Clearly, Jill's trainer deserves some credit for her excellent improvement.  And clearly, my performance could have been better.  There's no good reason for my pitiful progress.  This points to an additional conclusion:


The prior-performance method always sets low expectations for low performers.


This is a tragedy.  The prior-performance method lowers the bar for kids with low growth in the past, without requiring any reason whatsoever.  Intuitively, it postulates that low performers are inherently "dumb" and always will be, or they have so many learning gaps they'll never catch up.  That's unjust.


Because of these two outcomes, the prior-performance method breaks down when a student has all excellent teachers, or all poor teachers, throughout the school years.  Consistent growth, high or low, matches its own average, so the method gives a value-added score of 0 in both cases.
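
You can see this directly in the sketch.  Again with made-up numbers:

# Steady growth at any level produces zero value-added.
steady_high = [30, 30, 30, 30]  # excellent teaching every year
steady_low  = [2, 2, 2, 2]      # poor teaching every year

print(value_added(steady_high[-1], expected_growth(steady_high[:-1])))  # 0.0
print(value_added(steady_low[-1],  expected_growth(steady_low[:-1])))   # 0.0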


We don't want the soft bigotry of low expectations ingrained in our evaluation system.  Let's try again using the gain-score method.  Turns out, setting an independent benchmark addresses both of these problems.


For the running example, what could influence the expected growth?  Perhaps factors like age and body mass index.  Expected growth will also depend on current performance for fast runners.  (If you're running a 4:30 mile, you're not really expected to improve to a 4:00 mile in a week.  Unless you're Usain Bolt.)


expected growth = function of (age, BMI, current performance)


Turns out, Jill and I are the same age and BMI.  And we're not so fast that we're hitting the upper limits of human speed.  Therefore, we have the same expected growth.  Let's say it's 15 sec/week.

Look at the value-added scores now.  They show that, given the external factors, Jill's trainer performed well and my trainer lagged.  That's pointing us in the right direction.
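
In the sketch, only the expectation changes; it now comes from an independent benchmark instead of each runner's own history.  The benchmark function below is a made-up illustration (including the age and BMI values), not a real model:

def benchmark_growth(age, bmi, current_mile_sec):
    # Made-up benchmark: runners of our age and BMI are expected to
    # improve 15 sec/week, with less headroom near elite pace.
    if current_mile_sec <= 280:  # roughly a 4:40 mile or faster
        return 5
    return 15

# Same expectation for both of us, since our age, BMI, and pace are comparable.
expected = benchmark_growth(age=40, bmi=24, current_mile_sec=600)  # 10:00 mile

print(grade(value_added(6,  expected)))  # "D": my trainer lagged
print(grade(value_added(25, expected)))  # "A": Jill's trainer performed well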


So there you have it:  Gain-score for the win.  Specifically, I'm advocating a random effects model as discussed in this survey of value-added models. 


Now, this post has been kind of down on Dr. Sanders, the developer of EVAAS, and I'd like to step back and thank him for his efforts to create a fair evaluation method.  He was truly trying to get the most accurate estimate of expected growth.


Unfortunately, the most accurate estimate turned out to be previous growth, which is bad news.  Can you figure out why?  It's OK for students growing on track.  Their growth rate should be around one year each year.  But it means our struggling students aren't catching up.  An average value-added score for them, under the prior-performance method, only means they've cleared an unacceptably low bar, year... after year... after year.  It's accurate because we're consistently failing them.


How can we learn to do better?  Tune in to our next post!

Recommendations

1. Replace the prior-performance method with a gain-score method (an independent benchmark with adjustment factors).
