
Performance Index

Do you know people who do the minimum?  Someone who stops at the lowest expectation and offers nothing more?  In academics (and the real world), this is a recipe for long-term underachievement.  "Just passing the test" doesn't lead to enthusiasm for learning.  What we really want is for students to develop the capability to be independent, lifelong learners.

A performance index is designed to reward efforts at all levels by all students.  The simplest form of a performance index is a mean score:

(Sum of scores) / (number of students)

We see that every student's score influences the mean.  If a student does well, the mean score increases.  If a student does poorly, the mean decreases.  The mean score gives an overall perspective of everyone's achievement.  It's a rough benchmark of whether a school is doing well or needs improvement.
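
As a tiny illustration, here's the mean-score index in Python (the scores are made up):

def mean_score(scores):
    # Every student's score influences the result.
    return sum(scores) / len(scores)

print(mean_score([85, 70, 55]))  # 70.0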

The current Ohio Department of Education performance index is a modified mean score, where each individual score is converted to one of seven categories.


Thought Exercise

Here, we'll see that suggestions for improvement depend on the distribution.  Below are four distributions, all with the same mean score (70).  What would you advise each district?

One more question.  Should the performance index penalize school districts for low-achieving students, and if so, by how much?

 

Would you assign these districts the same performance index score or different performance index scores?  Scroll to the bottom to see the scores assigned by the current ODE formula.

Recommendations

1. Show the score distribution.

2. Allow below-grade-level testing (in addition to the already-allowed above-grade-level testing).

3. Clearly show the size of the low-performance penalty.

4. Extra credit: Use an exponential function to retain precision and flexibility.

The current ODE formula is:

Sum of ((students per category / number of students) * points per category)

There are seven categories of achievement: Limited through Advanced, Advanced Plus (we'll discuss this), and Test Not Taken.  Points are awarded for each category according to a weighting strategy, which allows giving a different level of "reward" to each category.  In other words, students do not receive the same amount of "reward" for each increment of achievement.  The effect of the chosen weighting strategy is to penalize schools for low achievers: there's a large drop-off between Proficient (weight 1.0) and the next level down, Basic (weight 0.6).
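
Here's a minimal sketch of that formula in Python.  The weights for Limited through Advanced are the ones discussed in this post; the Advanced Plus weight and the district counts are assumptions, purely for illustration:

WEIGHTS = {
    "Test Not Taken": 0.0,  # untested students are assigned a 0 (see below)
    "Limited": 0.3,
    "Basic": 0.6,
    "Proficient": 1.0,
    "Accelerated": 1.1,
    "Advanced": 1.2,
    "Advanced Plus": 1.3,  # assumed weight; this post doesn't state it
}

def performance_index(counts):
    # Sum of ((students per category / number of students) * points per category)
    total = sum(counts.values())
    return sum((n / total) * WEIGHTS[cat] for cat, n in counts.items())

# Hypothetical district: dropping from Proficient (1.0) to Basic (0.6)
# costs four times as much as dropping from Accelerated (1.1) to Proficient.
print(performance_index({"Basic": 10, "Proficient": 60, "Accelerated": 30}))  # 0.99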

 

The documentation does not state the cutoff scores for each category, but from proposed House Bill 181, section 3301.0710, we can guess that the categories come in increments of 20.  Letter grades are assigned by comparing the modified mean to a maximum score, again weighted to penalize schools for low achievers.


Hmm.  We definitely want to look at how many students fit into each achievement category.  Are most performing well?  Are some being left behind?  It's impossible to discern this from a single number.  We can see that we're looking for...

The score distribution.  A distribution shows how many students fall into each category.  Just what we wanted to know!

Immediately a question arises.  What about students performing above grade level?  The ODE has a plan!  Students taking above-level tests are bumped up into the next bracket, with the "Advanced Plus" category covering above-level "Advanced" scores.

A second question arises.  What about students performing below grade level?  Well, they could take the at-grade-level test.  They'll probably perform horribly, be terribly upset, and mathematically come in around 0-25% (due to multiple-choice guessing).  Unfortunately, a score that low tells us nothing about what the student actually knows, which matters tremendously once we get to growth metrics.  Can we do better?

Thought Exercise Scores

All of these distributions receive an A.  Surprised?


We're going to assume brackets in 20-point score increments rather than the 20-percentile brackets of HB 181; presumably the percentile version is a proposed change from points.  So we have the brackets:

Limited: 0 to 20 points, 0.3 multiplier

Basic: >20 to 40 points, 0.6 multiplier

Proficient: >40 to 60 points, 1.0 multiplier

Accelerated: >60 to 80 points, 1.1 multiplier

Advanced: >80 to 100 points, 1.2 multiplier

There are no Advanced Plus scores in this example.  The total is divided by the 120-point maximum.
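
Under these assumptions, the whole calculation fits in a few lines of Python.  Dividing the weighted mean by its 1.2 maximum is the same as dividing the 120-point total by 120; the sample scores below are hypothetical:

def multiplier(score):
    if score <= 20:  return 0.3  # Limited
    if score <= 40:  return 0.6  # Basic
    if score <= 60:  return 1.0  # Proficient
    if score <= 80:  return 1.1  # Accelerated
    return 1.2                   # Advanced

def index_percent(scores):
    # Weighted mean of multipliers, expressed as a percent of the 1.2 maximum.
    weighted_mean = sum(multiplier(s) for s in scores) / len(scores)
    return 100 * weighted_mean / 1.2

print(round(index_percent([95, 85, 75, 65, 55]), 2))  # 93.33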


Scores are: Expected (91.92%), Everyone's Different (90.42%), Most Low (90.92%), Most High (90.58%).

 

This reinforces the idea that looking at the score distribution is extremely helpful for improvement advice, since the performance index scores are similar despite large differences in the underlying distributions.


It's worth noting that untested students can significantly drag down results, because they are assigned a score of 0.  There were no untested students in this example.
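
To see how much a single untested student matters, here's a hypothetical class of 20 Proficient students, with and without one Test Not Taken entry (multiplier 0):

def index_percent(multipliers):
    return 100 * (sum(multipliers) / len(multipliers)) / 1.2

tested = [1.0] * 20                            # 20 Proficient students
print(round(index_percent(tested), 2))         # 83.33
print(round(index_percent(tested + [0.0]), 2)) # 79.37 with one untested student

One untested student out of 21 costs this hypothetical class almost 4 points.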


Want to try out some different scenarios?  Here's the sample data.

Back to the earlier question: can we do better?  Yes.  We could allow students to take below-grade-level tests.  For achievement, we'll probably still have to assign a 0-25% score, since that's an accurate portrait of the student's grade-level knowledge, so the performance index score won't change much.  The growth score, however, will see a huge benefit: if we can accurately assess where a student is today, the growth score will reward the schools that help students the most.  That's a topic for another day.


Mathematically, there's a second consequence of categorization.  Categorizing scores before computing the mean lowers its precision, since we're throwing away information.  It's like measuring how high you climbed in stair-steps instead of inches.  We'll talk about how to retain precision (hint: an exponential function) in a future post.
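
Here's a quick illustration of that stair-step effect, using the same assumed brackets and made-up scores:

def multiplier(score):
    if score <= 20: return 0.3
    if score <= 40: return 0.6
    if score <= 60: return 1.0
    if score <= 80: return 1.1
    return 1.2

barely_proficient  = [41, 42, 43]  # raw mean 42.0
nearly_accelerated = [58, 59, 60]  # raw mean 59.0
for scores in (barely_proficient, nearly_accelerated):
    index = sum(multiplier(s) for s in scores) / len(scores)
    print(sum(scores) / len(scores), index)
# The raw means differ by 17 points, yet both groups index to exactly 1.0:
# the categorization has discarded the difference.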
