Why should barley inform our analysis of test data?

At the end of the month, DESE will release results of last year’s MAP test to the public. Folks who see themselves as logical, bottom-line thinkers will try to make judgements based on the data. Which school, teacher, or method is most effective? Some of this will even be done by educators, under the banner of “data-driven decision making.” Very few of these judgements should be taken seriously. The simplistic thinking shows in the equation most folks will use.

Proficiency rate of school/teacher/method A > = < proficiency rate of school/teacher/method B

Anyone with a marginal background in statistical analysis will see flaws here. There’s no accounting for sample sizes or variables which might impact learning.

Consider William Sealy Gosset, who introduced the t-test in 1908. Gosset was serious about statistical analysis. After all, he was involved in important work: brewing beer for Guinness. He needed to make decisions about the quality of Barley A compared to Barley B. Because he took statistical analysis seriously, he wrote an equation that factored in sample sizes and accounted for accidental variations, while assuming variables were held mostly constant. You may not be familiar with the t-test, but it is widely used in finance, science, and academic research. The formula is a bit more sophisticated than A > = < B.

$$ t = \frac{m - \mu}{s / \sqrt{n}} $$

Gosset's t-test formula

Gosset realized some variance would naturally occur, so he took the exercise a step further. He set thresholds to determine whether the findings were significant, not incidental. Nowadays we declare victory or defeat over any difference in test scores, however small.
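To make the formula and the threshold idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the ten scores and the benchmark of 70 are made up (they are not MAP data), and the function name `t_statistic` is hypothetical. In the formula, m is the sample mean, μ is the value we compare against, s is the sample standard deviation, and n is the sample size. The cutoff of 2.262 is the standard two-sided 5% critical value for 9 degrees of freedom.

```python
import math
import statistics


def t_statistic(sample, mu):
    """One-sample t statistic: (m - mu) / (s / sqrt(n))."""
    n = len(sample)
    m = statistics.mean(sample)    # sample mean
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    return (m - mu) / (s / math.sqrt(n))


# Hypothetical scores for a class of 10; not real test data.
scores = [72, 68, 75, 70, 65, 80, 74, 69, 71, 76]
reference_mean = 70  # hypothetical benchmark to compare against

t = t_statistic(scores, reference_mean)

# Two-sided 5% critical value of the t distribution with n - 1 = 9 degrees of freedom.
CRITICAL_T_DF9 = 2.262
significant = abs(t) > CRITICAL_T_DF9

print(f"t = {t:.2f}, significant at the 5% level: {significant}")
```

In this made-up example the class averages 72, two points above the benchmark, yet t comes out to roughly 1.45, which is below the cutoff. With only ten students, a two-point gap is the kind of difference Gosset would have treated as possibly incidental.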

The purpose of state testing is not to compare teachers, districts, or methods. The purpose of state testing is to compare a student’s performance on a test to accepted criteria that demonstrate proficiency or mastery. One problem Missouri educators face is that the legislature has changed the standards for evaluation multiple times, and DESE has changed test formats and the criteria for mastery. As a result, the data returned to parents and educators has no value for comparative analysis. To be honest, the purpose of state testing is primarily to comply with federal legislation that drives funding.

Pretend we sought to compare schools, teachers, or methods through rigorous statistical methodology. Pretend we took the task as seriously as William Gosset took beer. Let’s say we sat down with Gosset and discussed our plan over a cup of coffee, or something.

Us: We want to use statistics to determine which schools, teachers, and methods work best.

Gosset: Tell me more.

Us: Well, we’re going to give a test and see which group has the most kids who pass.

Gosset: How do you plan to account for sample size variances?

Us: Not going to. We’ll treat the pass rate of a 4th grade class of 10 the same as the pass rate for all 4,000 4th graders in a large district.

Gosset: I see. How will you hold variables constant and account for variances that might naturally occur by accident?

Us: We’re not doing anything about that either, though we admit it’s a limitation.

Gosset: I see. Are you at least using the same instrument for measurements?

Us: Not really. We’ve changed the test instrument every year, but it’s sort of the same.

Gosset: Yeah, I don’t think this will produce valuable data.

Us: Really, I think we’ll use it for scathing editorials in the paper, bullet points in faculty meetings, and for a Great School Rating which helps determine the market value of real estate on Zillow.

Gosset: All of those are terrible applications of what you are suggesting. I really must be going.
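Gosset's first question, about sample size, can be illustrated with a short Python sketch. It assumes a hypothetical 60% pass rate in both the class of 10 and the district of 4,000, and it uses the Wilson score interval, a standard way to put a 95% range around a proportion. The choice of interval and the function name `wilson_interval` are my assumptions for illustration, not anything DESE publishes.

```python
import math


def wilson_interval(passed, n, z=1.96):
    """Approximate 95% Wilson score interval for a pass rate of passed / n."""
    p = passed / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: the same 60% pass rate in two very different groups.
small_low, small_high = wilson_interval(6, 10)        # class of 10
large_low, large_high = wilson_interval(2400, 4000)   # district of 4,000

small_width = small_high - small_low
large_width = large_high - large_low

print(f"Class of 10:       {small_low:.0%} to {small_high:.0%}")
print(f"District of 4,000: {large_low:.0%} to {large_high:.0%}")
```

Under these assumptions, the plausible range for the class of 10 spans more than 50 percentage points, while the range for the district spans about 3. The same headline number carries very different weight depending on how many students stand behind it.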

I'm not out to write a takedown of standardized testing. Educators know results are important to parents and policy makers. We all want our students to excel on the tests. Test results can help educators, working collaboratively in settings that respect the voices of teachers, triangulate data to identify trends, strengths, and weaknesses. I wouldn't advocate throwing out testing completely, but we do need to throw out the hyperbole in which armchair statisticians occasionally indulge. There are limits to the usefulness of test data, and the example of Gosset, the brewmaster, helps us see them.

mrtlentz blogs
