
For those of us on the front line of Key Stage 2 testing reform – children, parents and teachers – the drama is not yet over. We are in the lull between the tests and the results. But we should be worried. The results are going to be problematic.
Previously, children received a level score – most children received a 3, 4 or 5. Two similarly attaining children in the same class might not get the same level, but they might be consoled that “it was probably really close.”
We can’t say that this summer.
This year’s SATs results will be presented as a score, with 100 being the “expected standard.” If two children in a class score 92 and 108, it will be difficult to say, “It was really close.” But in terms of attainment, these children could be identical.
Testing isn’t 100% reliable: a reliable test would give approximately the same score to the same child over repeated testing. However, even a reliable test won’t give the same child exactly the same score.
Examiners measure test reliability on a 0-1 scale – 0 is totally unreliable (e.g. tossing a coin to see who is the best at something), and 1 is totally reliable. A test that scores 0.8 or above is generally considered reliable.
More useful to teachers and parents is the ‘confidence band’ of a test result. If a child gets a score of 100, how confident are you that the child isn’t really 95 or 105? In a reliable test, you can be 90% sure that the score your child received is within plus or minus 8 points of his or her true attainment. That means if your child scored 100, you can be 90% sure that he or she is really somewhere between 92 and 108.
Let’s assume the SATs tests were very reliable. On July 7th your child will get a score for reading and maths. 100 is the score defined by the government to be the expected score for the end of year 6. Your child may get higher or lower. If your child gets a score of 92, he could really be attaining 100. Or 84. If your child gets 108, she could really be attaining 100. Or 116.
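For anyone who wants to check the arithmetic, here is a minimal Python sketch of the confidence band described above. It assumes the plus or minus 8 figure applies equally at every score, which is a simplification, and the function names are made up for illustration rather than taken from any official tool.

```python
# Assumption: the 90% confidence band is plus or minus 8 at every score.
HALF_WIDTH = 8


def confidence_band(score, half_width=HALF_WIDTH):
    """Return the (lowest, highest) attainment consistent with a reported score."""
    return (score - half_width, score + half_width)


def bands_overlap(score_a, score_b, half_width=HALF_WIDTH):
    """True if two reported scores could plausibly come from the same attainment."""
    low_a, high_a = confidence_band(score_a, half_width)
    low_b, high_b = confidence_band(score_b, half_width)
    return low_a <= high_b and low_b <= high_a


print(confidence_band(92))     # (84, 100)
print(confidence_band(108))    # (100, 116)
print(bands_overlap(92, 108))  # True
```

The two bands meet at 100, which is why a child scoring 92 and a child scoring 108 could, on this reading, be attaining at the same level.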
The only thing your child is likely to do with their score this summer is to compare it to their friends’ scores. I cannot advise you on the best way to deal with this. Perhaps parents and teachers can explain that the confidence band is so wide that you really don’t know who is the better reader. If they understand you, they should be getting a pretty high score in maths.
The real purpose of the SATs is not to grade individual children (the tests cannot do that), but to grade primary schools. Nick Gibb acknowledges this (see image). Statistically, this is reasonable. The number of children randomly achieving above 100 is likely to be balanced by the number of children randomly achieving below.
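The balancing-out effect can be illustrated with a rough simulation. This is a sketch under stated assumptions, not a model of the real tests: it assumes measurement errors are independent and normally distributed, that the plus or minus 8 band corresponds to a 90% interval (multiplier 1.645), and that every child in a hypothetical cohort of 60 is truly attaining 100. Real cohorts vary, and all names here are invented for the example.

```python
import math
import random
import statistics

HALF_WIDTH_90 = 8   # plus or minus 8 for a single child, as above
Z_90 = 1.645        # assumption: normal errors, two-sided 90% interval
ERROR_SD = HALF_WIDTH_90 / Z_90  # implied spread of one child's random error


def cohort_half_width(n_children, half_width=HALF_WIDTH_90):
    """90% band for the average of n independent scores."""
    return half_width / math.sqrt(n_children)


def simulate_cohort(n_children, true_score=100, seed=0):
    """Reported scores for children who all truly attain `true_score`."""
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, ERROR_SD) for _ in range(n_children)]


scores = simulate_cohort(60)
print("lowest and highest individual score:", round(min(scores)), round(max(scores)))
print("cohort average:", round(statistics.mean(scores), 1))
print("band for the cohort average: +/-", round(cohort_half_width(60), 2))
```

Under these assumptions the band for a cohort of 60 shrinks from plus or minus 8 to roughly plus or minus 1, because it narrows with the square root of the number of children. The individual scores still scatter widely even though every simulated child is identical.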
However, the precision presented by the new score system is likely to prove damaging to individual children. I hope secondary schools are wary about setting using the scores. I hope parents will be cautious about sharing the scores with their children. I won’t be telling my daughter her score – I’d rather not know myself.