If you buy half a dozen medium-sized eggs anywhere in the country you’ll find each weighs between 53 and 63 grams, so it may come as a surprise that the standardised results from 11+ tests depend on who’s doing the standardising, and are about as consistent as a British seaside holiday. This matters because a key tenet of admissions law is that only children of grammar school standard may be admitted to grammar schools.
Just like the eggs, test results are normally distributed: measure a whole batch, plot the results on a graph, and you get the familiar bell curve. With both eggs and test results, most measurements cluster around the average (the mean), with fewer and fewer the further you move from it. As any GCSE Statistics student should be able to tell you, a standard score gives an individual’s position on that bell curve relative to the population. That word, population, is key. Both of the major 11+ test providers, CEM and GL, stress how important it is for the population to be as representative as possible, but when it comes to 11+ tests they don’t follow their own advice.
To calculate a standard score you need to know three things: the individual score, normally represented by x; the mean or average, represented by μ; and the standard deviation, represented by σ. They are combined using this simple formula:

z = (x − μ) / σ
This is best illustrated with a real example: an 11+ test consisting of 50 multiple-choice questions. Given the usual strategy of guessing unknown answers, the scores in this test ranged from 7 to 49, but with most candidates scoring near the middle of this range; the mean was μ = 28.04.
Standard deviation sounds complex but is just a measure of how much the scores vary. Think of it as the average amount by which the scores differ from the average. If for some reason everyone scored the same (perhaps there was only one candidate, or perhaps the questions were so easy everyone got them all right), the standard deviation would be zero. The standard deviation in the mass of eggs at your local Tesco might be a few grams, whilst for the icebergs in Baffin Bay it is more likely to be measured in tonnes, but working in standard deviations lets statisticians ask questions like, “Is this individual iceberg/egg particularly big/small when compared to the others in the population?” Note again the word population. In this 11+ test the standard deviation was σ = 9.60, so, putting that all together, if an individual correctly answered 36 questions their standard score (z) would be:

z = (36 − 28.04) / 9.60 ≈ 0.829
There is one last step, however, because this answer is expressed in standard deviations above or below the mean. Statisticians are happy working in standard deviations, but for every candidate who scored below average this would be a negative number, so psychometric tests are, by convention, rescaled so that the mean is 100 and each standard deviation is worth 15 points. In this example the final standardised score is therefore:

100 + (15 × 0.829) ≈ 112.4
This is then rounded to the nearest integer: 112.
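The whole calculation can be sketched in a few lines of Python, using the mean and standard deviation from the example above (the function name is my own, for illustration):

```python
def standardised_score(raw, mean=28.04, sd=9.60):
    """Convert a raw mark to a conventional standardised score:
    the population mean is rescaled to 100 and one standard
    deviation to 15 points, then rounded to the nearest integer."""
    z = (raw - mean) / sd          # position in standard deviations
    return round(100 + 15 * z)     # rescale and round

print(standardised_score(36))      # prints 112, as in the worked example
```

Note that the result depends entirely on the mean and standard deviation supplied, which is exactly why the choice of reference population matters so much.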
The useful thing about normal distributions is that if you want to select just the largest quarter of eggs or icebergs from a population, choosing those above +2/3σ (110 in 11+ terms) would give you almost exactly the top quarter (25.25%, to be precise). But the maths itself lays down no rules about choosing the reference population. One grammar school even standardised the results of just six candidates who sat a late test against each other! Because this process picks a proportion regardless of the absolute values, two of those candidates were found to be of grammar school standard even though they correctly answered about 20 fewer questions than the main cohort.
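That top-quarter figure can be verified directly with Python’s standard library, which models a normal distribution with the conventional mean of 100 and standard deviation of 15:

```python
from statistics import NormalDist

# Standardised scores: mean 100, standard deviation 15 by convention.
scores = NormalDist(mu=100, sigma=15)

# Proportion of the population scoring above a cut-off of 110,
# i.e. two-thirds of a standard deviation above the mean.
proportion_above = 1 - scores.cdf(110)
print(f"{proportion_above:.2%}")   # prints 25.25%
```

The same two lines answer the question for any cut-off, which is what makes a genuinely population-wide standardisation so informative, and a six-candidate “population” so meaningless.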
So why can’t parents be given the actual scores?
Two years ago I asked CEM for anonymised raw 11+ marks. They refused, saying that releasing this information would enable tutors to undermine their tutor-proof tests. I referred this to the Information Commissioner along with evidence from Bucks showing that, if anything, CEM tests were actually less tutor-proof. He dismissed the Bucks data as irrelevant, saying it was sufficient for CEM to demonstrate that they were profiting from the claim; they did not need to prove it was valid. CEM told the Commissioner they were earning in excess of £1m per year from tests sold on the basis that they were tutor-proof.
I appealed to the First Tier Tribunal which was split over the issue. The two lay members agreed with the Information Commissioner although Judge Hamilton, for the minority, “strongly doubted that the exemption was engaged and even if it was [said I] had provided sufficient material and evidence for the public interest to tilt in favour of disclosure.”
I’d asked for the test scores of the six grammar schools in Reading and Slough from 2014 because there is a widespread feeling that grammar school standard is higher in Reading than in Slough, but this can’t be confirmed without the raw test scores. Slough is close enough to Reading for parents to consider sending children to the Reading grammars as a first choice whilst having the Slough grammars, which reserve places for local children, as a fall-back. Reading’s super-selective grammars give no preference to local families and standardise applicants’ scores against the population of those who apply. In trying to explain this to the Schools Adjudicator, Kendrick School in Reading told him their scores were “locally standardised”: surely the shortest oxymoron in the English language!
The court’s decision to put CEM’s commercial interests before transparency in how grammar school places are allocated has far-reaching consequences. Releasing this information would have paved the way for further requests that would have exposed the current secretive process. Given the raw scores, even parents who can’t follow how standardised scores are calculated would want to ask why a child scoring less than their own is considered of grammar school standard when their child isn’t.
In their ruling the court repeatedly argued that releasing the information I requested was not in the public interest because I had not requested more of it. But 2014 was the first year that the Reading and Slough grammars used CEM’s new tutor-proof test, which creates a bizarre paradox: rather like Lewis Carroll’s White Queen offering Alice jam yesterday and jam tomorrow, it’s not possible to ask for more information until you have some in the first place.
In summary, a standardised score compares an individual to a population, but as long as schools are comparing candidates against different populations, calling the results standardised is positively misleading. It is, however, very convenient for schools that want to select the highest-attaining pupils whilst ensuring the public remain ignorant of just how selective they actually are.