Year 6 Sample Test data – another update

As people continue to share their data via the spreadsheets, I thought it was about time to do one final update of summarised data based on the most recent shared results.

As we’re so close to the tests now, I have looked only at the data collected from tests taken since February half-term. That keeps it as a good sample size of 3500-4000 pupils in each case, while discounting the much older results. I don’t have enough data yet from Summer 1 alone to make much of it.

Key Stage 2 DfE Sample tests

Subject | Mean Average Score | Median Score | Interquartile range
Reading | 30 marks | 31 marks | 24 – 37 marks
Grammar, Punctuation & Spelling | 36 marks | 37 marks | 28 – 46 marks
Mathematics | 64 marks | 65 marks | 46 – 85 marks
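For anyone wanting to produce the same three summary figures for their own class, here is a minimal sketch in Python. It is not the method used to build the table above, which comes from the shared spreadsheets. The function name `summarise` and the list of marks are invented for illustration only, and quartiles can be calculated in slightly different ways, so a spreadsheet may give marginally different bounds.

```python
from statistics import mean, median, quantiles


def summarise(marks):
    """Return the mean, median and interquartile range bounds for a list of marks."""
    # quantiles(..., n=4) returns the three quartile cut points; the
    # "inclusive" method treats the list as the whole group of interest.
    lower_quartile, _, upper_quartile = quantiles(marks, n=4, method="inclusive")
    return {
        "mean": mean(marks),
        "median": median(marks),
        "iqr": (lower_quartile, upper_quartile),
    }


# Hypothetical marks for eleven imaginary pupils (NOT from the collected data).
example_marks = [12, 18, 22, 25, 27, 30, 33, 35, 38, 41, 45]

summary = summarise(example_marks)
print(summary)
```

The interquartile range is reported as a pair of bounds, in the same way as the "24 – 37 marks" style used in the table: half of the pupils in the sample scored between those two values.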

The shifts are not that significant in reading and GPS (up by a mark each for the averages), but the maths average seems to have risen quite a bit (it was 59 marks just a month ago).

If you’ve recently used the tests in your school, please do add your data to the collection by entering it into the spreadsheet here: Year 6 Sample Test data collection

You can access a spreadsheet to present your own school’s results in a comparison table and graphs alongside all of the national data collected so far (since January). Access the comparison spreadsheet here: Key Stage 2 Comparisons

The problems with the Interim Assessment Framework

Earlier this evening I posted a poll on Twitter:

The high “yes” response from primary teachers was little surprise to me, despite my having spent a good amount of time over the last couple of years trying to persuade people otherwise.

One thing that sets us apart from secondary colleagues, I think, is the introduction of the Interim Assessment Frameworks for Year 2 and Year 6. Where teachers might otherwise have felt excited about the prospect of moving away from levels, these frameworks have undone any positivity about the shift, as they present all the age-old issues with levels and then some more on top.

I have written briefly about this in the past, but let me set it out in more detail:


One of the issues with levels was that children of very different capabilities were categorised into a single group of “level 3”, or some such band. As a result, there was a huge focus on pupils near a threshold, a rush to push pupils through the minimal content required to move into the next band, and a wide variety of information lost as it was translated into a single number.
All of these problems are replicated when you look at the new “Working Towards”, “Working At” and “Greater Depth” bands that replace them.


Daisy Christodoulou has written clearly on the problems of adverbs. While there are few adverbs in the new frameworks, we still have statements to judge such as whether pupils read “age-appropriate books”, or use “noun phrases effectively”. Though the exemplification materials try to illustrate some of these points, the subjectivity remains a significant issue.


One of the most challenging aspects of the new approach is the measure of ‘independence’. Having spent years hearing from secondary colleagues and the department what a nightmare Controlled Assessment was, we now have a virtually identical problem forced upon us at KS2, including the freedom for children to re-write work “after discussion with the teacher”. What nonsense that children should be able to spell “most words correctly (year 5 and 6)” – whatever that might mean – but that at the same time they are permitted to use dictionaries and word banks in their efforts to do so.


All of these issues are bad enough, but when added to the content it gets ridiculous. Many will be aware of the nonsense of expecting seven-year-olds to write like Enid Blyton characters when using exclamatory sentences. Cue teachers up and down the country recounting Little Red Riding Hood at length.
The situation is replicated at Key Stage 2, where teachers nationwide are now finding ways to force barely competent writers to use hyphens and dashes in their work to provide evidence of meeting the expected standard.
To make matters worse, the Writing criteria at both stages barely mention the need for any purposeful composition or understanding of audience & purpose. They might as well have stuck with the results from the grammar test.
[Note that my criticism is not about the raising of expectations, but the narrowness of them]


If a teacher did exactly as directed by the frameworks, and checked every child’s work for evidence of each statement, they would find themselves making well over 1500 judgements per class. It simply isn’t manageable. And while in the past teachers could grasp a sense of “levelness” of work, now the need to ensure that every 11-year-old has used at least one hyphen in their writing has led to a burdensome approach.


One of the key aims of any assessment at the end of primary school ought to be to provide useful information to secondary colleagues. Yet the bandings used do little to support this. A pupil who can’t yet explain how fossils provide evidence for evolution would be deemed “not meeting the expected standard”. Little regard is given to their other wide-ranging knowledge that secondary teachers might draw upon.


We were repeatedly told that the old system of levels didn’t help parents to understand their children’s learning and needs. But the new framework makes that problem worse. After all, save for a few avid primary colleagues, who could reasonably put these descriptors in order, let alone know what each entails?


Could you put these in ascending order of ability?


Perhaps the most significant issue to explain the poll results is the pace at which change has been rushed through. Not just for teachers, but clearly for the department too. Deadlines have repeatedly been pushed back, and it is unacceptable for exemplification materials to come out just weeks before final judgements are due.
The shift between systems mid-key stage has also exacerbated problems. Changing the expected standard when pupils were half-way through the Key Stage has been unhelpful and unfair. It has undoubtedly led to many teachers wishing no change had been made at all.

What should have happened?

Changing a whole assessment system is big work. To be done well it needs time and thought. If we were going to be stuck with an interim system, then better to have kept a development of levels in the interim. It would have been perfectly possible to add some pre-requisites to the existing level descriptors as a transition document while the finer detail was worked out. But better still, a roll-out of the new curriculum that saw the new tests introduced at the end of a full key-stage would have been much fairer and much easier to organise.

Now, as it stands, there is more damage to undo, and the poll at the top illustrates it only too well.