Category Archives: assessment

KS2 Writing: Moderated & Unmoderated Results

After the chaos of last year’s writing assessment arrangements, there have been many questions hanging over the results, one of which has been the difference between the results of schools which had their judgements moderated, and those which did not.

When the question was first raised, I was doubtful that it would show much difference. Indeed, back in July when questioned about it, I said as much.

At the time, I was of the view that LAs each trained teachers in their own authorities about how to apply the interim frameworks, and so most teachers within an LA would be working to the same expectations. As a result, while variations between LAs were to be expected (and clearly emerged), the variation within each authority should be less.

At a national level, it seems that the difference is relatively small. Having submitted Freedom of Information Requests to 151 Local Authorities in England, I now have responses from all but one of them. Among those results, the differences are around 3-4 percentage points:

[Table: national results for moderated and unmoderated schools]

Now, these results are not negligible, but it is worth bearing in mind that Local Authorities deliberately select schools for moderation based on their knowledge of them, so it may be reasonable to presume that a larger number of lower-attaining schools might form part of the moderated group.

The detail that has surprised me is the variation between authorities in the consistency of their results. Some Local Authority areas have substantial differences between the moderated and unmoderated schools. As Helen Ward has reported in her TES article this week, the large majority of authorities have results which were lower in moderated schools. Indeed, in 11 authorities, the difference is 10 or more percentage points for pupils working at the Expected Standard. By contrast, in a small number, it seems that moderated schools have ended up with higher results than their unmoderated neighbours.
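For anyone who obtains the underlying data, the comparison itself is simple arithmetic: for each authority, subtract the moderated percentage from the unmoderated one. A minimal sketch in Python (the file name and column names are invented for illustration):

```python
import csv

# Hypothetical layout: one row per LA, with the percentage of pupils judged
# at the Expected Standard in moderated and unmoderated schools.
with open("la_writing_results.csv", newline="") as f:
    rows = list(csv.DictReader(f))  # columns: la, moderated_exs, unmoderated_exs

# Positive gap = unmoderated schools reported higher results
gaps = {row["la"]: float(row["unmoderated_exs"]) - float(row["moderated_exs"])
        for row in rows}

mean_gap = sum(gaps.values()) / len(gaps)
big_gaps = [la for la, gap in gaps.items() if gap >= 10]

print(f"Mean gap (unmoderated minus moderated): {mean_gap:.1f} percentage points")
print(f"Authorities with a gap of 10+ points: {len(big_gaps)}")
```

A raw gap like this takes no account of which schools each LA chose to moderate, which is exactly the selection effect described above.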

What can we learn from this? Probably not a great deal that we didn’t already know. It’s hard to blame the Local Authorities: they can’t be responsible for the judgements made in schools they haven’t visited, and nor is it their fault that we were all left with such an unclear and unhelpful assessment system. All this data highlights is the chaos we all suffered – and may well suffer again in 2017.

To see how your Local Authority results compare, view the full table* of data here. It shows the proportions of pupils across the LA who were judged as working at the Expected and Greater Depth Standards in both moderated and unmoderated schools.


*Liverpool local authority claimed a right not to release their data on the grounds of commercial sensitivity, which I am appealing. I fully expect this to be released in due course and for it to be added here.

Some thoughts on the Primary Assessment Consultation

Pub Quiz question for the future: In what year did the primary assessment framework last not change? (Answers on a postcard, folks)

I may not always be the most complimentary about the DfE, but today I feel like there is a lot to praise in the new consultation on primary assessment. They have clearly listened to the profession, including the work undertaken by the NAHT assessment review, and have made some sensible suggestions for the future of primary assessment. As ever, I urge people to read the consultation, and respond over the next 12 weeks. Here, I’ve just shared a few thoughts on some key bits:

Assessment in the Early Years

For years, it felt as though Early Years practice was held up as a shining example of assessment, as we were all wowed by the post-it notes, the online apps, and all the photographs. I was never overly keen on all the evidence-collating, and I’m pleased that we’ve begun to eschew it in the Key Stages. It’s pleasing, therefore, to see that while the department is happy to keep the (actually quite popular) Early Years Profile, it wants advice on how the burden of assessment can be reduced in the Early Years.

I’m also pleased to see the revival of the idea of a Reception baseline. Much damage was done by the chaotic trial of different systems in 2015, but the principle remains a sensible one to my mind. I would much rather see schools judged on progress across the whole of the primary phase. It’s also quite right that baseline data shouldn’t be routinely published at school or individual level. The consultation seems open to good advice on how best to manage its introduction (an approach which might have led to greater success with the first attempt!).

Key Stage 1

I wasn’t certain that we’d ever persuade the DfE to let go of a statutory assessment, but it seems that they’re open to the idea. I do think that the KS1 tests – and the teacher assessment that goes along with them – are a barrier to good progress through the primary years, and I’d welcome their abandonment. The availability of non-statutory tests seems a sensible approach, and I’m happy to see that the department will consider sampling as a way to gather useful information at a national level. Perhaps we might see that rolled out more widely in the long term.

I’d have rather seen them take the completely radical option of scrapping the statutory tests straight away, but I can see the rationale for keeping them until the baseline is in place. Unfortunately that means we’re stuck with the unreliable Teacher Assessment approach for the time being. (More of that to follow)

Key Stage 2

Of course it makes sense to scrap statutory Teacher Assessment of Reading and Maths. Nobody pays it any heed; it serves no purpose but adds to workload. I’d have preferred to see Science go the same way, but no such luck. At the very least, I hope there is a radical overhaul of the detail in the Science statements which are currently unmanageable (and hence clearly lead to junk data in the extreme!)

There is also some recognition in there that the current system of Teacher Assessment of Writing is failing. The shorter-term solution seems to be a re-writing of the interim frameworks to make them suit a best-fit model, which is, I suppose, an improvement. Longer term, the department is keen to investigate alternative (better) models; I imagine they’ll be looking closely at the trial of Comparative Judgement at www.sharingstandards.com this year. I’m less persuaded by the trial of peer-moderation, as I can’t quite see how you could ensure that a fair selection of examples is moderated. My experience of most inter-school moderation is that few discussions are had about real borderline cases, as few teachers want to take such risks when working with unfamiliar colleagues. Perhaps this trial will persuade me otherwise?

On the matter of the multiplication check, I don’t share the opposition to it that many others do. I’ve no objection to a sensible, low-stakes, no-accountability check being made available to support schools. I’d prefer to see it at the end of Year 4 – in line with the National Curriculum expectations – and I’d want to see more details of the trials, but overall, I can live with it.

Disappointments

Although it hardly gets mentioned, the opening statement that “it is right that the government sets a clear expected standard that pupils should attain by the end of primary school” suggests that the department is not willing to see the end of clunky descriptors like “Expected Standard”. That’s a shame, as the new scaled score system does that perfectly well without labelling in the same way. Hopefully future alternatives to the current Teacher Assessment frameworks might lessen the impact of such terminology.

Credit to whoever managed to get in the important fact that infant/junior and middle schools still exist. (Points deducted for failing to acknowledge first schools in the mix). However, the suggestions proposed are misguided. The consultation claims that,

the most logical measures for infant schools would be reception to key stage 1 and, for middle and junior schools, would be to continue with key stage 1 to key stage 2

While that may be true for infant, and potentially even junior schools, for middle schools this is a nonsense. Some middle schools only start from Year 6. How can it be sensible to judge their work on just 2 terms of a four-year key stage? The logical measure would require bespoke assessments on entry and exit. That would be expensive, so alternatives will be necessary. Personally I favour using just the Reception baseline and KS2 outcomes, along with sensible internal data for infant/first and junior/middle schools. The KS1 results have never been a helpful or reliable indicator.

Partly connected to that, I would also have liked to have seen a clearer commitment to the provision of a national assessment bank, as proposed by the Commission for Assessment without Levels, and supported by the NAHT review. It does get a brief mention in a footnote, so maybe there’s hope for it yet.

In Conclusion

Overall, I’m pleased with the broad shape of the consultation document. It does feel like a shift has happened within the department, and that there is a clear willingness to listen to the profession and correct earlier mistakes. There is as much positive news in the consultation as I might have hoped for.

If there were an interim assessment framework for judging DfE consultations, then this would have ticked nearly all of the boxes. Unfortunately, of course, nearly all is not enough, as any primary teacher knows, and so it must fall to WTS (Working Towards the Standard). Seems cruel, but he who lives by the sword…

Some clarity on KS2 Writing moderation … but not a lot

Not for the first time, the Department has decided to issue some clarification about the writing assessment framework at Key Stage 2 (and its moderation!). For some inexplicable reason, rather than sharing this clarity in writing, it has been produced as a slowly-worded video – as if it were us that were stupid!

Here’s my take on what it says:

Some Clarity – especially on punctuation

  • For Greater Depth, the long-winded bullet point about shifts in formality has to be seen in several pieces of work, with more than one shift within each of those pieces.
  • For Expected Standard, it is acceptable to have evidence of colons and semi-colons for introducing, and within, lists (i.e. not between clauses)
  • For Expected Standard, any one of brackets, dashes or commas is acceptable to show parenthesis; there is no need to show all three.
  • Bullet points are punctuation, but the DfE is pretending they’re not, so there’s no need to have evidence of them as part of the “full range” of punctuation needed for Greater Depth.
  • Three full stops to mark ellipsis are also punctuation, but again, the DfE has managed to redefine ellipsis in such a way that they’re not… so again, not needed for Greater Depth.

A bit of guidance on spelling

This was quite clear: if a teacher indicates that a spelling needs correcting by writing a comment in the margin on the relevant line, then the correction of that spelling cannot be counted as independent. If the comment to correct spellings comes at the end of a paragraph or whole piece, without specifying what to correct, then it can still count as independent.

No clarity whatsoever on ‘independence’

Believe me, I’ve re-watched this several times – and not all of them at double-speed – and I’m still bemused that they think this clarifies things. The whole debacle is still reliant on phrases like “over-scaffolding” and “over-detailed”. Of course, if things are over-detailed then there is too much detail. What isn’t any clearer is how much detail is too much detail. The video tells us that:

“success criteria would be considered over-detailed where the advice given directly shapes what pupils write by directing them to include specific words or phrases”

So we know specifying particular words is too much, but is it okay to use success criteria which include:

  • Use a varied range of sentence structures

Is it too specific to include this?

  • Use a varied range of sentence openers

What about…?

  • Use adverbs as sentence openers

There’s a wide gulf between the three examples above. Which of these is acceptable? Because if it’s the last, then schools relying on the first will find themselves under-valuing work – and vice versa, of course. That’s before you even begin to consider the impossibility of knowing what success criteria and other supporting examples were available in classrooms at the time of writing.

The video tries to help by adding:

“success criteria must not specifically direct pupils as to what to include or where to include something in their writing”

But all of those examples are telling children what to include – that’s the whole point of success criteria.

If I’ve understood correctly, I think all three of those examples are acceptable. But it shouldn’t matter what I think: if the whole system depends on what each of us thinks the guidance means, then the consistency necessary for fair and useful assessment is non-existent.

The whole issue remains a farce. Doubtless this year Writing results will rise, probably pushing them even higher above the results for the externally tested subjects. Doubtless results will vary widely across the country, with little or no relationship to success in the tested subjects. And doubtless moderation will be a haphazard affair with professionals doing their best to work within an incomprehensible framework.

And to think that people will lose their jobs over data that results from this nonsense!


The full video in all its 11-minute glory can be found at: https://www.youtube.com/watch?v=BQ-73l71hqQ

 

National Curriculum Test videos

I’ve updated the videos I made last year to explain the KS1 and KS2 tests to parents. As schools have the option of whether to use the Grammar, Punctuation & Spelling tests at KS1, there are now two versions of the video for KS1 (one with, one without the GPS tests).

Please feel free to use these videos on your school’s website or social media channels, or in parent meetings, etc. There are MP4 versions available to download.

Key Stage 2

Re-tweetable version: [embedded tweet]

Facebook shareable version:
https://www.facebook.com/primarycurriculum/videos/1311921482187352/

Downloadable MP4 file: https://goo.gl/b0Lo9v

Key Stage 1 – version that includes the GPS tests

Re-tweetable version: [embedded tweet]

Facebook shareable version:
https://www.facebook.com/primarycurriculum/videos/1311921482187352/

Downloadable MP4 file: https://goo.gl/jo18qk

Key Stage 1 – version for schools not using the GPS tests

Re-tweetable version: [embedded tweet]

Facebook shareable version:
https://www.facebook.com/primarycurriculum/videos/1311921482187352/

Downloadable MP4 file:  https://goo.gl/xMDFSJ

The impossibility of Teacher Assessment

I’ve said for a fair while now that I’d like to see the end of statutory Teacher Assessment. It’s becoming a less unpopular thing to say, but I still don’t think it’s quite reached the point of popularity yet. But let me try, once again, to persuade you.

The current focus of my ire is the KS2 Writing assessment, partly because it’s the one I am most directly involved in (doing as a teacher, not designing the monstrosity!), and partly because it is the one with the highest stakes. But the issues are the same at KS1.

Firstly, let me be frank about this year’s KS2 Writing results: they’re nonsense! Almost to a man we all agreed last year that the expectations were too high; that the threshold was something closer to a Level 5 than a 4b; that the requirements for excessive grammatical features would lead to a negative impact on the quality of writing. And then somehow we ended up with 74% of children at the expected standard, more than in any other subject. It’s poppycock.

Some of that will be a result of intensive drilling, which won’t have improved writing that much. Some of it will be a result of a poor understanding of the frameworks, or accidental misuse of them. Some of it will be because of cheating. The real worry is that we hardly know which is which. And guidance released this year which is meant to make things clearer barely helps.

I carried out a poll over the last week asking people to consider various sets of success criteria and to decide whether they would be permitted under the new rules which state that

[Image: extract from the rules on what counts as ‘independent’ work]

So we need to decide what constitutes “over-aiding” pupils. At either end of the scale, that seems quite simple. Just short of 90% of the 824 responses said that the following broad guidance would be fine:

[Image: Simplest criteria]

Similarly, at the other extreme, 92% felt that the following ‘slow-writing’ type model would not fit within the definition of ‘independent’:

[Image: Slow writing approach]

This is all very well, but in reality, few of us would use such criteria for assessed work. The grey area in the middle is where it becomes problematic. Take the following example:

[Image: The disputed middle ground]

In this case results are a long way from agreement. 45% of responses said that it would be acceptable, 55% not. If half of schools eschew this level of detail and it is actually permitted, then their outcomes are bound to suffer. By contrast, if nearly half use it but it ought not be allowed, then perhaps their results will be inflated. Of course, a quarter of those schools may be moderated, which could mean that even schools with over-generous interpretations of the rules end up suffering. There is no consistency here at all.

The STA will do their best to temper these issues, but I really think they are insurmountable. At last week’s Rising Stars conference on the tests, John McRoberts of the STA was quoted as explaining where the line should be drawn.

That advice does appear to clarify things (such that it seems the 45% were probably right in the example above), but it is far from solving the problem. For the guidance is full of such vague statements. It’s clear that I ought not to be telling children to use the word “anxiously”, but is it okay to tell them to open with an adverb while also having a display on the wall listing appropriate adverbs – including anxiously? After all, the guidance does say that:

[Image: extract from the guidance]

Would that count as independent? What if my classroom display contained useful phrases for opening sentences for the particular genre we were writing? Would that still be independent?

The same problems apply in many contexts. For spelling, children are meant to be able to spell words from the Y5/6 list. Is it still okay if they have the list permanently printed on their desks? If they’re trained to use the words in every piece?

What about peer-editing, which is also permitted? Is it okay if I send my brightest speller around the room to edit children’s work with them? Is that ‘independent’?

For an assessment to be a fair comparison of pupils across the country, the conditions under which work is produced must be as close to identical as possible, yet this is clearly impossible in this case.

Moderation isn’t a solution

The temptation is to say that Teacher Assessment can be robust if combined with moderation. But again, the flaws are too obvious. For a start, the cost of moderating all schools is likely to be prohibitive. But even if it were possible, it’s clear that a moderator cannot tell everything about how a piece of work was produced. Of course moderators will be able to see if all pupils use the same structure or sentence openers. But they won’t know what was on my classroom displays while the children were writing the work. They won’t know how much time was spent on peer-editing work before it made the final book version. They won’t be able to see whether or not teachers have pointed out the need for corrections, or whether each child had been given their own key phrases to learn by heart. Moderation is only any good at comparing judgements of the work in front of you, not of the conditions in which it was produced.

That’s not to imply that cheating is widespread. Far from it: I’ve already demonstrated that a good proportion of people will be wrong in their interpretations of the guidance in good faith. It is almost impossible for the system to be any other way.

The stakes are too high now. Too much rests on those few precious numbers. And while in an ideal world that wouldn’t be the case, we cannot expect teachers to provide accurate, meaningful and fair comparisons, while also judging them and their schools on the numbers they produce in the process.

Surely it’s madness to think otherwise?


For the results of all eight samples of success criteria, see this document.

 

A consistent inconsistency

With thanks to my headteacher for inadvertently providing the blog title.

With Justine Greening’s announcement yesterday we discovered that the DfE has definitely understood that all is not rosy in the primary assessment garden. And yet, we find ourselves looking at two more years of the broken system before anything changes. My Twitter timeline today has been filled with people outraged at the fact that the “big announcement” turned out to be “no change”.

I understand the rage entirely. And I certainly don’t think I’ve been shy about criticising the department’s chaotic organisation of the test and errors made. But I’m also not ready to throw my toys out of the pram just yet. This might just be the first evidence that the department is really listening. Yes, perhaps too little too late. Yes, it would have been nice for it to have been accompanied by an acknowledgement that the problems were caused by the pace of change enforced by ministers. But maybe they’re learning that lesson?

For a start, there are many teachers nationally who are just glad of the consistency. As my headteacher said earlier today, it leaves us with a consistent inconsistency. But nevertheless, there will be many teachers who are relieved to see that the system is going to be familiar for the next couple of years.

It’s a desire I can understand, but just can’t go along with. There are too many problems with the current system – mostly those surrounding the Teacher Assessment frameworks and moderation. But I will hang fire, because there is the prospect of change on the horizon.

It’s tempting to see it as meaningless consultation, but until we see the detail I don’t want to rule anything out. I hope that the department is listening to advice, and is open to recommendations – including those which the NAHT Assessment Reform Group, of which I am a member, is drawing together over this term.

If the DfE listens to the profession, and in the spring consults on a meaningful reform that brings about sensible assessment and accountability processes, then we may eventually come to see yesterday’s announcement as the least bad of the available options.

Of course, if they mess it up again, I’ll be on their case.

The potential of Comparative Judgement in primary

I have made no secret of my loathing of the Interim Assessment Frameworks, and the chaos surrounding primary assessment of late. I’ve also been quite open about a far less popular viewpoint: that we should give up on statutory Teacher Assessment. The chaos of the 2016 moderation process and outcomes was an extreme case, but it’s quite clear that the system cannot work.

It’s crazy that schools can be responsible for deciding the scores on which they will be judged. It has horrible effects on the reliability of that data, and also creates pressure which has an impact on the integrity of teachers’ and leaders’ decisions. What’s more, as much as we would like for our judgements to be considered as accurate, the evidence points to a sad truth: humans (including teachers) are fallible. As a result, Teacher Assessment judgements are biased – before we even take into account the pressures of needing the right results for the school. Tests tend to be more objective.

However, it’s also fair to say that tests have their limitations. I happen to think that the model of Reading and Maths tests is not unreasonable. True, there were problems with this year’s, but the basic principle seems sound to me, so long as we remember that the statutory tests are about the accountability cycle, not about formative information. But even here there is a gap: the old Writing test was scrapped because of its failings.

That’s where Comparative Judgement has a potential role to play. But there is some work to be done in the profession for it to find its right place. Firstly we have to be clear about a couple of things:

  1. Statutory Assessment at the end of Key Stages is – and indeed should be – separate from the rest of assessment that happens in the classroom.
  2. What we do to judge work, and how we report that to pupils and parents are – and should be – separate things.

Comparative Judgement is based on the broad idea of comparing lots of pieces of work until you have essentially sorted them into a rank order. That doesn’t mean that individuals’ ranks need be reported, any more than we routinely report raw scores to pupils and parents. It does, though, offer the potential of moving away from the hideous tick-box approach of the Interim Frameworks.

Teachers are understandably concerned by the idea of ranking, but it’s really not that different from how we previously judged writing. Most experienced Y2/Y6 teachers didn’t spend hours poring over the level descriptors, but rather used their knowledge of what they considered L2/L4 to look like, and judged whether they were looking at work that was better or worse. Comparative Judgement simply formalises this process.
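For anyone curious about the machinery, the rank order typically comes from fitting a statistical model to many pairwise decisions. Below is a minimal sketch of one common choice, a Bradley–Terry model fitted with a simple iterative algorithm; the judgement data is invented, and real platforms use far more sophisticated fitting and reliability checks:

```python
from collections import defaultdict

def bradley_terry(judgements, n_iters=100):
    """Estimate a quality measure for each script from pairwise judgements.

    judgements: list of (winner, loser) pairs, one per judging decision.
    Returns {script_id: strength}, where higher means better writing.
    """
    wins = defaultdict(int)      # comparisons won by each script
    meetings = defaultdict(int)  # comparisons between each unordered pair
    scripts = set()
    for winner, loser in judgements:
        wins[winner] += 1
        meetings[frozenset((winner, loser))] += 1
        scripts.update((winner, loser))

    strength = {s: 1.0 for s in scripts}
    for _ in range(n_iters):
        new = {}
        for s in scripts:
            # Standard MM update: wins divided by an expected-comparisons term
            denom = sum(meetings[frozenset((s, t))] / (strength[s] + strength[t])
                        for t in scripts if t != s)
            new[s] = wins[s] / denom if denom else strength[s]
        scale = len(scripts) / sum(new.values())  # keep the mean strength at 1
        strength = {s: v * scale for s, v in new.items()}
    return strength

# Invented example: judges compared pieces A, B and C five times in total
judgements = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B"), ("C", "B")]
for script, s in sorted(bradley_terry(judgements).items(), key=lambda kv: -kv[1]):
    print(script, round(s, 2))
```

Sorting by the fitted strengths gives the rank order; in this toy run, piece A, which won all three of its comparisons, comes out on top.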

It directly tackles an issue that is particularly prevalent with the current interim arrangements: excellent writing which scores poorly because of a lack of dashes or hyphens (and poor writing which scores highly because it’s littered with them!). If we really want good writing to be judged “in the round”, then we cannot rely on simplistic and narrow criteria. Rather, we have to look at work more holistically – and Comparative Judgement can achieve that.

Rather than teachers spending hours poring over tick-lists and building portfolios of evidence, we would simply submit a number of pieces of work towards the end of Year 6 and they would be compared to others nationally. If the DfE really wants to, once they had been ranked in order, they could apply scaled scores to the general pattern, so that pupils received a scaled score just like the tests for their writing. The difference would be that instead of collecting a few marks for punctuation, and a few for modal verbs, the whole score would be based on the overall effect of the piece of writing. Equally, the rankings could be turned into “bands” that matched pupils who were “Working Towards” or “Working at Greater Depth”. Frankly, we could choose quite what was reported to pupils and parents; the key point is that we would be more fairly comparing pupils based on how good they were at writing, rather than how good they were at ticking off features from a list.
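To illustrate that last step, here is a sketch of how fitted measures might be spread across a scaled-score range and cut into bands. The 80-120 range and the cut-offs are entirely hypothetical, chosen only because they echo the test scales; real scaled scores come from a standard-setting process, not a linear mapping:

```python
def to_scaled_scores(strengths, lo=80, hi=120):
    """Spread Comparative Judgement measures across a scaled-score range by rank.

    Illustrative only: ranks scripts by fitted measure, then places them
    linearly between lo and hi.
    """
    ordered = sorted(strengths, key=strengths.get)  # weakest first
    n = len(ordered)
    return {script: round(lo + (hi - lo) * (rank / (n - 1) if n > 1 else 0.5))
            for rank, script in enumerate(ordered)}

def to_band(score):
    # Hypothetical cut-offs, purely to show the reporting idea
    if score >= 110:
        return "Working at Greater Depth"
    if score >= 100:
        return "Working at the Expected Standard"
    return "Working Towards the Expected Standard"

# Using made-up strengths of the kind the earlier sketch produces
scaled = to_scaled_scores({"A": 2.1, "B": 0.6, "C": 0.9})
print({script: (score, to_band(score)) for script, score in scaled.items()})
```

The point of the sketch is that ranking, scoring and reporting are separable decisions: the same underlying rank order could feed a scaled score, a band label, or nothing at all.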

There are still issues to be resolved, such as exactly what pieces of writing schools would submit for judgement, and the tricky issue of quite how independent the work should be. Equally, the system doesn’t lend itself as easily to teachers being able to use the information formatively – but then, aren’t we always saying that we don’t want teachers to teach to the tests?

Certainly if we want children’s writing to be judged based on its broad effectiveness, and for our schools to be compared fairly for how well we have developed good writers, then it strikes me that it’s a lot better than what we have at the moment.


Dr Chris Wheadon and his team are carrying out a pilot project to look at how effective moderation could be in Year 6. Schools can find out more, and sign up to join the pilot (at a cost) at: https://www.sharingstandards.com/

 

Getting started with FFT data for KS2

School leaders are used to dealing with change, not least when it comes to assessment data, but this year is in a league of its own. With changes to all the tests, teacher assessment, scaled scores and accountability measures, headteachers would be forgiven for despairing of any attempt to make sense of it.

Even when Raise becomes available, there’s no saying how easy it will be to interpret, not least because of all the changes this year. However, the FFT Summary Dashboard is available from today (Wednesday 14th), allowing you to make headway into that first stage of data analysis to evaluate your school’s strengths, and pick out areas for further development. In today’s climate, any help with that will be welcome!

Your first glance at the dashboard will give you a very quick visual representation of your key headline figures – attainment and progress – related to those that will feature in performance tables and be published on your school website. In FFT these are represented in the form of comparison gauges:

[Image: Comparison gauges that show key figures at a glance]

The beauty of this is the clarity they provide compared to the complexity of the published data and its confidence intervals. In short: the middle white zone shows that you’re broadly in line with national outcomes; the red and green bands at either end suggest significantly lower or higher results. This will be particularly helpful for governors who are either shocked by changes in numbers from the old system, or who are concerned about small negative values on the progress measures.
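Under the hood, “significant” in this context usually means that a confidence interval around the school’s progress score does not include zero. A rough sketch of that logic (the 1.96 multiplier gives an approximate 95% interval; FFT’s precise method may well differ):

```python
import math

def gauge_zone(progress_score, pupil_sd, n_pupils):
    """Classify a school's progress score the way a comparison gauge might.

    Approximate 95% confidence interval: score +/- 1.96 * SD / sqrt(n).
    If the interval contains zero, the school is 'broadly in line'.
    """
    half_width = 1.96 * pupil_sd / math.sqrt(n_pupils)
    if progress_score - half_width > 0:
        return "green (significantly above national)"
    if progress_score + half_width < 0:
        return "red (significantly below national)"
    return "white (broadly in line with national)"

# A small negative score in a small cohort is usually not significant:
print(gauge_zone(-1.5, pupil_sd=5.0, n_pupils=20))   # white zone
print(gauge_zone(-1.5, pupil_sd=5.0, n_pupils=200))  # red zone
```

This is why the gauge view is kinder to governors than a bare negative number: a score of -1.5 for a cohort of twenty is simply not distinguishable from zero.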

 

The dashboard offers more clarity, too, about specific groups within your school. With a changing landscape it can be hard to know what to expect, but the pupil group analysis will quickly tell you which specific groups – girls, middle attainers, free school meals – have performed particularly well, and which seem not to be keeping up. It’s a simple overview that makes a good starting point for further investigation.

[Image: Quick identification of groups that have done particularly well, or poorly (green plus symbols show significant values)]

It’s worth remembering, though, that some groups may be very small in your school: if you’ve only got a handful of girls, then don’t get too worked up over variations!

The dashboard also helps to pick out trends over time – another challenge when all the goalposts seem to have moved. By comparing the national results to previous years, FFT have been able to plot a trajectory that compares how attainment and progress might have looked in 2014 and 2015 under the current system. As a result, you can begin to see whether your school has improved by comparison to the national picture.

[Image: time series of attainment and progress]

The time series shows your previous results adjusted to bring them more closely into line with the new frameworks. Not perfect, but a very telling ‘starter for ten’!

A caveat here: this is much more difficult with the writing judgements which are much less precise than the scaled scores. Take that alongside the evident variation in writing outcomes this year, and you may want to look deeper into those figures before making any quick judgements.

[Image: Groups analysed]

Further into the summary dashboard itself, we get into the detail of vulnerable groups and of the separate subjects. Again, you get an overview that helps to pinpoint areas to look into further. Specific groups remain a clear focus for Ofsted and other inspections, so this information will be vital to leaders. The further breakdown of subjects will be of interest too, and of particular use in schools where writing has been affected by the national inconsistencies. Again these sections allow you to compare your attainment and progress to the national picture, and also to reflect on how your results may have changed over time.

No doubt, by the time school leaders and governors have begun to look at their summary overview, there will be many more questions asked. That’s where the FFT Aspire platform can help. Using your summary as a starting point, you can explore each element in greater detail, filtering your results for different groups, or subjects – even down to the level of individual pupils. It will help you to unpick the measures that are likely to feature on your Raise Online profile when it arrives, and others too, including comparisons that use contextual information about your pupils to match them against similar groups elsewhere. Alongside the target-setting and other elements of FFT, you have a wealth of information at your fingertips that can be used to focus your school improvement planning – the summary dashboard is just the start.

 


This post was written with the support of FFT in preparation for the launch of the new dashboards on 14th September 2016.

Some thoughts on KS2 Progress

Caveats first: these conclusions, such as they are, are drawn from a small sample of a little over 50 schools. That sample of schools isn’t representative: indeed, it has slightly higher attainment than the national picture, both in terms of KS2 outcomes, and in KS1 starting points. However, with over 2000 pupils’ data, it shows some interesting initial patterns – particularly when comparing the three subject areas.

Firstly, on Maths – the least controversial of the three subjects. It seems that – in this sample – pupils who achieved Level 2c at KS1 had an approximately 40% chance of reaching the new expected standard (i.e. a scaled score of 100+). That leaps to around 66% for those achieving L2b at KS1 (i.e. just short of the national average).

[Chart: Maths – proportion reaching the expected standard, by KS1 level]

The orange bar shows the average of this sample, which is slightly higher than the national average of 70%

It’s important to note, though, that progress measures will not be based on subject levels, but on the combined APS score at Key Stage 1. The graph for these comparisons follows a similar pattern, as you’d expect:

[Chart: Maths – proportion reaching the expected standard, by KS1 APS]

Where fewer than 10 pupils’ data was available for any given APS score, these have been omitted.

There is an interesting step here: pupils in this sample with an APS of 13 or less have a 40% or lower chance of reaching the expected standard, while those scoring 13.5 or more have a greater than 50% chance of achieving it. (The dip at 12.5 APS points relates to pupils who scored Level 2s in Maths and one English subject, but a Level 1 in the other, highlighting the importance of good literacy for achievement in KS2 Maths.)
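Those dips are easier to see once the APS arithmetic is pinned down. Consistent with every worked example in this post (the 12.5, 18.5 and 19.5 groups all fall out of it), the combined KS1 score averages reading and writing into a single English figure and then averages that with maths, so maths carries half the weight. A sketch using the standard KS1 point scores; if the official methodology weights things differently, the code is easily adjusted:

```python
# Standard KS1 point scores for each teacher assessment outcome
KS1_POINTS = {"W": 3, "1": 9, "2C": 13, "2B": 15, "2A": 17, "3": 21}

def ks1_aps(reading, writing, maths):
    """Combined KS1 APS: reading and writing average into one English
    score, which is then averaged with maths (weights 25/25/50)."""
    english = (KS1_POINTS[reading] + KS1_POINTS[writing]) / 2
    return (english + KS1_POINTS[maths]) / 2

# The dip at 12.5: Level 2s in maths and one English subject, Level 1 in the other
print(ks1_aps("2B", "1", "2C"))  # 12.5
# The dip at 18.5: 2a/2b in the two English subjects, Level 3 in maths
print(ks1_aps("2A", "2B", "3"))  # 18.5
# The dip at 19.5: Level 3s in reading and maths, 2b in writing
print(ks1_aps("3", "2B", "3"))   # 19.5
```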

For Reading, the graphs look broadly similar in shape:

[Chart: Reading – proportion reaching the expected standard, by KS1 level]

Blue bar shows average of this sample at 67%, which is slightly higher than national average of 66%

Interestingly, here the Level 2c scorers still have only a 40% chance of meeting the expected standard, but those achieving 2b have a lower chance than in Maths of reaching it (58% compared to 66% for Maths).

When looking at the APS starting points, there is something of a plateau at the right-hand end of the graph. The numbers of pupils involved here are relatively few (as few as 31 pupils in some columns). Interestingly, the dip at 18.5 APS points represents the smallest sample group shown, made up of pupils who scored 2a/2b in the two English subjects, but a Level 3 in Maths at KS1. This will be of comfort to teachers who have been concerned about the negative effect of such patterns on progress measures: it seems likely that we will still be comparing like with like in this respect.

[Chart: Reading – proportion reaching the expected standard, by KS1 APS]

It is in Writing that the differences become more notable – perhaps an artefact of the unusual use of Teacher Assessment to measure progress. Compared to just 40% of L2c pupils reaching the new expected standard in Reading or Maths, some 50% managed the conversion in Writing – and this against a backdrop of teachers concerned that the expected standard in Writing was too high. Similarly, over 3/4 of those achieving Level 2b managed to reach the standard (cf 58% Reading, 66% Maths)

[Chart: Writing – proportion reaching the expected standard, by KS1 level]

In contrast to the other subjects, attainment in this sample appears notably lower in Writing than the national average (at 70% compared to 74% nationally)

With the APS comparisons, there are again slight dips at certain APS points, including 18.5 and 19.5 points. In the latter case, this reflects the group of pupils who achieved Level 3s in both Reading and Maths, but only a 2b in Writing at KS1, suggesting again that the progress measure does a good job of separating out different abilities, even using combined APS scores.

[Chart: Writing – proportion reaching the expected standard, by KS1 APS]

Of course, this is all of interest (if you’re interested in such things), but the real progress measures will be based on the average score of each pupil with each KS1 APS score. I’d really like to collect some more data to try to get a more reliable estimate of those figures, so if you would be willing to contribute your school’s KS1 and KS2 data, please see my previous blog here.


Spread of data

Following a request in the comments, below, I’ve also attached a table showing the proportions of pupils achieving each scaled score for the two tests. This is now based on around 2800-2900 pupils, and again it’s important to note that this is not a representative sample.

[Table: proportions of pupils achieving each scaled score in the Reading and Maths tests]

A few words on the 65% floor standard

There’s been much discussion about this in the last few days, so I thought I’d summarise a few thoughts.

Firstly, many people seem to think that the government will be forced to review the use of a 65% floor standard in light of the fact that only 53% of pupils nationally met the combined requirements. In fact, I’d argue the opposite: the fact that so few schools exceed the attainment element of the floor standard is no bad thing. Indeed, I’d prefer it if no such attainment element existed.

There will be schools for whom reaching 65% combined Reading, Writing & Maths attainment did not require an inordinate amount of work – and won’t necessarily represent great progress. Why should those schools escape further scrutiny just because they had well-prepared intakes? Of course, there will be others who have met the standard through outstanding teaching and learning… but they will have great progress measures too. The 65% threshold is inherently unfair on those schools working with the most challenging intakes and has no good purpose.

That’s why I welcomed the new progress measures. Yes it’s technical, and yes it’s annoying that we won’t have it for another couple of months, but it is a fairer representation of how well a school has achieved in educating its pupils – regardless of their prior attainment.

That said, there will be schools fretting about their low combined Reading, Writing & Maths scores. I carried out a survey immediately after results were released, and so far 548 schools have responded, sharing their combined RWM scores. From that (entirely unscientific self-selecting) group, just 28% of schools had reached the 65% attainment threshold. And the spread of results is quite broad – including schools at both 0% and 100%.

The graph below shows the spread of results with each colour showing a band of 1/5th of schools in the survey. Half of schools fell between 44% and 66%.

[Chart: spread of combined Reading, Writing & Maths attainment across surveyed schools]
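For anyone wanting to reproduce that banding from their own copy of the responses, quintiles are a one-liner in Python’s standard library. The list below is invented; it stands in for the real 548 survey percentages:

```python
import statistics

# Stand-in for the real survey data: one combined RWM percentage per school
survey_results = [0, 31, 44, 52, 58, 61, 66, 73, 85, 100]

# Cut points dividing schools into five equal-sized bands (one per colour)
print(statistics.quantiles(survey_results, n=5))

# Share of schools at or above the 65% floor-standard threshold
at_floor = sum(1 for r in survey_results if r >= 65) / len(survey_results)
print(f"{at_floor:.0%} at or above 65%")
```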

As I said on the day the results were published – for a huge number of schools, the progress measure will become all important this year. And for that, we just have to wait.

Edit:

Since posting, a few people have quite rightly raised the issue of junior/middle schools, who have far less control over the KS1 judgements (and indeed in middle schools, don’t even have control over the whole Key Stage). There are significant issues here about the comparability of KS1 data between infant/first schools and through primary schools (although not necessarily with the obvious conclusions). I do think that it’s a real problem that needs addressing: but I don’t think that the attainment floor standard does anything to address it, so it’s a separate – albeit important – issue.