
Primary Assessment changes… again!

First of all, let me say that I’m pleased that primary assessment is changing again, because it’s been a disaster in so many ways. So here is a summary of the changes at each key stage – with my thoughts about each.

Early Years Foundation Stage Profile

  • The EYFS Profile will stay, but will be updated to bring it into line with the new national curriculum and take account of current knowledge & research. I’ve never been a huge fan of the profile, but I know most EY practitioners have been, so that seems a sensible move.
  • A proposed change would reduce the number of reported Early Learning Goals to focus on the prime areas and Literacy/Maths.
  • The ’emerging’ band may be divided to offer greater clarity of information particularly for lower-attaining pupils.
  • An advisory panel will be set up to advise on changes to the profile and ELGs. Membership of that panel could be contentious.

Reception Baseline

  • New Reception baseline to be introduced from 2020 (with proper trialling beforehand this time, one presumes!) to take place in the first 6 weeks of school.
  • Won’t be a ‘test’, but also won’t be observational over time. Suspect something more like the current CEM model, perhaps?
  • Will focus on literacy & numeracy, and potentially a ‘self-regulation’ element, as good predictors for attainment in KS2
  • Data won’t be used for any judgements about Reception, but will be used at cohort level to judge progress by the end of KS2.
  • The intention is for the assessment to provide some narrative formative information about children’s next steps.

Key Stage 1

  • The KS1 Grammar, Punctuation & Spelling test will remain optional.
  • Statutory Assessment will remain until at least 2023 (to allow for a year of overlap with the first cohort to be assessed using Reception baseline).
  • A new framework for Teacher Assessment of Writing has been published for this year only. Exemplification will follow this term.
  • DfE will continue to make assessments available (perhaps through an assessment bank if that ever gets off the ground!) after 2023, to help schools to benchmark attainment.
  • After 2023, tests and statutory teacher assessment will become optional for all-through primary schools.
  • There is more work to be done to find a system which works well for infant/junior and first/middle schools. This will be done with those in the sectors.

Key Stage 2

  • A multiplication check will be introduced at the end of Year 4. (Although, of course, whether ‘the end’ means July or May remains to be seen.)
  • School-level data on the multiplication check won’t be published.
  • This will be the last year that teachers have to make Teacher Assessment judgements for Reading and Maths.
  • A new framework for Teacher Assessment of Writing has been published for this year only. Exemplification will follow this term.
  • DfE will continue to evaluate other options for the future, but isn’t really committing to anything yet.
  • Small trials of peer-to-peer moderation will take place this summer.
  • Science Teacher Assessment frameworks will be updated next year.
  • The Reading test will not be timetabled for Monday of SATs week any more (hurrah!)
  • The DfE aims to link the reading content of the tests more closely to the curriculum to ensure children are drawing on their knowledge.

My thoughts

Overall, I’m pleased. Most of these changes are to be welcomed. The Reception baseline is a sensible idea (just a shame it was so badly implemented the first time round), as is scrapping KS1 assessments. The Early Years changes seem reasonable given the popularity of the current setup. The improvements to the KS2 Reading test are positive, as is the removal of pointless Teacher Assessment judgements.

On Writing, I fear we haven’t gone far enough. The current system is a joke, and it seems like the interim solution we’ll have to replace the old interim solution will just aim to make it less awful without really fixing the problem. It’s a shame that there is no obvious answer on the horizon. Perhaps the department has had its fingers burnt by rushing into quick fixes in the past and is prepared to bide its time.

In the interim, the updated expectations for Writing seem more manageable both in terms of achieving and assessing them. Of course, the devil is in the detail. If we get another exemplification book that breaks down single statements into several tick-boxes then we may be back at square one. Equally, of course, we can expect proportions of pupils meeting the expected standard to rise again substantially this year. Surely we have to be honest now and say that we really cannot use this data for accountability purposes? Mind you, perhaps it won’t matter – if we’re all getting 90% in Writing, it’ll only be the tested subjects that will make a difference to the accountability!

There are some other changes I would like to have seen. I really don’t think the “expected standard” label is helpful, particularly in subjects where scaled scores are used; it’s a shame we’ve not seen the back of that.

We’re not out of the woods yet. But we’re heading in the right direction, and credit is due to those at the department for listening. Let’s just hope they keep listening until we all get it right.


Will we see a leap in Writing attainment?

I’ve long been clear that I think that the current system of assessing writing at KS2 (and at KS1 for that matter) is so flawed as to be completely useless. The guidance on independence is so vague and open to interpretation and abuse, the framework so strictly applied (at least in theory), and moderation so ineffective at identifying any poor practice, that frankly you could make up your results by playing lottery numbers and nobody would be any the wiser.

One clear sign of its flaws came last year: despite having been for years the lowest-scoring area of attainment, and despite the new, very stringent criteria which almost all teachers seem to dislike, Writing somehow ended up with more children achieving the expected standard than any other subject area.

My fear now is that we will see that odd situation continue, as teachers get wise to the flaws in the framework and exploit them. I’m not arguing that teachers are cheating (although I’m sure some are), but rather that the system is so hopelessly constructed that the best a teacher can do for their pupils is to teach to the framework and ensure that every opportunity is provided for children to show the few skills required to reach the standard. There is no merit now in focusing on high quality writing; only in meeting the criteria. Results will rise, with no corresponding increase in the quality of writing needed.

For that reason, I suspect that we will see a substantial increase in the number of schools having more pupils reaching the expected standard. At Greater Depth level the picture will likely be more varied, as different LAs give contradictory messages about how easy it should be to achieve, and different moderators appear to apply different expectations.

In an effort to get a sense of the direction of travel, I asked teachers – via social media – to share their writing data for last year, and their intended judgements for this year. Now, perhaps unsurprisingly, more teachers from schools with lower attainment last year have shared their data, so along with all the usual caveats about what a small sample this is, it’s worth noting that it’s certainly not representative. But it might be indicative.

Over 250 responses were given, of which just over 10 had to be ignored (because it seems that some teachers can’t grasp percentages, or can’t read questions!). Of the 240 responses used, the average figure for 2016 was 71% achieving EXS and 11% achieving GDS. Both of these figures are lower than last year’s national figures (74% / 15%) – which themselves seemed quite high, considering that just 5 years before, a similar percentage had managed to reach the old (apparently easier) Level 4 standard. Consequently, we might reasonably expect a greater increase in these schools’ results in 2017 – as the lower-attaining schools strive to get closer to last year’s averages.

Nevertheless, it does appear that the rise could be quite substantial. Across the group as a whole, the percentage of pupils achieving the expected standard rose by 4 percentage points (to just above last year’s national average), with the percentage achieving greater depth rising by a very similar amount (again, to just above last year’s national average).

We might expect this tendency towards the mean, and certainly that seems evident. Among those schools who fell short of the 74% last year, the median increase in percentage achieving expected was 8 percentage points; by contrast, for those who exceeded the 74% figure last year, the median change was a fall of 1 percentage point.
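For anyone wanting to run the same comparison on their own figures, here’s a minimal sketch in Python of the calculation described above. The response data in it is entirely invented for illustration – the real survey responses aren’t reproduced here.

```python
# A sketch of the median-change calculation described above.
# Each tuple is (2016 % achieving EXS, intended 2017 % achieving EXS)
# for one school. These numbers are made up for illustration.
from statistics import median

responses = [(58, 70), (65, 71), (80, 78), (90, 88), (70, 79), (76, 77)]

NATIONAL_2016 = 74  # last year's national EXS figure for Writing

below = [y17 - y16 for y16, y17 in responses if y16 < NATIONAL_2016]
above = [y17 - y16 for y16, y17 in responses if y16 >= NATIONAL_2016]

print(f"Median change for schools below {NATIONAL_2016}%: {median(below):+} points")
print(f"Median change for schools at/above {NATIONAL_2016}%: {median(above):+} points")
```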

Now again, let me emphasise the caveats. This isn’t a representative sample at all – just a self-selecting group. And maybe if you’re in a school which did poorly last year and has pulled out all the stops this year, you’d be more likely to have responded, so it’s perfectly possible that this overestimates the national increase.

But equally, it’s possible that we’ll see an increase in teacher assessment scores which outstrips the increases in tested subjects – even though it’s already starting from a higher (some might say inflated) base.

I’m making a stab in the dark and predicting that we might see the proportion of children – nationally – reaching the Expected Standard in Writing reach 79% this year. Which is surely bonkers?

Stop moaning about tests!

Today marked the end of 4 short days of testing. For Year 6 pupils everywhere, they’ll have spent less than 5 hours on tests – probably not for the first time this year – and later in the year we’ll find out how they did.

Now, I’m the first to complain when assessment isn’t working, and there are lots of problems with KS2 assessment. Statutory Teacher Assessment is a joke; the stakes for schools – and especially headteachers – are ridiculously high; the grammar test is unnecessary for accountability and unnecessarily prescriptive. I certainly couldn’t be seen as an apologist for the DfE. And yet…

For some reason it appears that many primary teachers (particularly in Facebook groups, it seems) are cross that some of the tests contained hard questions. I’ve genuinely seen someone complain that their low-ability children can’t reach the expected standard. Surely that’s the very reason they’re defining them as low ability?

Plenty of people seem annoyed that some of the questions on the maths test were very challenging. Except, of course, we know that some children will score 100% each year, so the level of challenge seems fair. There were also plenty of easier, more accessible questions that allowed those less confident mathematicians to show what they can do. It’s worth remembering that to reach the expected standard last year, just 55% of marks were needed.

But the thing that annoys me most is the number of people seemingly complaining that the contexts for problem-solving questions make the questions too difficult. Of course they do, that’s the point: real maths doesn’t come in lists of questions on a page that follow a straightforward pattern. What makes it all the more irritating is that many of those bemoaning the contexts of problems are exactly the same sort who moan about a tables test, complaining that knowing facts isn’t worthwhile unless you can apply them.

Well guess what: kids need both. Arithmetic knowledge and skills need to be secure to allow children to focus their energies on tackling those more complex mathematical problems. You can’t campaign against the former, and then complain about the latter.

The tests need to – as much as possible – allow children across the ability range to demonstrate their skill, while differentiating between those who are more and less confident. That’s where last year’s reading test fell down: too few accessible elements and too many which almost no children could access. This year’s tests were fair and did a broadly good job of catering for that spread. For those complaining about the level of literacy required, it’s worth remembering that questions can be read to children, and indeed many will have had a 1:1 reader throughout.

No test will be perfect, and there are plenty of reasons to be aggrieved about the chaos that is primary assessment at the moment, but blaming tests because not all children can answer all questions is a nonsense, and we’d do well to pick our battles more carefully!

KS2 Writing: Moderated & Unmoderated Results

After the chaos of last year’s writing assessment arrangements, there have been many questions hanging over the results, one of which has been the difference between the results of schools which had their judgements moderated, and those which did not.

When the question was first raised, I was doubtful that it would show much difference. Indeed, back in July when questioned about it, I said as much.

At the time, I was of the view that LAs each trained teachers in their own authorities about how to apply the interim frameworks, and so most teachers within an LA would be working to the same expectations. As a result, while variations between LAs were to be expected (and clearly emerged), the variation within each authority should be less.

At a national level, it seems that the difference is relatively small. Having submitted Freedom of Information Requests to 151 Local Authorities in England, I now have responses from all but one of them. Among those results, the differences are around 3-4 percentage points:

[Table: national 2016 KS2 Writing results for moderated and unmoderated schools]

Now, these results are not negligible, but it is worth bearing in mind that Local Authorities deliberately select schools for moderation based on their knowledge of them, so it may be reasonable to presume that a larger number of lower-attaining schools might form part of the moderated group.

The detail that has surprised me is the variation between authorities in the consistency of their results. Some Local Authority areas have substantial differences between the moderated and unmoderated schools. As Helen Ward has reported in her TES article this week, the large majority of authorities have results which were lower in moderated schools. Indeed, in 11 authorities, the difference is 10 or more percentage points for pupils working at the Expected Standard. By contrast, in a small number, it seems that moderated schools have ended up with higher results than their unmoderated neighbours.
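For the curious, the comparison behind those figures is straightforward; here’s a minimal sketch in Python. The LA names and percentages are invented for illustration – the real FOI returns are in the table linked below.

```python
# For each LA, compare EXS results in moderated vs unmoderated schools.
# All data here is invented; see the linked table for the real figures.
la_results = {
    # LA name: (moderated EXS %, unmoderated EXS %)
    "Anyshire": (68.0, 79.5),
    "Othertown": (75.0, 76.5),
    "Somewhere": (80.0, 78.0),
}

gaps = {la: unmod - mod for la, (mod, unmod) in la_results.items()}

lower_in_moderated = [la for la, gap in gaps.items() if gap > 0]
ten_point_gaps = [la for la, gap in gaps.items() if gap >= 10]
print(f"Lower in moderated schools: {lower_in_moderated}")
print(f"Gaps of 10+ percentage points: {ten_point_gaps}")
```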

What can we learn from this? Probably not a great deal that we didn’t already know. It’s hard to blame the Local Authorities: they can’t be responsible for the judgements made in schools they haven’t visited, and nor is it their fault that we were all left with such an unclear and unhelpful assessment system. All this data highlights is the chaos we all suffered – and may well suffer again in 2017.

To see how your Local Authority results compare, view the full table* of data here. It shows the proportions of pupils across the LA who were judged as working at the Expected and Greater Depth Standards in both moderated and unmoderated schools.


*Liverpool local authority claimed a right not to release their data on the grounds of commercial sensitivity, which I am appealing. I fully expect this to be released in due course and for it to be added here.

Some thoughts on the Primary Assessment Consultation

Pub Quiz question for the future: In what year did the primary assessment framework last not change? (Answers on a postcard, folks)

I may not always be the most complimentary about the DfE, but today I feel like there is a lot to praise in the new consultation on primary assessment. They have clearly listened to the profession, including the work undertaken by the NAHT assessment review, and have made some sensible suggestions for the future of primary assessment. As ever, I urge people to read the consultation, and respond over the next 12 weeks. Here, I’ve just shared a few thoughts on some key bits:

Assessment in the Early Years

For years, it felt like Early Years practice was held up as a shining example of assessment, as we were all wowed by the post-it notes, the online apps, and all the photographs practitioners took. I was never overly keen on all the evidence-collating, and I’m pleased that we’ve begun to eschew it in the Key Stages. It’s pleasing, therefore, to see that while the department is happy to keep the (actually quite popular) Early Years Profile, it wants advice on how the burden of assessment can be reduced in the Early Years.

I’m also pleased to see the revival of the idea of a Reception baseline. Much damage was done by the chaotic trial of different systems in 2015, but the principle remains a sensible one to my mind. I would much rather see schools judged on progress across the whole of the primary phase. It’s also quite right that baseline data shouldn’t be routinely published at school or individual level. The consultation seems open to good advice on how best to manage its introduction (an approach which might have led to greater success with the first attempt!).

Key Stage 1

I wasn’t certain that we’d ever persuade the DfE to let go of a statutory assessment, but it seems that they’re open to the idea. I do think that the KS1 tests – and the teacher assessment that goes along with them – are a barrier to good progress through the primary years, and I’d welcome their abandonment. The availability of non-statutory tests seems a sensible approach, and I’m happy to see that the department will consider sampling as a way to gather useful information at a national level. Perhaps we might see that rolled out more widely in the long term.

I’d rather have seen them take the completely radical option of scrapping the statutory tests straight away, but I can see the rationale for keeping them until the baseline is in place. Unfortunately that means we’re stuck with the unreliable Teacher Assessment approach for the time being. (More on that to follow.)

Key Stage 2

Of course it makes sense to scrap statutory Teacher Assessment of Reading and Maths. Nobody pays it any heed; it serves no purpose but adds to workload. I’d have preferred to see Science go the same way, but no such luck. At the very least, I hope there is a radical overhaul of the detail in the Science statements which are currently unmanageable (and hence clearly lead to junk data in the extreme!)

There is also some recognition in there that the current system of Teacher Assessment of Writing is failing. The shorter-term solution seems to be a re-writing of the interim frameworks to make them suit a best-fit model, which is, I suppose, an improvement. Longer term, the department is keen to investigate alternative (better) models; I imagine they’ll be looking closely at the trial of Comparative Judgement at www.sharingstandards.com this year. I’m less persuaded by the trial of peer-moderation, as I can’t quite see how you could ensure that a fair selection of examples is moderated. My experience of most inter-school moderation is that few discussions are had about real borderline cases, as few teachers want to take such risks when working with unfamiliar colleagues. Perhaps this trial will persuade me otherwise?

On the matter of the multiplication check, I don’t share the opposition to it that many others do. I’ve no objection to a sensible, low-stakes, no-accountability check being made available to support schools. I’d prefer to see it at the end of Year 4 – in line with the National Curriculum expectations – and I’d want to see more details of the trials, but overall, I can live with it.

Disappointments

Although it hardly gets mentioned, the opening statement that “it is right that the government sets a clear expected standard that pupils should attain by the end of primary school” suggests that the department is not willing to see the end of clunky descriptors like “Expected Standard”. That’s a shame, as the new scaled score system serves that purpose perfectly well without the labelling. Hopefully future alternatives to the current Teacher Assessment frameworks might lessen the impact of such terminology.

Credit to whoever managed to get in the important fact that infant/junior and middle schools still exist. (Points deducted for failing to acknowledge first schools in the mix.) However, the suggestions proposed are misguided. The consultation claims that:

the most logical measures for infant schools would be reception to key stage 1 and, for middle and junior schools, would be to continue with key stage 1 to key stage 2

While that may be true for infant, and potentially even junior schools, for middle schools this is a nonsense. Some middle schools only start from Year 6. How can it be sensible to judge their work on just 2 terms of a four-year key stage? The logical measure would require bespoke assessments on entry and exit. That would be expensive, so alternatives will be necessary. Personally I favour using just the Reception baseline and KS2 outcomes, along with sensible internal data for infant/first and junior/middle schools. The KS1 results have never been a helpful or reliable indicator.

Partly connected to that, I would also have liked to have seen a clearer commitment to the provision of a national assessment bank, as proposed by the Commission for Assessment without Levels, and supported by the NAHT review. It does get a brief mention in a footnote, so maybe there’s hope for it yet.

In Conclusion

Overall, I’m pleased with the broad shape of the consultation document. It does feel like a shift has happened within the department, and that there is a clear willingness to listen to the profession and correct earlier mistakes. There is as much positive news in the consultation as I might have hoped for.

If there were an interim assessment framework for judging DfE consultations, then this would have ticked nearly all of the boxes. Unfortunately, of course, nearly all is not enough, as any primary teacher knows, and so it must fall to WTS. Seems cruel, but he who lives by the sword…

Some clarity on KS2 Writing moderation … but not a lot

Not for the first time, the Department has decided to issue some clarification about the writing assessment framework at Key Stage 2 (and its moderation!). For some inexplicable reason, rather than sharing this clarity in writing, it has been produced as a slowly-worded video – as if it were us that were stupid!

Here’s my take on what it says:

Some Clarity – especially on punctuation

  • For Greater Depth, the long-winded bullet point about shifts in formality has to be seen in several pieces of work, with more than one shift within each of those pieces.
  • For Expected Standard, it is acceptable to have evidence of colons and semi-colons for introducing, and within, lists (i.e. not between clauses)
  • For Expected Standard, any one of brackets, dashes or commas is acceptable to show parenthesis. There is no need to show all three.
  • Bullet points are punctuation, but the DfE is pretending they’re not, so there’s no need to have evidence of them as part of the “full range” of punctuation needed for Greater Depth.
  • Three full stops to mark ellipsis are also punctuation, but again, the DfE has managed to redefine ellipsis in such a way that they’re not… so again, not needed for Greater Depth.

A bit of guidance on spelling

This was quite clear: if a teacher indicates that a spelling needs correcting by writing a comment in the margin on the relevant line, then the correction of that spelling cannot be counted as independent. If the comment to correct spellings comes at the end of a paragraph or whole piece, without specifying what to correct, then it can still count as independent.

No clarity whatsoever on ‘independence’

Believe me, I’ve re-watched this several times – and not all of them at double-speed – and I’m still bemused that they think this clarifies things. The whole debacle is still reliant on phrases like “over-scaffolding” and “over-detailed”. Of course, if things are over-detailed then there is too much detail. What isn’t any clearer is how much detail is too much detail. The video tells us that:

“success criteria would be considered over-detailed where the advice given directly shapes what pupils write by directing them to include specific words or phrases”

So we know specifying particular words is too much, but is it okay to use success criteria which include:

  • Use a varied range of sentence structures

Is it too specific to include this?

  • Use a varied range of sentence openers

What about…?

  • Use adverbs as sentence openers

There’s a wide gulf between the three examples above. Which of these is acceptable? Because if it’s the latter, then schools relying on the first will find themselves under-valuing work – and vice versa, of course. That’s before you even begin to consider the impossibility of telling what success criteria and other supporting examples are available in classrooms at the time of writing.

The video tries to help by adding:

“success criteria must not specifically direct pupils as to what to include or where to include something in their writing”

But all of those examples are telling children what to include – that’s the whole point of success criteria.

If I’ve understood correctly, I think all three of those examples are acceptable. But it shouldn’t matter what I think: if the whole system depends on what each of us thinks the guidance means, then the consistency necessary for fair and useful assessment is non-existent.

The whole issue remains a farce. Doubtless this year Writing results will rise, probably pushing them even higher above the results for the externally tested subjects. Doubtless results will vary widely across the country, with little or no relationship to success in the tested subjects. And doubtless moderation will be a haphazard affair with professionals doing their best to work within an incomprehensible framework.

And to think that people will lose their jobs over data that results from this nonsense!


The full video in all its 11-minute glory can be found at: https://www.youtube.com/watch?v=BQ-73l71hqQ

 

National Curriculum Test videos

I’ve updated the videos I made last year to explain the KS1 and KS2 tests to parents. As the Grammar, Punctuation & Spelling tests are optional at KS1, there are now two versions of the video for KS1 (one with, one without the GPS tests).

Please feel free to use these videos on your school’s website or social media channels, or in parent meetings, etc. There are MP4 versions available to download.

Key Stage 2


Facebook shareable version:
https://www.facebook.com/primarycurriculum/videos/1311921482187352/

Downloadable MP4 file: https://goo.gl/b0Lo9v

Key Stage 1 – version that includes the GPS tests


Facebook shareable version:
https://www.facebook.com/primarycurriculum/videos/1311921482187352/

Downloadable MP4 file: https://goo.gl/jo18qk

Key Stage 1 – version for schools not using the GPS tests


Facebook shareable version:
https://www.facebook.com/primarycurriculum/videos/1311921482187352/

Downloadable MP4 file:  https://goo.gl/xMDFSJ

The impossibility of Teacher Assessment

I’ve said for a fair while now that I’d like to see the end of statutory Teacher Assessment. It’s becoming a less unpopular thing to say, but I still don’t think it’s quite reached the point of popularity yet. But let me try, once again, to persuade you.

The current focus of my ire is the KS2 Writing assessment, partly because it’s the one I am most directly involved in (doing as a teacher, not designing the monstrosity!), and partly because it is the one with the highest stakes. But the issues are the same at KS1.

Firstly, let me be frank about this year’s KS2 Writing results: they’re nonsense! Almost to a man we all agreed last year that the expectations were too high; that the threshold was something closer to a Level 5 than a 4b; that the requirements for excessive grammatical features would lead to a negative impact on the quality of writing. And then somehow we ended up with 74% of children at the expected standard, more than in any other subject. It’s poppycock.

Some of that will be a result of intensive drilling, which won’t have improved writing that much. Some of it will be a result of a poor understanding of the frameworks, or accidental misuse of them. Some of it will be because of cheating. The real worry is that we hardly know which is which. And guidance released this year which is meant to make things clearer barely helps.

I carried out a poll over the last week asking people to consider various sets of success criteria and to decide whether they would be permitted under the new rules, which state that:

[Image: extract from the STA guidance on what counts as ‘independent’ work]

So we need to decide what constitutes “over-aiding” pupils. At either end of the scale, that seems quite simple. Just short of 90% of responses (of 824) said that the following broad guidance would be fine:

[Image: ‘Simplest criteria’ – the broadest example from the poll]

Similarly, at the other extreme, 92% felt that the following ‘slow-writing’ type model would not fit within the definition of ‘independent’:

[Image: ‘Slow writing approach’ – the most prescriptive example from the poll]

This is all very well, but in reality, few of us would use such criteria for assessed work. The grey area in the middle is where it becomes problematic. Take the following example:

[Image: ‘The disputed middle ground’ example from the poll]

In this case results are a long way from agreement: 45% of responses said that it would be acceptable, 55% not. If half of schools eschew this level of detail and it is actually permitted, then their outcomes are bound to suffer. By contrast, if nearly half use it but it ought not to be allowed, then perhaps their results will be inflated. Of course, a quarter of those schools may be moderated, which could lead to even those schools with over-generous interpretations of the rules suffering. There is no consistency here at all.

The STA will do their best to temper these issues, but I really think they are insurmountable. At last week’s Rising Stars conference on the tests, John McRoberts of the STA was quoted as explaining where the line should be drawn.

That advice does appear to clarify things (such that it seems the 45% were probably right in the example above), but it is far from solving the problem. For the guidance is full of such vague statements. It’s clear that I ought not to be telling children to use the word “anxiously”, but is it okay to tell them to open with an adverb while also having a display on the wall listing appropriate adverbs – including anxiously? After all, the guidance does say that:

[Image: extract from the STA guidance]

Would that count as independent? What if my classroom display contained useful phrases for opening sentences for the particular genre we were writing? Would that still be independent?

The same problems apply in many contexts. For spelling, children are meant to be able to spell words from the Y5/6 list. Is it still okay if they have the list permanently printed on their desks? If they’re trained to use the words in every piece?

What about peer-editing, which is also permitted? Is it okay if I send my brightest speller around the room to edit children’s work with them? Is that ‘independent’?

For an assessment to be a fair comparison of pupils across the country, the conditions under which work is produced must be as close to identical as possible, yet this is clearly impossible in this case.

Moderation isn’t a solution

The temptation is to say that Teacher Assessment can be robust if combined with moderation. But again, the flaws are too obvious. For a start, the cost of moderating all schools is likely to be prohibitive. But even if it were possible, it’s clear that a moderator cannot tell everything about how a piece of work was produced. Of course moderators will be able to see if all pupils use the same structure or sentence openers. But they won’t know what was on my classroom displays while the children were writing the work. They won’t know how much time was spent on peer-editing work before it made the final book version. They won’t be able to see whether or not teachers have pointed out the need for corrections, or whether each child had been given their own key phrases to learn by heart. Moderation is only any good at comparing judgements of the work in front of you, not of the conditions in which it was produced.

That’s not to imply that cheating is widespread. Far from it: I’ve already demonstrated that a good proportion of people will, in good faith, be wrong in their interpretations of the guidance. It is almost impossible for the system to be any other way.

The stakes are too high now. Too much rests on those few precious numbers. And while in an ideal world that wouldn’t be the case, we cannot expect teachers to provide accurate, meaningful and fair comparisons, while also judging them and their schools on the numbers they produce in the process.

Surely it’s madness to think otherwise?


For the results of all eight samples of success criteria, see this document.

 

A consistent inconsistency

With thanks to my headteacher for inadvertently providing the blog title.

With Justine Greening’s announcement yesterday we discovered that the DfE has definitely understood that all is not rosy in the primary assessment garden. And yet, we find ourselves looking at two more years of the broken system before anything changes. My Twitter timeline today has been filled with people outraged at the fact that the “big announcement” turned out to be “no change”.

I understand the rage entirely. And I certainly don’t think I’ve been shy about criticising the department’s chaotic organisation of the tests and the errors made. But I’m also not ready to throw my toys out of the pram just yet. This might just be the first evidence that the department is really listening. Yes, perhaps too little too late. Yes, it would have been nice for it to have been accompanied by an acknowledgement that the problems were caused by the pace of change enforced by ministers. But maybe they’re learning that lesson?

For a start, there are many teachers nationally who are just glad of the consistency. As my headteacher said earlier today, it leaves us with a consistent inconsistency. But nevertheless, there will be many teachers who are relieved to see that the system is going to be familiar for the next couple of years.

It’s a desire I can understand, but just can’t go along with. There are too many problems with the current system – mostly those surrounding the Teacher Assessment frameworks and moderation. But I will hang fire, because there is the prospect of change on the horizon.

It’s tempting to see it as meaningless consultation, but until we see the detail I don’t want to rule anything out. I hope that the department is listening to advice, and is open to recommendations – including those which the NAHT Assessment Reform Group of which I am a member is drawing together over this term.

If the DfE listens to the profession, and in the spring consults on a meaningful reform that brings about sensible assessment and accountability processes, then we may eventually come to see yesterday’s announcement as the least bad of the available options.

Of course, if they mess it up again, I’ll be on their case.

The potential of Comparative Judgement in primary

I have made no secret of my loathing of the Interim Assessment Frameworks, and the chaos surrounding primary assessment of late. I’ve also been quite open about a far less popular viewpoint: that we should give up on statutory Teacher Assessment. The chaos of the 2016 moderation process and outcomes was an extreme case, but it’s quite clear that the system cannot work.

It’s crazy that schools can be responsible for deciding the scores on which they will be judged. It has horrible effects on the reliability of that data, and also creates pressure which has an impact on the integrity of teachers’ and leaders’ decisions. What’s more, as much as we would like our judgements to be considered accurate, the evidence points to a sad truth: humans (including teachers) are fallible. As a result, Teacher Assessment judgements are biased – before we even take into account the pressures of needing the right results for the school. Tests tend to be more objective.

However, it’s also fair to say that tests have their limitations. I happen to think that the model of Reading and Maths tests is not unreasonable. True, there were problems with this year’s, but the basic principle seems sound to me, so long as we remember that the statutory tests are about the accountability cycle, not about formative information. But even here there is a gap: the old Writing test was scrapped because of its failings.

That’s where Comparative Judgement has a potential role to play. But there is some work to be done in the profession for it to find its right place. Firstly we have to be clear about a couple of things:

  1. Statutory Assessment at the end of Key Stages is – and indeed should be – separate from the rest of assessment that happens in the classroom.
  2. What we do to judge work, and how we report that to pupils and parents are – and should be – separate things.

Comparative Judgement is based on the broad idea of comparing lots of pieces of work until you have essentially sorted them into a rank order. That doesn’t mean that individuals’ ranks need be reported, any more than we routinely report raw scores to pupils and parents. It does, though, offer the potential of moving away from the hideous tick-box approach of the Interim Frameworks.

Teachers are understandably concerned by the idea of ranking, but it’s really not that different from how we previously judged writing. Most experienced Y2/Y6 teachers didn’t spend hours poring over the level descriptors, but rather used their knowledge of what they considered L2/L4 to look like, and judged whether they were looking at work that was better or worse. Comparative Judgement simply formalises this process.

It tackles an issue that is particularly prevalent under the current interim arrangements: excellent writing which scores poorly because of a lack of dashes or hyphens (and poor writing which scores highly because it’s littered with them!). If we really want good writing to be judged “in the round”, then we cannot rely on simplistic and narrow criteria. Rather, we have to look at work more holistically – and Comparative Judgement can achieve that.

Rather than teachers spending hours poring over tick-lists and building portfolios of evidence, we would simply submit a number of pieces of work towards the end of Year 6 and they would be compared to others nationally. If the DfE really wanted to, once the pieces had been ranked in order, it could apply scaled scores to the general pattern, so that pupils received a scaled score for their writing just like in the tests. The difference would be that instead of collecting a few marks for punctuation, and a few for modal verbs, the whole score would be based on the overall effect of the piece of writing. Equally, the rankings could be turned into “bands” that matched pupils who were “Working Towards” or “Working at Greater Depth”. Frankly, we could choose quite what was reported to pupils and parents; the key point is that we would be more fairly comparing pupils based on how good they were at writing, rather than how good they were at ticking off features from a list.
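For those curious about the mechanics, here’s a minimal sketch of how a set of pairwise judgements can be turned into the kind of rank order described above, using a simple Bradley-Terry model of the sort that typically underpins Comparative Judgement tools. The function, the script names and the sample judgements are all invented for illustration – this isn’t the sharingstandards.com implementation.

```python
# A minimal, illustrative Bradley-Terry fit: given pairwise judgements
# ("script A was better than script B"), estimate a strength for each
# script; sorting by strength gives a rank order of the pieces of work.
from collections import defaultdict

def bradley_terry(judgements, iterations=50):
    """judgements: list of (winner, loser) script IDs from paired comparisons.
    Returns a dict of script -> estimated strength (higher = judged better)."""
    scripts = {s for pair in judgements for s in pair}
    wins = defaultdict(int)      # comparisons each script has won
    meetings = defaultdict(int)  # comparisons between each unordered pair
    for winner, loser in judgements:
        wins[winner] += 1
        meetings[frozenset((winner, loser))] += 1

    strength = dict.fromkeys(scripts, 1.0)
    for _ in range(iterations):  # Hunter's MM update for the Bradley-Terry MLE
        new = {}
        for i in scripts:
            denom = sum(
                meetings[frozenset((i, j))] / (strength[i] + strength[j])
                for j in scripts if j != i
            )
            new[i] = wins[i] / denom if denom else strength[i]
        total = sum(new.values())
        # renormalise so the strengths stay on a stable scale
        strength = {s: v * len(scripts) / total for s, v in new.items()}
    return strength

# Example: five scripts (A-E) and a handful of teacher judgements
judgements = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "A"),
              ("D", "B"), ("C", "E"), ("B", "E"), ("A", "E")]
strengths = bradley_terry(judgements)
ranking = sorted(strengths, key=strengths.get, reverse=True)
print(ranking)  # best to worst: the unbeaten script D comes out on top
```

The appeal of this approach is that each individual judgement is a quick, holistic “which is better?” decision, yet the aggregated result is a full rank order to which scaled scores or bands could then be attached.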

There are still issues to be resolved, such as exactly what pieces of writing schools would submit for judgement, and the tricky issue of quite how independent the work should be. Equally, the system doesn’t lend itself as easily to teachers being able to use the information formatively – but then, aren’t we always saying that we don’t want teachers to teach to the tests?

Certainly if we want children’s writing to be judged based on its broad effectiveness, and for our schools to be compared fairly for how well we have developed good writers, then it strikes me that it’s a lot better than what we have at the moment.


Dr Chris Wheadon and his team are carrying out a pilot project to look at how effective moderation could be in Year 6. Schools can find out more, and sign up to join the pilot (at a cost) at: https://www.sharingstandards.com/