Category Archives: primary

A few words on the 65% floor standard

There’s been much discussion about this in the last few days, so I thought I’d summarise a few thoughts.

Firstly, many people seem to think that the government will be forced to review the use of a 65% floor standard in light of the fact that only 53% of pupils nationally met the combined requirements. In fact, I’d argue the opposite: the fact that so few schools exceed the attainment element of the floor standard is no bad thing. Indeed, I’d prefer it if no such attainment element existed.

There will be schools for which reaching 65% combined Reading, Writing & Maths attainment will not have required an inordinate amount of work – and won’t necessarily represent great progress. Why should those schools escape further scrutiny just because they had well-prepared intakes? Of course, there will be others who have met the standard through outstanding teaching and learning… but they will have great progress measures too. The 65% threshold is inherently unfair on those schools working with the most challenging intakes and serves no good purpose.

That’s why I welcomed the new progress measures. Yes it’s technical, and yes it’s annoying that we won’t have it for another couple of months, but it is a fairer representation of how well a school has achieved in educating its pupils – regardless of their prior attainment.

That said, there will be schools fretting about their low combined Reading, Writing & Maths scores. I carried out a survey immediately after results were released, and so far 548 schools have responded, sharing their combined RWM scores. From that (entirely unscientific, self-selecting) group, just 28% of schools had reached the 65% attainment threshold. And the spread of results is quite broad – including schools at both 0% and 100%.

The graph below shows the spread of results, with each colour band containing one fifth of the schools in the survey. Half of schools fell between 44% and 66%.

[Graph: spread of combined attainment results]
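For anyone wanting to reproduce this kind of summary from their own list of school results, here is a minimal sketch – using made-up percentages rather than the survey data – of how the quintile bands and the middle-half range could be computed:

```python
# A minimal sketch, with illustrative data rather than the survey's,
# of computing the colour-band boundaries and the middle-half range.
import numpy as np

# Hypothetical combined RWM percentages for a handful of schools
rwm = np.array([0, 38, 44, 52, 57, 61, 66, 70, 83, 100])

# Each colour in the graph covers one fifth of schools, so the band
# boundaries are the 20th, 40th, 60th and 80th percentiles.
quintiles = np.percentile(rwm, [20, 40, 60, 80])
print("Quintile boundaries:", quintiles)

# "Half of schools fell between 44% and 66%" is the middle-half
# (25th to 75th percentile) range.
q1, q3 = np.percentile(rwm, [25, 75])
print(f"Middle half of schools: {q1:.0f}% to {q3:.0f}%")
```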

As I said on the day the results were published – for a huge number of schools, the progress measure will become all important this year. And for that, we just have to wait.

Edit:

Since posting, a few people have quite rightly raised the issue of junior/middle schools, who have far less control over the KS1 judgements (and indeed in middle schools, don’t even have control over the whole Key Stage). There are significant issues here about the comparability of KS1 data between infant/first schools and through primary schools (although not necessarily with the obvious conclusions). I do think that it’s a real problem that needs addressing: but I don’t think that the attainment floor standard does anything to address it, so it’s a separate – albeit important – issue.

Am I overstretching it…?

What are people’s thoughts?

Everyone wants to know about progress measures, but we won’t have the national data until September. We can’t work it out in advance… but is it worth trying to estimate?

I collected data on Tuesday night about the SATs results, and my sample was within 1 percentage point of the final national figures, which wasn’t bad. However, this would be a much more significant project.

To get anything close to an estimate of national progress measures, we would need a substantial number of schools to share their school’s data at pupil level. It would mean schools sharing their KS1 and scaled score results for every pupil – anonymised of course, but detailed school data all the same.
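For context, the calculation behind the new measure works broadly like this: pupils are grouped by their KS1 prior attainment, each pupil’s KS2 scaled score is compared with the average score for pupils with the same starting point, and a school’s progress score is the mean of those differences. The sketch below illustrates the principle only – the data is hypothetical and the grouping much simplified, whereas the real DfE calculation builds finer prior-attainment groups from KS1 point scores:

```python
# A much-simplified sketch of the progress-measure principle.
# Hypothetical data and coarse groups; the real DfE method derives
# finer prior-attainment groups from KS1 point scores.
from collections import defaultdict
from statistics import mean

# Pupil-level data pooled from contributing schools:
# (KS1 prior-attainment group, KS2 scaled score)
sample = [
    ("low", 94), ("low", 98), ("low", 96),
    ("middle", 100), ("middle", 104), ("middle", 102),
    ("high", 107), ("high", 111), ("high", 109),
]

# Step 1: estimate the average KS2 score for each prior-attainment group
groups = defaultdict(list)
for pag, score in sample:
    groups[pag].append(score)
estimates = {pag: mean(scores) for pag, scores in groups.items()}

# Step 2: score one school's pupils against those estimates
school_pupils = [("low", 97), ("middle", 103), ("high", 108)]
progress = [score - estimates[pag] for pag, score in school_pupils]

# Step 3: the school's progress score is the mean of those differences
print(f"School progress score: {mean(progress):+.2f}")
```

The catch, of course, is Step 1: without the national data, the group averages can only be estimated from whatever sample of schools contributes – hence the need for pupil-level data.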

My thinking at this stage is that I’d initially only share any findings with the schools that were able to contribute. It would be a small sample, but it might give us a very rough idea. Very rough.

Would it be useful… and do people think they would be able to contribute?

Consistency in Teacher Assessment?

I posted a survey with 10 hypothetical – but not uncommon – situations in which writing might take place in a classroom, and asked teachers to say whether or not they are permitted under the current guidance on “independence” when it comes to statutory assessment. It seems that, mostly, we can’t agree:

[Image: survey results after 1,000 replies]


Collecting KS2 data on Teacher Assessment

Having had over 100 schools respond to my plea to share data from KS1 Scaled Score tests, the next big issue on the horizon is the submission of Teacher Assessment data at the end of June.

In the hope of providing some sort of indication of a wider picture, I am now asking schools with Year 6 cohorts to share their data for Teacher Assessment this year, as well as comparison data for 2015. As with all the previous data collections, it won’t be conclusive, or even slightly reliable… but it will be something other than the vacuum that currently exists.

So, if you have a Year 6 cohort, please do share your Teacher Assessment judgements via the survey below:


Some initial thoughts on KS1 data


I started collecting data from test scores and teacher assessment judgements earlier this week. So far, around 80 schools (3,500+ pupils) have shared their KS1 test scaled scores, and nearly 60 (around 2,500 pupils) have shared their Teacher Assessment judgements (in the main three bands). So, what does it show?

Scaled Score Data

Despite – or perhaps because of – the concerns about the difficulty of the reading tests, it is Reading which has the highest “pass rate”, with 65.5% of pupils achieving 100 or greater. (Similarly, the median rate for schools was just over 65%.)

Maths was not far behind, with 64.2% of pupils achieving 100 or greater, although the median rate for schools was slightly higher, again at around 65%. The results for GPS were lower (at around 57%), but these were based on a far smaller sample of schools, as many did not use the tests.

The spread of results can be seen, approximately, from the proportion of schools falling within each band in the table below.

[Table: proportion of schools within each scaled-score band]

For example, just 2% of schools had more than 90% of children achieving a scaled score of 100 in Reading, while 43% of schools had between 60% and 69% of children scoring 100+.

Notably, the range in Maths results is slightly broader than in Reading.
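If you want to see how your own school sits against this kind of banding, a minimal sketch of building the band table – with illustrative pass rates, not the collected sample – might look like this:

```python
# A minimal sketch of building the band table: the proportion of
# schools whose "pass rate" falls in each 10-point band.
# Illustrative data, not the collected sample.
import numpy as np

# Hypothetical school-level percentages of pupils scoring 100+
school_rates = np.array([48, 55, 62, 64, 66, 68, 71, 78, 83, 92])

bands = np.arange(0, 101, 10)        # 0-9, 10-19, ..., 90-100
counts, _ = np.histogram(school_rates, bins=bands)
for lo, n in zip(bands[:-1], counts):
    pct = 100 * n / len(school_rates)
    print(f"{lo:2d}-{lo + 9}%: {pct:.0f}% of schools")  # top band also catches 100%
```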

Teacher Assessment Judgements

The order of success in the subjects remains the same in the Teacher Assessment judgements, with Reading having the highest proportion of pupils reaching the expected standard or greater, closely followed by Maths – and Writing trailing some way behind. However, the most striking difference (surprising or not) is that the proportions are all approximately 10 percentage points higher than in the scaled score data.

According to teachers’ own assessment judgements, some 74% of pupils are reaching the expected standard or above in Reading, 73% in Maths, and around 68% in Writing.

Similarly, the spread of teacher assessment judgements shows more schools achieving higher proportions of children at the expected level – and includes one or two small schools achieving 100% at expected or above.

[Table: spread of Teacher Assessment judgements]

There are notable shifts at the bottom end. For example, 16% of schools had fewer than half of their children achieve 100+ in Maths, whereas only 4% of schools had fewer than half of their children reaching the expected standard in Teacher Assessment.

It’s important to note that the data is not from the same schools, so any such comparisons are very unlikely to be accurate, but it does raise some interesting questions.

Greater Depth

Have I said we’re dealing with a small sample, etc? Just checking.

But within that small sample, the proportions of pupils being judged as “Working at Greater Depth within the Expected Standard” are:

Reading: 17%
Maths: 16%
Writing: 11%

More Data

Obviously there are many flaws with collecting data in this way, but it is of some interest while we await the national data. If you have a Year 2 cohort, please do consider sharing your data anonymously via the two forms below:

Collect Key Stage 1 data

By popular request, I am collecting data about both test scores and teacher assessment judgements for Key Stage 1. The intention is to provide colleagues with some very approximate indicative information about the spread of results in other schools.

As with previous exercises like this, it is important to warn that there is no real validity to this data. It isn’t a random sample of schools, it won’t be representative, it is easily corrupted, mistakes will often slip through… etc., etc.
But in the absence of anything else, please do share your data.

I am collecting data in two forms. Firstly, test score data using scaled scores. These can be entered into the spreadsheet as with previous sample test score data. Please enter only scaled scores for your children. The spreadsheet can be accessed (without a password, etc.) at this link:

Share Key Stage 1 Test Scaled Score Data

I am also collecting schools’ data on Teacher Assessment judgements. To simplify this, I am collecting only percentages of children working at each of the three main bands in Key Stage 1. I am not collecting P-scales, pre-Key Stage or other data. For this, I have put together a Google Form which can be completed here:

Share Key Stage 1 Teacher Assessment Data

Please do read the instructions carefully on each form (you’d be amazed at how many foolish errors have been submitted previously through not doing so!).

Our Ofsted experience

I’m reliably assured that mentioning Ofsted is bound to get a spike in visits to one’s blog page, so let’s see.

About a month ago, we were thrilled to receive that lunchtime phone call that meant the wait was finally over. As any school with a ‘Requires Improvement’ label (or worse) will know, although perhaps never quite ‘welcome’, there comes a point where the Ofsted call is desired, if only to end the waiting. We wanted to get rid of the label, and so this was our chance.

We’d been “due” for a few months, but knew that it could be as late as the summer, so in the end, the second week after Easter didn’t seem so bad (particularly as it left us with a long weekend in the aftermath).

So how did it go? Well, for those of you interested in grades, I am now the deputy headteacher of an officially GOOD school. It’s funny how that matters. Six weeks ago, I was just deputy of an unofficially good one.

But those of you still awaiting the call will be more interested in the process than the outcome, so let me start by saying that having spent the past 18 months building up my collection of “But Sean Harford says…” comments, I didn’t have to call upon it once. The team who visited us were exemplary in their execution of the process according to the new guidance and myth-busting in the handbook.

In the conversation on the day of the phone call, we covered practicalities, and provided some additional details to the lead inspector: timetables, a copy of our latest SEF (4 pages of brief notes – not War and Peace) and the like. And then we set about preparing. We had only just that week been collating teachers’ judgements of children’s current attainment into a new MIS, so it was a good opportunity for us to find out how it worked in practice!

We don’t keep reams of data, we don’t use “points of progress”, and we’ve gone to some lengths to avoid recreating levels. All for good reasons, but always aware that a ‘rogue’ team could find it hard to make snap judgements, and so make bad ones. The data we provided to the team was simple: the proportions of children in each year group whom teachers considered to be “on track” to meet, or exceed, end-of-Key-Stage expectations. We compared some key groups (gender, Pupil Premium, SEN) and that’s it. It could all fit on a piece of A4. So when it came to the inspection itself, there was a risk.

Day One

It may be a cliché to say it, but the inspection was definitely done with rather than to us. The first day included joint observations and feedback with the headteacher, as well as separate observations (we had a 3-person team). An inspector met with the SENCo, and the lead also met with the English and Maths subject leaders (the former of whom happens to be me!) and our EYFS leader.

The main question we were asked as subject leaders was entirely sensible and reasonable: what had we done to improve our subjects in the school? I think we both managed to answer the “why?” and “what impact?” in our responses, so further detail wasn’t sought there, but it was clear that impact was key.

Book Scrutiny

The afternoon of the first day was given over to book scrutiny. We provided books from across the ability range in the core subjects, as well as ‘theme’ books for each team. The scrutiny focused most closely on Years 2, 4 and 6, which fits with both the way we structure our classes and our curriculum and assessment approach. Alongside books, we provided print-outs for some children that showed our judgements on our internal tracking system. I’m not sure whether the focus was set out as clearly as this, but my perception of the scrutiny (with which both my headteacher and I were involved) was that the team were looking at:

  • Was the work of an appropriate standard for the age of the children? (including content, presentation, etc.)
  • Was there marking that was in line with the school’s policy? (one inspector described our marking – positively – as “no frills”, which I quite liked)
  • Was there evidence that children were making progress at an appropriate rate for their starting points?

They asked for the feedback policy in advance, and referred to it briefly, but the focus on marking was mainly on checking that it matched what we said we did, and that where it was used, it helped lead to progress. Some pages in books were unmarked. Some comments were brief. Not all had direct responses – but there was evidence that feedback was supporting progression.

Being involved in the process meant that we could provide context (‘Yes, this piece does look amazing but was quite heavily structured; here’s the independent follow-up’; ‘Yes, there is a heavy focus on number, but that’s how our curriculum is deliberately structured’, etc.). But it also meant a lot of awkward watching and wondering – particularly when one inspector was looking closely at the books from my class!

The meeting at the end of the first day was a reasoned wander through the framework to identify where judgements were heading and what additional information might be needed. We were aware of one lower-attaining cohort, which was duly identified, so we offered some further evidence from their peers to support our judgements. More teaching remained to be seen to complete the evidence needed there. And there was one important question about assessment.

Assessment without levels

I had expected it. Assessment is so much more difficult for inspectors to keep on top of in the new world, and so I fully expected to have to explain things in more detail than in the past. But I was also slightly fearful of how it might be received. I needn’t have been this time. The question was perfectly sensible: our key metric is about children being “on track”, so how do we ensure that those who are not on-track (and not even close) are also making good progress?

That’s a good question; indeed it might even have been remiss not to have asked it! We were happily able to provide examples of books for specific children, along with our assessments recorded in our tracker to show exactly what they were able to do now that they couldn’t do at the end of last academic year. It gave a good opportunity to show how we focus classroom assessment on what children can and can’t do and adapt our teaching accordingly; far more important than the big picture figures.

Day Two

On the second day I observed a teacher alongside the lead inspector, and was again pleased by the experience. Like all lessons, not everything went perfectly to plan, but when I reported my thoughts afterwards, we had a sensible discussion about the intentions of the lesson and what had been achieved, recognising that the deviation from the initial plan was good and proper in the circumstances. There was no sense of inspectors trying to catch anyone out.

Many of the other activities were as you’d expect: conversations with children and listening to readers (neither of which we were involved in, but I presume they acquitted themselves well); a meeting with a group of governors (which I also wasn’t involved in, but they seem to have acquitted themselves well too); a conversation about SMSC and British Values (with a brief tour to look at examples of evidence around the school); watching assembly, etc.

Then, on the afternoon of day two we sat with the inspection team as they went through their deliberation about the final judgements. In some ways it’s both fascinating and torturous to be a witness in the process – but surely better than the alternative of not being!

As with any good outcome, we got the result we felt we were due (and deserved), and areas for feedback that aligned with what was already identified on our development plan for the forthcoming year. The feedback was constructive, formative, and didn’t attempt to solve problems that didn’t exist.

And then we went to the pub!

Year 6 Sample Test data – another update

As people continue to share their data via the spreadsheets, I thought it was about time to do one final update of summarised data based on the most recent shared results.

As we’re so close to the tests now, I have looked only at the data collected from tests taken since February half-term. That keeps it as a good sample size of 3500-4000 pupils in each case, while discounting the much older results. I don’t have enough data yet from Summer 1 alone to make much of it.

Key Stage 2 DfE Sample tests

Subject | Mean score | Median score | Interquartile range
Reading | 30 marks | 31 marks | 24–37 marks
Grammar, Punctuation & Spelling | 36 marks | 37 marks | 28–46 marks
Mathematics | 64 marks | 65 marks | 46–85 marks

The shifts are not that significant in Reading and GPS (up by a mark each for the averages), but the Maths average seems to have risen quite a bit (it was 59 marks just a month ago).
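For reference, the three statistics in the table are straightforward to compute from a list of raw marks; here is a minimal sketch with illustrative values, not the collected data:

```python
# A minimal sketch of the summary statistics in the table above,
# computed from illustrative raw marks rather than the real data.
import numpy as np

maths_marks = np.array([38, 46, 52, 60, 65, 71, 80, 85, 97])

print("Mean:  ", round(float(np.mean(maths_marks))))
print("Median:", np.median(maths_marks))
q1, q3 = np.percentile(maths_marks, [25, 75])
print(f"Interquartile range: {q1:.0f} - {q3:.0f} marks")
```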

If you’ve recently used the tests in your school, please do add your data to the collection by entering into the spreadsheet here: Year 6 Sample Test data collection

You can access a spreadsheet to present your own school’s results in a comparison table and graphs alongside all of the national data collected so far (since January). Access the comparison spreadsheet here: Key Stage 2 Comparisons

The problems with the Interim Assessment Framework

Earlier this evening I posted a poll on Twitter:

The high “yes” response from primary teachers was little surprise to me, despite my having spent a good amount of time over the last couple of years trying to persuade people otherwise.

One thing that sets us apart from secondary colleagues, I think, is the introduction of the Interim Assessment Frameworks for Year 2 and Year 6. Where teachers might otherwise have felt excited about the prospect of moving away from levels, these frameworks have undone any positivity about the shift, as they present all the age-old issues of levels – and add more on top.

I have written briefly about this in the past, but let me set it out in more detail:

Banding

One of the issues with levels was that children of very different capabilities were categorised into a single group: “level 3”, or some such band. As a result, there was a huge focus on pupils near a threshold, a rush to push pupils through the minimal content required to move into the next band, and a wide variety of information lost as it was translated into a single number.
All of these problems are replicated when you look at the new “Working Towards”, “Working At” and “Greater Depth” bands that replace them.

Vagueness

Daisy Christodoulou has written clearly on the problems of adverbs. While there are few adverbs in the new frameworks, we still have statements to judge such as whether pupils read “age-appropriate books”, or use “noun phrases effectively”. Though the exemplification materials try to illustrate some of these points, the subjectivity remains a significant issue.

Independence

One of the most challenging aspects of the new approach is the measure of ‘independence’. Having spent years hearing from secondary colleagues and the department what a nightmare Controlled Assessment was, we now have a virtually identical problem forced upon us at KS2, including the freedom for children to re-write work “after discussion with the teacher”. What nonsense that children should be able to spell “most words correctly (year 5 and 6)”  – whatever that might mean – but that at the same time they are permitted to use dictionaries and word banks in their efforts to do so.

Content

All of these issues are bad enough, but when added to the content it gets ridiculous. Many will be aware of the nonsense of expecting seven-year-olds to write like Enid Blyton characters when using exclamatory sentences. Cue teachers up and down the country recounting Little Red Riding Hood at length.
The situation is replicated at Key Stage 2, where teachers nationwide are now finding ways to force barely competent writers to use hyphens and dashes in their work to provide evidence of meeting the expected standard.
To make matters worse, the Writing criteria at both stages barely mention the need for any purposeful composition or understanding of audience & purpose. They might as well have stuck with the results from the grammar test.
[Note that my criticism is not about the raising of expectations, but the narrowness of them]

Workload

If a teacher did exactly as directed by the frameworks, and checked every child’s work for evidence of each statement, they would find themselves making well over 1,500 judgements per class – a class of 30 children checked against 50-odd statements across the frameworks. It simply isn’t manageable. And while in the past teachers could grasp a sense of the “levelness” of work, now the need to ensure that every 11-year-old has used at least one hyphen in their writing has led to a burdensome approach.

Transition

One of the key aims of any assessment at the end of primary school ought to be to provide useful information to secondary colleagues. Yet the bandings used do little to support this. A pupil who can’t yet explain how fossils provide evidence for evolution would be deemed “not meeting the expected standard”, with little regard given to the wide-ranging knowledge they do have, which secondary teachers might draw upon.

Parents

We were repeatedly told that the old system of levels didn’t help parents to understand their children’s learning and needs. But the new framework makes that problem worse. After all, save for a few avid primary colleagues, who could reasonably put these descriptors in order, let alone know what each entails?

[Image: a jumbled set of band descriptors from the frameworks]

Could you put these in ascending order of ability?

Pace

Perhaps the most significant issue to explain the poll results is the pace at which change has been rushed through. Not just for teachers, but clearly for the department too. Deadlines have repeatedly been pushed back, and exemplification materials coming out just weeks before final judgements are due is unacceptable.
The shift between systems mid-key stage has also exacerbated problems. Changing the expectations for the expected standard when pupils were half-way through the Key Stage has been unhelpful and unfair. It has undoubtedly led to many teachers wishing no change had been made at all.

What should have happened?

Changing a whole assessment system is a big undertaking. To be done well, it needs time and thought. If we were going to be stuck with an interim system, then it would have been better to keep a development of levels in the interim. It would have been perfectly possible to add some pre-requisites to the existing level descriptors as a transition document while the finer detail was worked out. Better still, a roll-out of the new curriculum that saw the new tests introduced only at the end of a full key stage would have been much fairer and much easier to organise.

Now, as it stands, there is more damage to undo, and the poll at the top illustrates it only too well.


One-page markscheme for KS2 GPS test

Just a quick post to share a resource.

As I plough through marking the 49 questions of the KS2 sample Grammar test, I find the constant flicking back and forth in the booklet a nuisance, so I’ve condensed the markscheme into a single-page document.

You’ll still want the full markscheme to hand for those fiddly queries, but it means a quicker race through the majority of easy-to-mark questions. For each question, where there are tickboxes I’ve just indicated which numbered box should be ticked; where words should be circled/underlined, I’ve noted the relevant words. For grid questions, I’ve copied a miniature grid into the markscheme.

Feel free to share: One-page GPS markscheme

Of course, once you’ve marked the tests, please also share your data with me so we can start to build a picture of the national spread of results – see my previous blog.