The effects and impact of AfL

Since Black and Wiliam published ‘Inside the Black Box’ in 1998, AfL strategies have dominated a great deal of professional development time, had a significant influence on national education policy and become an unquestioned feature of ‘good practice’.

However, to what extent does empirical evidence support the effectiveness of AfL strategies as implemented in schools? It turns out that this is a complex question to answer – but a broad answer might be ‘less than you think’.

One problem is that even teaching strategies rated as very effective are highly sensitive to context. For example, in Marzano’s review of effective teaching strategies he says:

“Feedback was one of the highest ranked instructional strategies. … But, an examination of the research on feedback indicates that even this ‘very high-yield’ strategy doesn’t always work to enhance student achievement. For example, when Kluger and DeNisi (1996) examined the results of 607 experimental/control studies on the effect of feedback, they found that over 30% of them showed a negative effect on student achievement.”

Where positive evaluations of AfL appear in the literature, there are serious questions about the quality of the evidence on which they are based. For example, the authors of a new review of the effects and impact of AfL suggest there is a great deal of indirect evidence supporting it, such as:

“There is consistent evidence that participants in AfL initiatives attribute an increase in students’ achievement to the implementation of AfL or of one of the strategies related to it.”

Source: CfBT (2013) Assessment for learning: effects and impact

However, many of the conclusions about the efficacy of AfL appear to be based on rather subjective measures of outcome. Expectation effects and confirmation bias could easily distort the perceived effectiveness of AfL strategies – especially when judged by the teachers implementing them! Additionally, there are usually so many initiatives going on in a school at any one time that perceived positive effects may be due to a completely different innovation!

So – are there any well-focused studies which show clear evidence of the effectiveness of AfL using quantitative measures? From the same review:

“There is only one quantitative study that has been conducted which was clearly and completely centred on studying the effect of AfL on student outcomes. This produced a significant, but modest, mean effect size of 0.32 in favour of AfL as being responsible for improving students’ results in externally mandated examinations.”

OK, not a huge effect size (Hattie suggests using an effect size of 0.4 as a benchmark of high-value effectiveness, and 0.32 is about the same as the 0.29 for homework), but a positive effect in a well-focused quantitative study. However, they go on to say …

“It must be mentioned, however, that this study has some methodological problems, explicitly recognised by their authors. These are related to the diversity of control groups they considered and the variety of tests included for measuring students’ achievement. All this affects the robustness of comparisons within the study.”

So, the answer appears to be that there is no genuinely robust evidence to support the effectiveness of AfL strategies.
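For anyone unfamiliar with effect sizes, here is a minimal sketch of how a standardised mean difference (Cohen's d) is typically calculated. I'm assuming that is the kind of effect size being quoted; the review extract above doesn't say. The function name and the exam scores are invented purely for illustration and have nothing to do with the study's data.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardised mean difference using the pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    # The pooled variance weights each group's sample variance
    # by its degrees of freedom (n - 1).
    pooled_var = (
        (n_t - 1) * stdev(treatment) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical exam scores, invented purely for illustration.
afl_class = [52, 58, 61, 64, 67, 70, 73, 79]
comparison_class = [50, 55, 59, 62, 65, 68, 71, 76]

d = cohens_d(afl_class, comparison_class)
print(round(d, 2))
```

The point of the sketch is that the result depends on the spread of scores as much as on the gap between the group means, which is one reason why mixing different tests and control groups makes comparisons less robust.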

One reason for the mixed results on the effectiveness of AfL might be the way that it has been implemented. There are reasons to believe that the national strategy and many of the methods taught to teachers were not in line with the original strategy (e.g. it was poorly implemented and has become gimmicky and highly bureaucratised).

Another possibility might be that AfL is implemented in a way that makes teacher compliance easy for SLT to measure, but actually undermines the effectiveness of the strategy. One example is requiring teachers to report ‘progress’ through (non-existent) sub-levels and to generate targets by which a student will reach the next (non-existent) sub-level.

Lastly, there is a genuine question as to whether the ‘core elements’ of AfL strategies as originally presented by Wiliam and Black – placing the student at the centre of learning through self-assessment and independent learning – may contradict what cognitive scientists are saying about the way children learn. Some of the psychological evidence emerging from Europe and the US appears to directly contradict some of the claims and aims of AfL.

So where next? Well, perhaps as a profession, we should stop, reflect and start questioning received wisdom – a good example is set by the Learning Spy in these recent blog articles:

Getting feedback right
Questions about questioning

AfL (in some form or another) may well have utility in education and no one wants to throw out the baby with the bath water. However, we should be sceptical of claims about the unbridled benefits of AfL until we can answer some of Marzano’s questions:

“Are some instructional strategies more effective in certain subject areas? Are some instructional strategies more effective at certain grade levels? Are some instructional strategies more effective with students from different backgrounds? Are some instructional strategies more effective with students of different aptitudes?” … “Until we find the answers to the preceding questions, teachers should rely on their knowledge of their students, their subject matter, and their situations to identify the most appropriate instructional strategies”


9 Responses to The effects and impact of AfL

  1. David Didau says:

    Reblogged this on The Echo Chamber


  2. maxinegoodes says:

    I definitely agree that many teaching strategies are sensitive to the context in which they are used and are very difficult to measure for effectiveness in isolation. There are so many variable factors within a school setting that it is very difficult to measure one particular strategy and say conclusively that it is the determining factor in accelerating progress in learning. This is not to say that AfL could not have a positive effect. I agree with the point made about how the National Strategy has deviated from the original AfL intention, and this could have reduced its effectiveness in terms of student progress. Student progress is not necessarily the result of a one-size-fits-all strategy but of a multitude of factors sensitive to context. I don’t think it is very easy to quantify exactly how a student makes progress, and therefore to replicate studies and generalise so that all teachers follow a formulaic path and all students progress in the same way. Human beings are not robots! I do not dismiss the fact that there are certain features that make a great lesson, and formative marking is one of these; however, that does not necessarily mean that every student in every class will respond in the same way!


  5. Dylan Wiliam says:

    At the risk of repeating myself, 0.32 is not a modest effect size. Describing an effect size of 0.3 as “modest” is based on the work of Jacob Cohen, who, rather stupidly in my view, thought that one could evaluate the significance of an effect size independently of the context (Russell Lenth calls such an approach “tee-shirt” effect sizes: small, medium and large). For secondary school students, one year’s growth is around 0.3 to 0.4 standard deviations, so an effect size of 0.32 would represent an 80% increase in the rate of learning. Hattie’s adoption of 0.4 as a “baseline” for effect sizes is presumably based on the same evidence, but if the effect size is in addition to the annual growth (as it was in the study being cited) then it is a genuine increase.

    And as far as I am aware, nothing in what I have said about formative assessment is incompatible with the latest cognitive psychology research on the benefits of direct instruction, spaced learning, frequent testing. If anyone thinks that there is any research that contradicts the basic premise of formative assessment—that learning is better when teachers, peers and students use evidence about what has been learned to improve future learning—I would be very interested to see it.


    • Thanks for commenting! I won’t argue about the relative size or importance of the effect size associated with AfL in your 2004 study – I’ve become much more sceptical of mean effect sizes as evidence for the effectiveness of educational strategies since I wrote this (I thoroughly enjoyed your talk on this at researchED2014 by the way!). It was more the apparent lack of robust evidence for the effectiveness of such a mainstream educational strategy that surprised me.

      As to the elements of formative assessment – as described by yourself and Paul Black – which might be considered somewhat ‘at odds’ with some of the recent revival of interest in cognitive psychology, I’ll respond with a quote from Inside the Black Box:

      “Underlying such problems will be two basic issues. The one is the nature of each teacher’s beliefs about learning. If the assumption is that knowledge is to be transmitted and learnt, that understanding will develop later, and that clarity of exposition accompanied by rewards for patient reception are the essentials of good teaching, then formative assessment is hardly necessary. If however, teachers accept the wealth of evidence that this transmission model does not work, even by its own criteria, then the commitment must be to teaching through interaction to develop each pupil’s power to incorporate new facts and ideas into his or her understanding. Then formative assessment is an essential component …” p9

      Perhaps it was not the intention, but AfL has certainly been used within education as an argument against more traditional forms of instruction. The CfBT review of AfL lists among the main goals of formative assessment …
      “• changing traditional classroom practices” p8

      However, I agree that despite the apparent emphasis given to student-centred learning in many descriptions of formative assessment, effective formative strategies don’t *have* to contradict more traditional, teacher-led forms of teaching. That is why I argued for ‘not throwing out the baby when changing the bath water’, and still do: there is still a great deal of merit in the concept of formative assessment!

