The myth of a 0.4 SD effect size and a year's worth of progress

In the last week you will probably have been in a staff development session where the presenter – be it a senior leader, school research lead or consultant – will have made some reference to effect size. Indeed, there is a very strong chance – particularly if the speaker is an advocate of the work of Professor John Hattie and Visible Learning – that they will have cited a 0.4 SD effect size as the average expected effect size for one year of progress in school (Hattie, 2015). In other words, over the course of an academic year you should expect your pupils to make at least 0.4 SD of progress. Unfortunately, although there is obvious appeal in having a simple numerical measure to represent a year's worth of progress, it is not that simple, and the figure is potentially highly misleading.

Wiliam (2016) states that, when working out the standardised effect size in an experiment, this is quite simply the difference between the mean of the experimental group and the mean of the control group, divided by the standard deviation of the population. However, as the standard deviation of achievement tends to be greater for older pupils than for younger pupils, all other things being equal you would expect smaller standardised effect sizes from experiments involving older pupils than from experiments involving younger pupils.
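In symbols, Wiliam's definition can be written as follows, with \(d\) for the standardised effect size and \(\sigma\) for the population standard deviation:

\[
d = \frac{\bar{X}_{\text{experimental}} - \bar{X}_{\text{control}}}{\sigma}
\]

Because \(\sigma\) sits in the denominator, the same raw difference between the group means produces a smaller \(d\) wherever the spread of scores is wider – which is precisely the situation with older pupils.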

Wiliam then goes on to cite the work of Bloom, Hill, Black and Lipsey (2008), which looked at the annual progress made by pupils. Using a number of standardised assessments, Bloom et al looked at the differences in scores achieved by pupils from one year to the next, and then divided these by the pooled standard deviations – which allowed them to calculate the effect size for a year's worth of teaching. They found that for six-year-olds a year's worth of growth is approximately 1.5 standard deviations, whereas for twelve-year-olds it is approximately 0.2 standard deviations. As such, although the average growth for school pupils may be approximately 0.4 standard deviations, this average conceals so much variation across ages that it has little or no practical value.
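To make the arithmetic concrete, here is a minimal sketch, in Python, of the kind of calculation Bloom et al performed. The scores are invented purely for illustration: both cohorts make the same ten-point mean gain, but the wider spread of the older pupils' scores produces a much smaller effect size.

```python
import statistics

def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    return (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5

def annual_growth_effect_size(this_year, next_year):
    """Effect size for a year's growth: mean gain over the pooled SD."""
    gain = statistics.mean(next_year) - statistics.mean(this_year)
    return gain / pooled_sd(this_year, next_year)

# Hypothetical scores (not real data): both cohorts gain 10 points on
# average, but the older cohort's scores are far more spread out.
younger_y1 = [40, 45, 50, 55, 60]
younger_y2 = [50, 55, 60, 65, 70]
older_y1 = [20, 35, 50, 65, 80]
older_y2 = [30, 45, 60, 75, 90]

print(round(annual_growth_effect_size(younger_y1, younger_y2), 2))  # 1.26
print(round(annual_growth_effect_size(older_y1, older_y2), 2))      # 0.42
```

On these invented numbers the younger cohort's annual growth comes out at roughly 1.26 standard deviations and the older cohort's at roughly 0.42 – echoing the pattern Bloom et al found with real assessment data.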

Elsewhere, in the Sutton Trust and EEF's Teaching and Learning Toolkit manual (EEF, 2018), the assumption is used that one year's worth of progress is equivalent to one standard deviation. However, the EEF recognise that the notion of one standard deviation representing one year's progress does not hold for all ages. For example, data from National Curriculum tests indicate annual progress of about 0.8 of a standard deviation at age 7, falling to 0.7 at age 11 and 0.4 at age 14.
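On the EEF's stated assumption, converting an effect size into months of progress is simple arithmetic. The sketch below is my illustration of that assumption, not the EEF's published conversion table (which uses banded values); it also shows how the age-specific figures above change the picture.

```python
# Illustration of the assumption stated above: one year's progress
# equals one standard deviation, i.e. 1 SD = 12 months. This is not
# the EEF's published banded conversion table.
def months_of_progress(effect_size, sd_per_year=1.0):
    return 12 * effect_size / sd_per_year

print(months_of_progress(0.25))                   # 3.0 months
# Using the age-14 figure above (0.4 SD per year), the same effect
# size corresponds to considerably more progress:
print(months_of_progress(0.25, sd_per_year=0.4))  # 7.5 months
```

The same 0.25 SD effect looks like three months of progress under the one-SD-per-year assumption, but seven and a half months for fourteen-year-olds – another reminder that a single conversion factor hides large differences between age groups.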

In another study, Luyten, Merrell and Tymms (2017) looked at the impact of schooling on 3,500 pupils in Years 1 to 6 at 20 English primary schools, most of them in the private sector. They found that the year-to-year gains from schooling declined as pupils got older. For example, for the youngest pupils the effect size for progress in Reading was 1.15 standard deviations, whereas for the oldest pupils it was 0.49 standard deviations. The same declining trend in effect size was seen for measures of Mental Maths and Developed Ability, although General Maths deviated from this pattern, with effect sizes remaining broadly consistent from year to year.

Discussion and implications

First, it’s important to remember that this analysis has focussed on the year-to-year progress made by groups of pupils. It has not looked at the impact of specific interventions or the effect sizes associated with them. As such, the 0.4 SD effect size for one year's progress should not be confused or conflated with the 0.4 SD effect size put forward by Professor Hattie as the average effect size of factors influencing educational achievement.

Second, if the presenter of your staff development session is not aware of the issues raised by this post, you may want – very professionally – to point them in the direction of the references listed at the end of this post. This is not about embarrassing senior colleagues or guests by saying that they are wrong; rather, it is about noting that the claims they are making are not uncontested.

Third, given how effect sizes vary with both the age and the diversity of the population, any attempt by teachers to make judgements about the effectiveness of their teaching by calculating effect sizes may be seriously flawed. For primary school teachers there is a risk that they will overestimate their effectiveness, whereas for secondary school teachers the opposite is true, as the sketch below illustrates. Indeed, given some critical issues with effect sizes – see Simpson (2017) and Simpson (2018) – it's probably wise for individual teachers to steer clear of calculating them at all.
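A back-of-the-envelope sketch of the risk, using the age-specific annual benchmarks from Bloom et al (2008) cited above. The conversion is deliberately naive and purely illustrative:

```python
# Annual growth benchmarks quoted above (Bloom et al, 2008), in SDs.
ANNUAL_GROWTH = {"six-year-olds": 1.5, "twelve-year-olds": 0.2}

def years_of_progress(effect_size, age_group):
    """Naively convert a measured effect size into years of progress."""
    return effect_size / ANNUAL_GROWTH[age_group]

measured = 0.4  # the figure often presented as 'one year of progress'
print(round(years_of_progress(measured, "six-year-olds"), 2))    # 0.27
print(round(years_of_progress(measured, "twelve-year-olds"), 2)) # 2.0
```

Against the 0.4 SD rule of thumb, both classes would appear to have made exactly a year's progress; against the age-appropriate benchmarks, the younger class is well behind and the older class well ahead.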

Finally, this blog raises all sorts of issues about when to trust the experts (Willingham, 2012). In this blog you have an edublogger challenging the claims made by a world-renowned educational researcher. It may be that I have misunderstood the claims made by Professor Hattie. It may be that I have misunderstood the arguments of Dylan Wiliam, Hans Luyten, Steve Higgins, the EEF and others. However, what it does suggest is that it is perhaps unwise to rely upon a single expert; instead, particularly in education, it's worth making sure your evidence-informed practice is influenced by a range of experts.

References

Bloom, H. S., Hill, C. J., Black, A. R. and Lipsey, M. W. (2008) ‘Performance trajectories and performance gaps as achievement effect-size benchmarks for educational interventions’, Journal of Research on Educational Effectiveness, 1(4), pp. 289–328.

EEF (2018) Sutton Trust - EEF Teaching and Learning Toolkit & EEF Early Years Toolkit - Technical Appendix and Process Manual (working document v.01).

Hattie, J. (2015) What doesn’t work in education: The politics of distraction. Pearson.

Kraft, M. A. (2018) Interpreting effect sizes of education interventions. Brown University Working Paper. Downloaded Tuesday, April 16, 2019, from ….

Luyten, H., Merrell, C. and Tymms, P. (2017) ‘The contribution of schooling to learning gains of pupils in Years 1 to 6’, School effectiveness and school improvement. Taylor & Francis, 28(3), pp. 374–405.

Simpson, A. (2017) ‘The misdirection of public policy: comparing and combining standardised effect sizes’, Journal of Education Policy. Routledge, pp. 1–17. doi: 10.1080/02680939.2017.1280183.

Simpson, A. (2018) ‘Princesses are bigger than elephants: Effect size as a category error in evidence‐based education’, British Educational Research Journal. Wiley Online Library, 44(5), pp. 897–913.

Wiliam, D. (2016) Leadership for teacher learning. West Palm Beach: Learning Sciences International.

Willingham, D. (2012) When can you trust the experts? How to tell good science from bad in education. San Francisco: John Wiley & Sons.