How (not) to do evaluation: A guide to the UK’s favourite policy ritual
Evaluation is feted in modern policymaking, but is still poorly understood, under-resourced, and misapplied, says Tanya Singh
If public policy in the UK had a religion, “evidence-based decision-making” would be its mantra, and evaluation its most sacred rite. And yet for something so revered, evaluation in British policymaking is often poorly understood, under-resourced, and misapplied.
The central delusion is this: evaluation is a technical process for establishing “what works.” In reality, it is political, messy, and profoundly shaped by who asks the question, who controls the data, and who gets to apply the answers. I recently stumbled on the OECD’s review of evaluation systems in five countries, in which the UK is praised for “flexibility” but quietly dinged for its lack of a structured framework, inconsistent uptake across departments, and weak local capacity.
Concerns about evaluation are hardly new. In fact, as Jill Rutter noted in her report for the Institute for Government in 2012, despite years of reform and rhetoric, evaluation, review and learning were rated by both ministers and civil servants as the worst-performing aspect of policymaking. Remarkably, even after 12 years, her reflections feel as relevant as ever — as if little has changed. So, why does this issue persist?
Paper trails, not policy change
Evaluation in the UK is driven more by upward accountability and performance reporting than by a genuine desire to learn. Too often, it becomes a tick-box exercise, shaped by a culture that rewards defensibility over discovery. When funding is tied to predefined outcomes, there’s little incentive to test, adapt, or take risks on novel approaches.
At the local level, the challenges take a different form. Councils and strategic authorities are often required to “embed evaluation” into policy bids to Whitehall. But what this means in practice is hiring consultants to produce retrofitted theories of change, abstract log frames, or hastily planned impact assessments. Too often, places are asked to navigate a maze of centrally mandated requirements — with limited flexibility, little feedback, and few tangible benefits. The incentive is not to learn but to comply.
Added to this is a lack of standardisation — and that needn’t mean a rigid, top-down model or a disregard for local context. The Evaluation Task Force, a joint unit of the Cabinet Office and HM Treasury, offers guidance (notably the Magenta Book) and support. But as the OECD points out, there is currently little meaningful coordination or dialogue between central and local actors on evaluation. The result is a fragmented, incoherent system — one that lacks consistent standards of quality, purpose, or practice.
There’s no denying the scale of the methodological challenge either. Quantitative evaluation has a vital role in understanding impact, but it comes with real limitations. Robust approaches often demand long timeframes, large sample sizes, and stable delivery conditions — all of which are in short supply in the messy, fast-moving world of public policy. Teams frequently lack access to timely, granular data, or the capacity to analyse it well.
What is more, the timeframes for demonstrating impact are mismatched with reality. Even when evaluations are done to a high standard, findings can take years, or even decades, to emerge — by which time programmes have changed, funding has shifted, or institutional memory has faded. The result is a mismatch between the promise of rigour and the practical realities of delivering and learning from policy in real time: think of Sure Start, whose findings only became clear a decade on, long after the programme had been scaled back.
Even when evaluations are done well, the evidence on how their findings are put into practice is mixed. A National Audit Office review found that only 40 of 261 policy impact assessments produced by government departments in 2009-10 referenced any evaluation evidence, and just a small proportion of spending review bids across departments drew on past evaluations to justify funding decisions. Without stronger links between evaluation and decision-making, impact claims go untested, and lessons go unlearned.
This is not to say it’s all gloomy. As with any system, there are bright spots — innovative learning partnerships, experiments with participatory evaluation, serious work in some of the ‘What Works’ Centres, and place-based efforts to get to the heart of what has worked, such as Manchester Metropolitan University’s evaluation of Greater Manchester’s Good Employment Charter. But too often, these remain isolated instances rather than embedded practice, with limited influence on the wider culture of policymaking.
From compliance to learning
What we’ve ended up with is a performative culture of evaluation. The narrative is that the system is learning, adapting, and improving, but few policies are genuinely tested, and fewer still are changed as a result.
The first step is honesty: admitting that the current model isn’t delivering as it should. Evaluation should be a tool for localised learning and adaptation, not just central performance monitoring. That means the freedom to ask, “what if we’re wrong?”. Most of all, we need to shift the mindset from evaluation as compliance to evaluation as learning. Because right now, UK evaluation risks being what the OECD politely calls “symbolic.” Or to put it less politely: we’re spending millions to prove we don’t really want to know the answer.
At the same time, incentives need to be aligned and evaluation built into policy from the start, not as an afterthought. Support departments, especially at the local level, to plan and commission evaluations that are timely, proportionate, and useful. And train civil servants to move beyond plausible fiction toward evidence that challenges as well as confirms.
As we evolve into the Growth and Reform Network — a cohort of strategic and local authorities focused on inclusive growth and public service reform — we’ll be working with members to explore these challenges and reimagine evaluation not as a bureaucratic hoop to jump through, but as a practical tool for learning, adaptation, and accountability.