For most aid policymakers, the three tests for aid spending seem to be:
- can aid deliver the output or outcome?
- can these outputs or outcomes make a long-term difference?
- can impact be demonstrated in a relatively short time period?
Question the questions
The effort required to deliver a convincing answer often dwarfs the effort put into thinking about whether it was the right question to answer in the first place. (For example, why has no one asked whether conditional cash transfers make sense in sub-Saharan Africa? Most of the effort has gone into answering whether cash transfers have an impact on hunger. Of course they do.) Getting the questions right means consulting a wide range of stakeholders about which questions are worth asking (see "this book fills a much needed gap").
The "randomistas" and even the quasi-experimentalists often answer the question "does it have an impact?", sometimes "for who?", but rarely "why?". This requires innovative issue-driven blends of quant and qual.
Broaden the portfolio of what can be evaluated
It will always be more challenging to evaluate policy changes whose consequences are indirect and lagged, for which indicators have not yet been developed, and for which the pathways of change are not yet fully understood. But we must help to expand the set of things that are "evaluable" by developing methods that can cope with these dimensions.
Refine impact
Your idea of success is probably not the same as mine. Add in experiential, cultural and value differences, and the gap may widen. Getting multiple views of success from different stakeholders will help to home in on what really counts, and to uncover unintended-consequence land mines further down the road.
Most methodologically rigorous analyses--whether quantitative, qualitative or some blend--have, by definition, high degrees of internal validity: they are fit for purpose in the context in which they are applied. But by building in more variation and diversity--either within a context or across contexts through, say, systematic meta-analyses of case studies--we can make plausible guesses about how portable an intervention is, and about the risks of assuming it is.
In short, the development research community must nuance the impact debate, not disown it.
If we do the latter, then aid will only be used for things where it is easiest to demonstrate impact, and the rest will be wrung out of DFID, despite the ring fence.
We need to expand the radius of evaluability if we are to help protect the parts of the aid spend that may do the most good.