How to know you have hit upon a very controversial subject: two titans of development economics each castigate you for diametrically opposite reasons. Next time, I should let them fight directly!
I am trying to use AidGrade’s data to say something about the generalizability of impact evaluation results. I’m not coming in with an agenda; I’m basing this on two beliefs:
1) People want to know what works. There are a lot of grandiose claims that impact evaluation can tell us this. Economists are usually very careful not to generalize from particular cases, knowing that results are heavily context-dependent and have no external validity. But there is also a sense in which we really do want to use the results to update our priors. We want to get something generalizable out of an impact evaluation; otherwise, why do one in the first place if all it tells us is how successful something was that will never occur again?
The extent to which results are generalizable is an empirical question. So long as people are extrapolating from past results, whether explicitly or with a wink and a nudge when trying to get policy makers to agree to a new impact evaluation, we’d better at least know how generalizable the results are. You can say we know they aren’t generalizable. Fine. But people still talk as though they are, and they are likely to be generalizable to some non-zero degree. So what is that degree?
2) There are undoubtedly contexts under which results are more or less generalizable in practice. For example, since a lot of people don’t want to randomize, I suspect that RCTs may be done in “weirder” situations than quasi-experimental studies. I wonder if we can see this in the results. Some causal chains between interventions and outcomes may also be more complicated than others. The theory here is quite clear – why not test it out?
Unfortunately, I’m stuck between a rock and a hard place. Some say this work goes too far, others that it does not go far enough. I’m a fan of impact evaluations for what they do tell us about human behaviour. I also think they are often vastly overpriced and that many, though not all, of them would be more helpful if they gave more immediate, actionable feedback to the project implementer.
I’m not actually interested in participating in the War of the Randomistas. But when a war goes on, it seems that some on either side have hammers and everything looks like a nail.
If special interests kill this work, so be it, but practically speaking it addresses an important issue.