Economists, or more generally social scientists, have two main priorities in carrying out their research.
First, they want to investigate causal relationships: despite the complexity of the social and economic world, establishing causes and effects is at the core of social science. The returns to education, labour market frictions, the role of information in financial decisions, and the impact of technology on productivity are just a few examples of topics where economists look at causality. Second, in pursuing the first goal, social scientists hope that the new knowledge, grounded in theory and empirical evidence, will generate changes and improvements in the way the world evolves: more effective policies and regulations, better-functioning markets, improved social welfare and so on.
However, isolating, or identifying, causes and their effects is not a simple task and has been the object of decades of methodological work. The main obstacle in measuring a causal relationship, beyond mere correlation, is identifying the counterfactual. In order to assess the causal impact of a situation or intervention on the outcome of interest, one would like to compare two “states of the world”: one in which the intervention is in place, and one in which it is not. The latter is the counterfactual and answers the question of what would have happened if the intervention had not been in place. The difference in the outcome between the factual and the counterfactual situation represents the impact that can be causally attributed to the intervention, all other things being equal by construction. However, such an exercise is not feasible in the real world, where only one state can actually be observed: the subject either receives the intervention or does not. For a long time, economists have written theoretical models and tested them against observational data, using empirical methodologies that allow causal claims only after making a number of assumptions about the counterfactual.
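This logic can be made precise with the standard potential-outcomes notation (the textbook Rubin causal model, not a formalism specific to this piece); a minimal sketch:

```latex
% Standard potential-outcomes notation (Rubin causal model); the symbols
% are the conventional textbook ones, not specific to this piece.
\[
\tau_i \;=\; Y_i(1) - Y_i(0)
\qquad \text{causal effect of the intervention for subject } i
\]
\[
Y_i \;=\; D_i\,Y_i(1) + (1 - D_i)\,Y_i(0)
\qquad \text{only one potential outcome is ever observed}
\]
```

Since Y(1) and Y(0) are never observed for the same subject, the individual effect can never be computed directly: this is exactly the missing-counterfactual problem described above.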
Starting in the 1990s, an increasing number of economists tried to overcome the problem of the counterfactual by importing the laboratory experimental method from the physical sciences into economics. The lab offers a controlled environment where subjects can be randomly assigned to different “states of the world”, therefore allowing for the creation of proper counterfactuals. The random assignment of subjects to treatment and control groups (or to different treatments) guarantees, with large enough samples, that the groups are on average identical in all respects but the particular treatment. This means that any observed difference in the outcome of interest can be causally attributed to the treatment, ruling out any other confounding factor. However, a critical assumption underlying the interpretation of data from many laboratory experiments is that the insights gained in the lab can be generalized beyond the lab. Contrary to lab findings in the physical disciplines, findings on human behaviour may be influenced by many factors which systematically differentiate the laboratory context from the real world. In order to provide more natural settings and external validity, economists started to conduct experiments in the field, using randomization in naturally occurring settings. Since the early 2000s we have observed a surge of field experiments, or randomized controlled trials (RCTs), in developing and developed countries. The fortune of field experiments has rested on two key factors: a new emphasis on evidence-based policy-making, under the push of international organizations like the World Bank, and economists’ need to find credible sources of exogenous variation to validate their theories in the real world.
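As a toy illustration of why randomization solves the counterfactual problem, the Python sketch below simulates an experiment: potential outcomes are drawn for every subject, treatment is assigned by a coin flip, and the simple difference in group means recovers the true average treatment effect. All numbers and variable names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000  # large sample, so group averages are close to population means

# Potential outcomes for every subject (never both observed in reality):
# a baseline outcome plus a true treatment effect of 2.0.
y0 = rng.normal(loc=10.0, scale=3.0, size=n)  # outcome without the intervention
y1 = y0 + 2.0                                 # outcome with the intervention

# Random assignment to treatment (d=1) or control (d=0).
d = rng.integers(0, 2, size=n)

# Only one "state of the world" is observed per subject.
y_obs = np.where(d == 1, y1, y0)

# Difference in means: an unbiased estimate of the average treatment effect,
# because randomization makes the two groups identical on average.
ate_hat = y_obs[d == 1].mean() - y_obs[d == 0].mean()
print(f"true ATE = 2.0, estimated ATE = {ate_hat:.3f}")
```

With observational data, where assignment is correlated with the untreated outcome (for instance, if more motivated subjects self-select into treatment), the same difference in means would be biased; this is precisely why non-experimental methods need extra assumptions about the counterfactual.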
The use of field experiments for the impact evaluation of policies and programs has risen dramatically in the last decade, particularly in the international development sector. This has been pushed forward by large institutions such as the World Bank and by donors such as DFID and the Bill & Melinda Gates Foundation, with the aim of justifying their spending and measuring the effective impact of their projects. The introduction of rigorous impact evaluation through RCTs has proved beneficial for the improvement of individual projects and, more generally, has contributed to a body of knowledge that has changed the way institutions design development programs. Policy evaluations through RCTs make it possible to rigorously assess what works and what does not, and provide information on the cost-effectiveness of different options aimed at the same goal. Rigorous impact evaluation also helps implementing agencies clarify the theory of change embedded in their interventions, by focusing on causality and on the measurement of the process leading from inputs to impacts. Important policy questions have been tackled through RCTs, among them the effect of school inputs on learning, the adoption of new technologies in agriculture, incentive schemes for workers, conditional and unconditional cash transfer programs, and microcredit.
An issue that often arises with program evaluations, as with lab experiments, is the extent to which the results are replicable and generalizable to other contexts. Would a program run in rural Benin have the same impacts as a similar program implemented in Andean communities? Or, in other words, what can be learnt in general from the impacts of a specific program in a specific context? To extend the external validity of impact evaluations, researchers have been focusing on the representativeness of situations rather than the representativeness of contexts, by exploring the mechanisms underlying the impacts. In other words, they try to focus on the forces driving the particular change, which pertain to human economic behaviour and not just to the particular context under study. More and more, standard “program evaluations” have moved beyond the simple question of whether a particular program works and have evolved into “field experiments” that allow researchers to test economic theories. This has certainly represented a unique opportunity for the stereotypically detached academia to descend from the “ivory tower”, get its hands dirty in the complexity of reality and implementation, and support policy decision-making more directly. It has led to an explosion of such work in several strands of the economic literature.
The two main priorities in carrying out economic research mentioned at the beginning seem to be met by field experiments: clean exogenous variation, obtained through RCTs, with which to test economic theory, and the possibility of influencing policy decision-making through the evidence generated. That’s a dream! A typical situation where everybody wins: economists get their empirical evidence to test theory, and policy-makers get advice on how to maximize the impacts of their programs.
However, as with all dreams, the alarm may ring at some point. Before taking a further step in considering the role of field experiments in economics and policy-making, some facts need to be considered.
First, the incentive structures of researchers and of program implementers are not the same. Researchers are mostly interested in exploring general economic relationships and forces, human behaviour, production functions, and so on, while implementers mostly want to know whether the optimal number of students per class is 25 or 30, whether an HIV/AIDS awareness campaign reduces incidence, and the like. One can certainly say that there will always be some room for learning something general by investigating the mechanisms, beyond the mere impact evaluation. However, at some point there may be a “marginal” project which an institution is willing to evaluate, but which is not promising enough to attract any researcher.
Second, running impact evaluations (or field experiments) through RCTs is costly. Who should pay for it? Currently, only a minority of implementing institutions use their own money to do so. Despite the scope for internal learning and improved cost-effectiveness, in many contexts (at least in Europe and definitely in Italy) showing rigorous impacts and effectiveness of interventions does not pay off in terms of signalling the quality of projects in order to attract more funds. The current incentive structure does not yet seem to reward impact evaluation as such. The majority of institutions carrying out impact evaluations appear to use external funding made available for that purpose by international donors and foundations. However, such financial resources are usually channelled through researchers, who are normally the ones writing the grant proposals.
So here comes the problem: there are, or will be, projects of extremely high “policy interest” and extremely low “research interest” which will go under-researched and whose evaluation will not be financed. If we believe in the moral imperative that public money should be spent in the best way to have impact, then something has to change. Certainly, stronger support for evidence-based policy-making from the top (international institutions, governments, donors), which would better reward agencies and institutions running impact evaluations, is desirable. This should further stimulate institutions to run impact evaluations on their core projects, possibly incentivizing them to devote some of their internal resources to the task. A market of professionals specialized in impact evaluation might then arise: professionals not necessarily tied to academic performance (in terms of contributions to the economic literature) but still capable of managing the whole toolbox of impact evaluation methods and of adapting it to the specific context and to the limits and needs of the implementer. That would still be a good opportunity for social scientists to help society develop! Overall, we would reach an even nicer dream where:
- academics run field experiments on “research-promising” programs, advancing our knowledge of human behaviour, institutions and markets, and our understanding of the mechanisms behind the effectiveness (or lack thereof) of some policies;
- all institutions are incentivized to invest their own money in order to learn how to be more effective and efficient in the pursuit of their goals;
- a lot of new jobs for economists and impact evaluators!
PS: This piece really benefited from attendance at the EAERE-FEEM-VIU European Summer School on Field Experiments in Environmental and Resource Economics and the 3ie-IFAD Workshop “Designing high-quality impact assessments”, and from interesting conversations with Daniel Stein, Roberto Barbieri, Nicolò Tomaselli, Federico Bastia, Giovanna D’Adda, and Mariapia Mendola.