Title: Causal inference with spatio-temporal data: Estimating the effects of airstrikes on insurgent violence in Iraq
Authors and Year: Georgia Papadogeorgou, Kosuke Imai, Jason Lyall, Fan Li (2022)
Journal: Journal of the Royal Statistical Society, Series B (Statistical Methodology) (DOI: 10.1111/rssb.12548)
“Correlation is not causation”, as the saying goes, yet a correlation can reflect causation when certain assumptions are met. Describing those assumptions and developing methods to estimate causal effects, not just correlations, is the central concern of the field of causal inference. Broadly speaking, causal inference seeks to measure the effect of a treatment on an outcome. The treatment can be an actual medicine or something more abstract, like a policy. Much of the literature focuses on relatively simple treatments and outcomes and uses data with little dependence between observations. For example, clinicians often want to measure the effect of a binary treatment (received the drug or not) on a binary outcome (developed the disease or not). The data used to answer such questions are typically patient-level records in which patients are assumed to be independent of one another. To be clear, these simple setups are enormously useful and describe commonplace causal questions.
Yet some causal questions are more complicated and require more advanced methods. The motivating example of the paper discussed here is one such situation. The authors would like to measure the effect of different airstrike strategies in Iraq on insurgent violence, which is considerably more complex than many typical causal inference problems. Most notably, both the treatment (airstrikes) and the outcome (insurgent violence) vary over time and space. This temporal and spatial variability over the period of study (January 2007 to July 2008) is shown in Figure 1 at the bottom of this article. Such complex dependencies often produce effects that are difficult to model, such as spatial spillover and temporal carryover. Spatial spillover means that treatment in one area can affect other regions; in the context of Iraq, an airstrike in Baghdad may affect insurgent violence in Mosul. Similarly, temporal carryover means that treatment today may affect not only tomorrow but any point in the future. Thus, a key part of the methodology developed by the authors is to allow for arbitrary spillover and carryover effects, an area where previous methods fell short.
The ability to model such effects comes largely from treating the treatment (airstrikes) as stochastic, i.e., random. This differs from the more typical deterministic setting, where the treatment is fixed ahead of time for each unit of measurement (in this case, a location and time). Because there are infinitely many possible locations where an airstrike could land, the treatment is instead thought of as random. Thus, a ‘treatment’ really refers to a military strategy that controls the spatial distribution and intensity of airstrikes at a given time. This stochastic intervention concept allows the authors to answer a variety of causal questions. For example, they could study how a change in the intensity of airstrikes, keeping the spatial distribution the same, affects the outcome. Similarly, they could keep the intensity of airstrikes the same but change the spatial distribution, and look at the resulting effect. In all of these scenarios, the quantity being estimated is typically the expected number of insurgent attacks over a region of interest, which can then be compared across strategies.
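To make the idea of a stochastic intervention concrete, here is a minimal sketch in Python. It is not the authors' code or estimator. It only illustrates what it means for a 'strategy' to be a distribution over strike locations rather than a fixed list of them. Everything in it is an assumption made for illustration: the unit-square map, the 4-by-4 grid, the hotspot cell, the Poisson-distributed daily count, and the function name `sample_airstrikes`.

```python
import numpy as np

def sample_airstrikes(rng, expected_count, cell_probs):
    """Draw one day's strike locations on a unit-square map (toy model).

    expected_count: mean number of strikes per day (the strategy's intensity).
    cell_probs: 2-D array summing to 1 (the strategy's spatial distribution
                over grid cells).
    Returns an array of shape (n_strikes, 2) holding (x, y) coordinates.
    """
    n_rows, n_cols = cell_probs.shape
    # The number of strikes is itself random: Poisson with the chosen mean.
    n_strikes = rng.poisson(expected_count)
    # Pick a grid cell for each strike according to the spatial distribution.
    flat_idx = rng.choice(cell_probs.size, size=n_strikes, p=cell_probs.ravel())
    rows, cols = np.unravel_index(flat_idx, cell_probs.shape)
    # Place each strike uniformly at random inside its cell.
    y = (rows + rng.random(n_strikes)) / n_rows
    x = (cols + rng.random(n_strikes)) / n_cols
    return np.column_stack([x, y])

rng = np.random.default_rng(0)

# Hypothetical spatial distribution: one cell is five times likelier than the rest.
grid = np.ones((4, 4))
grid[0, 0] = 5.0
cell_probs = grid / grid.sum()

# Two strategies with the same spatial distribution but different intensity.
baseline = [sample_airstrikes(rng, 3.0, cell_probs) for _ in range(2000)]
intensified = [sample_airstrikes(rng, 6.0, cell_probs) for _ in range(2000)]

mean_baseline = np.mean([len(day) for day in baseline])
mean_intensified = np.mean([len(day) for day in intensified])
print(f"mean strikes per day, baseline:    {mean_baseline:.2f}")
print(f"mean strikes per day, intensified: {mean_intensified:.2f}")
```

Changing `expected_count` while holding `cell_probs` fixed mimics a change in intensity alone. Changing `cell_probs` while holding `expected_count` fixed mimics a shift in spatial focus. Estimating how insurgent attacks would respond to either change is the hard part that the paper addresses, and this sketch does not attempt it.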
The methodology can clearly address many facets of a causal problem, which raises the question of what the authors discovered about the relationship between airstrikes and insurgent violence. Their empirical analysis led to three overarching conclusions. First, increasing the intensity of airstrikes leads to more insurgent violence. Second, they find a lag of about a week in the effect, meaning that an increase in airstrikes seven days ago would affect insurgent violence today. Finally, they discover that emphasizing Baghdad in airstrikes leads to more insurgent attacks in Mosul, thus uncovering a displacement effect linking the spatial distribution of airstrikes to the location of attacks.
The causal inference approach developed in this paper leads to fascinating findings about the relationship between airstrikes and insurgent violence in Iraq, which will likely remain a source of much discussion in political science. Furthermore, the method itself could be used in many other applications where the data take the form of a time series of maps, for example when studying disease spread over time in a given geographical region. Of course, no method is perfect, and all rely on some assumptions. In this case, a key assumption is that all confounding variables (variables that affect both the treatment and the outcome) are included in the model. If an unmeasured confounder exists, the results may not be valid. Nevertheless, this research marks a large step forward in performing robust causal inference for spatio-temporal data, both by modeling the treatment as stochastic (allowing for arbitrary spatial spillover and temporal carryover effects) and by quantifying a variety of valuable causal estimands. Readers who want to dive deeper into the methodology are encouraged to read the full article published in JRSS-B or listen to the first author discuss the material at the Online Causal Inference Seminar.
Figure 1: The top row shows the number of airstrikes over time (a) and the spatial spread of those airstrikes (b). The bottom row shows the same information for insurgent attacks.