class: center, middle, inverse, title-slide .title[ # 7: Synthetic Controls, Sensitivity Analysis and Bounds ] .subtitle[ ## Quantitative Causal Inference ] .author[ ###
J. Seawright
] .institute[ ###
Northwestern Political Science
] .date[ ### May 15, 2025 ] --- class: center, middle <style type="text/css"> pre { max-height: 400px; overflow-y: auto; } pre[class] { max-height: 200px; } </style> --- ### Manufacture Perfect Comparisons Strictly speaking, for the Method of Difference to work based on a comparison between cases 1 and 2, the condition which must be met is: `$$\begin{aligned} Y_{1,t} = Y_{2,t}\\ Y_{1,c} = Y_{2,c}\end{aligned}$$` --- ### Manufacture Perfect Comparisons No observable condition can ever guarantee that this assumption is met, but if we were to find two cases that *exactly* match on a suitably rich set of background variables `\(\mathbb{Z}\)`, perhaps, we would come close to believing the assumption. --- ### Manufacture Perfect Comparisons Unfortunately, if `\(\mathbb{Z}\)` is indeed a reasonably deep list of variables, we are unlikely to find cases that in fact exactly match (or even come particularly close). --- ### Manufacture Perfect Comparisons Abadie and collaborators suggest that we create our own "synthetic" control cases, by averaging together existing control cases to come as close as possible to exactly matching the treatment case on `\(\mathbb{Z}\)`. --- ### Synthetic Control The setup is one in which there are `\(N\)` cases, each of which is observed at multiple time periods labeled from 1 through `\(T\)`. --- ### Synthetic Control Each case has a treatment and control potential outcome for each time period. The difference between these is: `\(\alpha_{i,t} = Y^T_{i,t} - Y^C_{i,t}\)` --- ### Synthetic Control Suppose that the treatment of interest happens in *one* case at one time period. That is, `\(D_{j,t}=0\)` for all `\(j \neq i\)` and for all `\(t < t_{treat}\)`. --- ### Synthetic Control Let `\(X_{i} = (\mathbb{Z}_{i}, Y_{i})\)` Minimize `\(X_{i, t} - \mathbb{X}_{j, t} W\)` for `\(t < t_{treat}\)`. --- ### Synthetic Control `\(\sqrt{(X_{i, t} - \mathbb{X}_{j, t} W)^{T} V (X_{i, t} - \mathbb{X}_{j, t} W)}\)` --- <img src="images/choosingv.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/chavezsynth.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/chavezsynth2.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/chavezsynth3.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/chavezsynth4.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/chavezsynth5.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/chavezsynth6.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/chavezsynth7.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/chavezsynth8.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/chavezsynth9.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/DeKadt1.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/DeKadt2.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/DeKadt3.png" width="90%" style="display: block; margin: auto;" /> --- <img src="images/DeKadt4.png" width="90%" style="display: block; margin: auto;" /> --- ``` r library(Synth) ``` ``` ## ## ## ## Synth Package: Implements Synthetic Control Methods. ``` ``` ## ## See https://web.stanford.edu/~jhain/synthpage.html for additional information. ``` --- ``` r afripanel <- read.csv("https://github.com/jnseawright/PS406/raw/main/data/afripanel.csv") malidata <- afripanel[afripanel$WBCode=="MLI" | afripanel$cont_dem_ind==1,] malicontrols <- unique(malidata$WBCode[malidata$WBCode!="MLI"&malidata$WBCode!="ETH"&malidata$WBCode!="SDN"]) ``` --- ``` r mali.prep <- dataprep( foo=malidata, predictors=c( "lngdpmadlag", "lngdpmadlag2", "lngdpmadlag3", "lngdpmadlag4", "lnpop", "ki", "openk", "civwar", "civwarend", "pwt_xrate", "pwt_xrate_lag1", "pwt_xrate_lag2", "pwt_xrate_lag3", "eximdiff", "eximdiff_lag1", "eximdiff_lag2", "wbank", "wbank_lag1", "wbank_lag2", "imfadj", "imfadj_lag1", "imfadj_lag2" ), dependent="lngdpmad", unit.variable="wbcode2", time.variable="year", treatment.identifier="MLI", controls.identifier=malicontrols, time.predictors.prior=c(1975:1990), time.optimize.ssr=c(1975:1991), time.plot=c(1975:2008), unit.names.variable="WBCode" ) ``` ``` ## ## Missing data- treated unit; predictor: eximdiff ; for period: 1975 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff ; for period: 1976 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff ; for period: 1977 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff ; for period: 1978 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff ; for period: 1979 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag1 ; for period: 1975 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag1 ; for period: 1976 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag1 ; for period: 1977 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag1 ; for period: 1978 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag1 ; for period: 1979 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag1 ; for period: 1980 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag2 ; for period: 1975 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag2 ; for period: 1976 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag2 ; for period: 1977 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag2 ; for period: 1978 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag2 ; for period: 1979 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag2 ; for period: 1980 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data- treated unit; predictor: eximdiff_lag2 ; for period: 1981 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff ; for period: 1975 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff ; for period: 1976 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff ; for period: 1977 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff ; for period: 1978 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff ; for period: 1979 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff ; for period: 1975 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff ; for period: 1976 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff ; for period: 1977 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff ; for period: 1978 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff ; for period: 1979 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag1 ; for period: 1975 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag1 ; for period: 1976 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag1 ; for period: 1977 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag1 ; for period: 1978 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag1 ; for period: 1979 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag1 ; for period: 1980 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag1 ; for period: 1975 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag1 ; for period: 1976 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag1 ; for period: 1977 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag1 ; for period: 1978 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag1 ; for period: 1979 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag1 ; for period: 1980 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag2 ; for period: 1975 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag2 ; for period: 1976 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag2 ; for period: 1977 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag2 ; for period: 1978 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag2 ; for period: 1979 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag2 ; for period: 1980 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 2 ; predictor: eximdiff_lag2 ; for period: 1981 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag2 ; for period: 1975 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag2 ; for period: 1976 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag2 ; for period: 1977 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag2 ; for period: 1978 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag2 ; for period: 1979 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ## ## Missing data - control unit: 11 ; predictor: eximdiff_lag2 ; for period: 1980 ## We ignore (na.rm = TRUE) all missing values for predictors.op. ``` --- ``` r mali.synth <- synth(mali.prep) ``` ``` ## ## X1, X0, Z1, Z0 all come directly from dataprep object. ## ## ## **************** ## searching for synthetic control unit ## ## ## **************** ## **************** ## **************** ## ## MSPE (LOSS V): 0.003197622 ## ## solution.v: ## 0.06551271 0.06737548 0.06997086 0.07378234 0.0147838 9.20567e-05 6.44798e-05 0.1727371 0.004874032 0.07058757 0.07264543 0.07458472 0.07624927 0.0001034037 0.001844508 0.03313834 0.03153702 0.0498886 0.0186926 0.1015321 2.0569e-06 1.5256e-06 ## ## solution.w: ## 1.443e-06 0.4289425 0.1957719 6.41e-08 0.0001086331 4.38305e-05 2.826e-07 0.001898058 4.4754e-06 6.0236e-06 0.08095075 0.007983987 1.00913e-05 4.066e-06 0.06042418 0.2238245 2.52606e-05 ``` --- ``` r path.plot(synth.res=mali.synth, dataprep.res=mali.prep, Ylab="Log GDP per capita", Legend=NA, tr.intake=1991, Ylim=c(6,8) , Main="Mali" ) ``` <img src="7synthetic_files/figure-html/unnamed-chunk-20-1.png" width="60%" style="display: block; margin: auto;" /> --- ### Sensitivity Analysis Informal talk: just run a bunch of models. Advantage: we can see a family of results! Disadvantages: How systematic was the collection? Was every result actually reported? How is this different from specification searching/p hacking? --- ### Sensitivity Analysis More formal ideas: 1. Ensemble models 2. Rosenbaum bounds --- ### Sensitivity Analysis One approach to sensitivity analysis is to average all the credible models. But why a simple average? --- ### Ensemble Model 1. Fit each credible model 2. Form fitted values for each case from each model 3. Run an ensemble model that predicts the outcome based on those fitted values, which assigns each model a weight --- ### Ensemble Model Does this help for causal inference? --- ### Ensemble Model Athey et al. 2019 propose using an ensemble model to construct synthetic controls, and show that it can outperform single-model methods. --- ### Ensemble Model An alternative approach can be to build two ensemble models, one for the cases in the treatment group and one for cases in the control group. Then run *all* cases through both models and use differences in fitted values to estimate causal effects. --- ### Rosenbaum Bounds Consider a single matched pair of cases in a matching study, or a joined treatment and control case in a stratified experiment. These individuals *should* have the same propensity score. If there is confounding, will they? --- ### Rosenbaum Bounds Let's use `\(\Gamma\)` to represent the degree to which confounding distorts random assignment. So if `\(\Gamma = 2\)`, then confounding is going to lead one case to be twice as likely to receive the treatment as the other. How much would the causal inference shift if `\(\Gamma\)` could be as big as 2? --- ### Rosenbaum Bounds It turns out that, for most methods, there is an upper and a lower bound on the possible causal inference for any given level of `\(\Gamma\)` and the existing data, called the Rosenbaum Bounds. Exploring these tells us how much confounding there would have to be in order to produce a large change in our causal inference. --- ``` r library(sensemakr) ``` ``` ## Warning: package 'sensemakr' was built under R version 4.4.3 ``` ``` ## See details in: ``` ``` ## Carlos Cinelli and Chad Hazlett (2020). Making Sense of Sensitivity: Extending Omitted Variable Bias. Journal of the Royal Statistical Society, Series B (Statistical Methodology). ``` --- ``` r perulm <- lm(outsidervote ~ simpletreat, data=peruemotions) peru.sens1 <- sensemakr(perulm, "simpletreat") ``` --- ``` r peru.sens1 ``` ``` ## Sensitivity Analysis to Unobserved Confounding ## ## Model Formula: outsidervote ~ simpletreat ## ## Null hypothesis: q = 1 and reduce = TRUE ## ## Unadjusted Estimates of ' simpletreat ': ## Coef. estimate: 0.11763 ## Standard Error: 0.04962 ## t-value: 2.3706 ## ## Sensitivity Statistics: ## Partial R2 of treatment with outcome: 0.01239 ## Robustness Value, q = 1 : 0.1059 ## Robustness Value, q = 1 alpha = 0.05 : 0.01886 ## ## For more information, check summary. ``` --- ``` r plot(peru.sens1) ``` <img src="7synthetic_files/figure-html/unnamed-chunk-24-1.png" width="60%" style="display: block; margin: auto;" /> --- ``` r perulm2 <- lm(risk ~ simpletreat + edlevel + classid + leftright + age + I(age^2), data=peruemotions) peru.sens2 <- sensemakr(perulm2, "simpletreat", benchmark_covariates="age") ``` --- ``` r plot(peru.sens2) ``` <img src="7synthetic_files/figure-html/unnamed-chunk-26-1.png" width="60%" style="display: block; margin: auto;" /> --- ### Manski Bounds Consider a situation with a dichotomous treatment and a dichotomous outcome. --- ### Manski Bounds Before we do anything else, the possible average treatment effect in such a scenario is known to be somewhere between -1 and 1, inclusive. --- ### Manski Bounds Every individual case has one of three specific treatment effects: -1, 0, or 1. --- ### Manski Bounds Consider: `\(Y_{i, t}=0\)` `\(Y_{i, c}=1\)` Here, the causal effect is -1. (Notice that a case in the treatment group with an outcome of zero either has a causal effect of 0 or -1, but cannot have a causal effect of 1. Likewise, a control case with an outcome of one!) --- ### Manski Bounds Alternatively: `\(Y_{i, t}=1\)` `\(Y_{i, c}=0\)` Here, the causal effect is 1. (Notice that cases with these observed pairs of outcomes could have a causal effect of either 1 or 0, but cannot have -1.) --- ### Manski Bounds For every case that we observe, we rule out one of the three possible causal effects. This information, *by itself*, puts some modest limits on what the average treatment effect can be. --- ### Manski Bounds Manski and Nagin look at the average treatment effect: `$$ATE = E(Y_{t}) - E(Y_{c})$$` `$$ATE = E(Y_{t} | X=t)P(X=t) + E(Y_{t} | X=c)P(X=c)\\ - E(Y_{c}| X=t)P(X=t) - E(Y_{c}| X=c)P(X=c)$$` --- ### Lower Bound `$$ATE = E(Y_{t} | X=t)P(X=t) + 0 \\ - 1 - E(Y_{c}| X=c)P(X=c)$$` --- ### Upper Bound `$$ATE = E(Y_{t} | X=t)P(X=t) + 1 \\ - 0 - E(Y_{c}| X=c)P(X=c)$$` --- ``` r library(ATbounds) ``` ``` ## Warning: package 'ATbounds' was built under R version 4.4.3 ``` --- ``` r nswre_treated <- read.table("http://users.nber.org/~rdehejia/data/nswre74_treated.txt") colnames(nswre_treated) <- c("treat","age","edu","black","hispanic", "married","nodegree","RE74","RE75","RE78") nswre_control <- read.table("http://users.nber.org/~rdehejia/data/nswre74_control.txt") colnames(nswre_control) <- c("treat","age","edu","black","hispanic","married","nodegree","RE74","RE75","RE78") nswre <- rbind(nswre_treated,nswre_control) D <- nswre$treat Y <- (nswre$RE78 > 0) X <- with(nswre, cbind(age,edu,black,hispanic,married,nodegree,RE74/1000,RE75/1000)) model <- lm(Y ~ D) ``` --- ``` r summary(model) ``` ``` ## ## Call: ## lm(formula = Y ~ D) ## ## Residuals: ## Min 1Q Median 3Q Max ## -0.7568 -0.6462 0.2432 0.3538 0.3538 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 0.64615 0.02849 22.679 <2e-16 *** ## D 0.11060 0.04419 2.503 0.0127 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 0.4594 on 443 degrees of freedom ## Multiple R-squared: 0.01394, Adjusted R-squared: 0.01172 ## F-statistic: 6.265 on 1 and 443 DF, p-value: 0.01267 ``` --- ``` r rps <- rep(mean(D),length(D)) bns_nsw <- atebounds(Y, D, X, rps, Q=1) summary(bns_nsw) ``` ``` ## ATbounds: ATE ## Call: atebounds(Y = Y, D = D, X = X, rps = rps, Q = 1) ## Confidence Level: 0.95 ## Lower_Bound Upper_Bound ## Bound Estimate -0.45843 0.50112 ## Confidence Interval -0.74966 0.77878 ``` --- ``` r bns_nsw2 <- atebounds(Y, D, X, rps, Q=2) summary(bns_nsw2) ``` ``` ## ATbounds: ATE ## Call: atebounds(Y = Y, D = D, X = X, rps = rps, Q = 2) ## Confidence Level: 0.95 ## Lower_Bound Upper_Bound ## Bound Estimate 0.083059 0.10125 ## Confidence Interval -0.014176 0.21192 ``` --- ### Manski Bounds - Ecological inference - Missing data