2: Regression for Causal Inference

class: center, middle, inverse, title-slide

.title[
# 2: Regression for Causal Inference
]
.subtitle[
## Quantitative Causal Inference
]
.author[
### <large>J. Seawright</large>
]
.institute[
### <small>Northwestern Political Science</small>
]
.date[
### April 10, 2025
]

---

### In-Seminar Presentation

Sign up in teams of two for one of the pre-set topics and dates [here](https://docs.google.com/spreadsheets/d/13UnKiGDod1opceuVEmFPkc-3XliXoCGghpQWTeJ9nhA/edit?gid=0#gid=0).

---
### Unconfounded Assignment

1.  `$Pr(\mathbb{W} | \mathbb{X}, \mathbb{Y}_{0}, \mathbb{Y}_{1}) = Pr(\mathbb{W} | \mathbb{X}, \mathbb{Y}^{'}_{0}, \mathbb{Y}^{'}_{1})$`

---
### Regression for Causal Inference

`$$\begin{aligned}
Y_{i} & = E(Y_{i,c}) + D_{i} \{E(Y_{i,t}) - E(Y_{i,c})\} + \\
     & [Y_{i,c} - E(Y_{i,c})] + \\
     & D_{i} ([Y_{i,t} - E(Y_{i,t})] - [Y_{i,c} - E(Y_{i,c})]) \\
     & = \mu_{0} + D_{i} (\mu_{1} - \mu_{0}) + \{\nu_{0} + D_{i} (\nu_{1} - \nu_{0})\}\end{aligned}$$`
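
---
### Regression for Causal Inference

The decomposition can be checked numerically. The sketch below is only an illustration under stated assumptions: the potential outcomes are simulated, treatment is randomly assigned, and sample means stand in for the expectations `$\mu_{0}$` and `$\mu_{1}$`. All object names (`y_c`, `y_t`, `nu_0`, and so on) are hypothetical.

``` r
set.seed(406)
n <- 10000

# Hypothetical potential outcomes for every case
y_c <- rnorm(n, mean = 2, sd = 1)
y_t <- y_c + rnorm(n, mean = 1.5, sd = 1)  # unit-level effects vary
d   <- rbinom(n, size = 1, prob = 0.5)     # randomly assigned treatment

# Observed outcome: treated cases reveal y_t, control cases reveal y_c
y <- ifelse(d == 1, y_t, y_c)

# Pieces of the decomposition (sample means stand in for expectations)
mu_0 <- mean(y_c)
mu_1 <- mean(y_t)
nu_0 <- y_c - mu_0
nu_1 <- y_t - mu_1

# Rebuild Y from the last line of the decomposition
y_rebuilt <- mu_0 + d * (mu_1 - mu_0) + (nu_0 + d * (nu_1 - nu_0))
all.equal(y, y_rebuilt)

# Under random assignment the slope on d should be close to mu_1 - mu_0
fit <- lm(y ~ d)
c(slope = unname(coef(fit)["d"]), sample_ate = mu_1 - mu_0)
```

The identity holds for any assignment rule; only the last comparison relies on `d` being unrelated to the error term in braces.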

---
### Regression for Causal Inference

Imagine that there is a single confounding variable, `$X$`, which is
dichotomous.

---
### Regression for Causal Inference

What can we say about a bivariate regression of `$Y$` on `$D$` only for
cases with `$X=0$`?

---
### Regression for Causal Inference

`$$\begin{aligned}
(Y_{i}| X_{i}=0) = \\
(\mu_{0} + D_{i} (\mu_{1} - \mu_{0}) + \{\nu_{0} + D_{i} (\nu_{1} - \nu_{0})\}|X_{i}=0)\end{aligned}$$`

---
### Regression for Causal Inference

What about this regression?

`$$\begin{aligned}
Y_{i} & = \beta_{0} + D_{i} \beta_{1} + X_{i} \beta_{2} + X_{i} D_{i} \beta_{3} + \epsilon_{i}\end{aligned}$$`

---
### Regression for Causal Inference

`$$\begin{aligned}
        Y_{i: X_{i} = 0} & = \beta_{0} + D_{i: X_{i} = 0} \beta_{1} + \epsilon_{i: X_{i} = 0}
        \end{aligned}$$`

---
### Regression for Causal Inference

`$$\begin{aligned}
        Y_{i: X_{i} = 1} & = \beta_{2} + D_{i: X_{i} = 1} \beta_{3} + \delta_{i: X_{i} = 1}
        \end{aligned}$$`

---
### Regression for Causal Inference

`$$\begin{aligned}
        Y_{i} = & (X_{i}) (\beta_{2} + D_{i} \beta_{3} + \delta_{i}) + \\
              & (1 - X_{i})(\beta_{0} + D_{i} \beta_{1} + \epsilon_{i})
        \end{aligned}$$`

---
### Regression for Causal Inference

`$$\begin{aligned}
        Y_{i} = & \beta_{0} + D_{i} \beta_{1} + (X_{i} \beta_{2} - X_{i} \beta_{0}) + \\
                  & X_{i} (D_{i} \beta_{3} - D_{i} \beta_{1}) + \\
        & \epsilon_{i} + X_{i} (\delta_{i} - \epsilon_{i})
                \end{aligned}$$`
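
---
### Regression for Causal Inference

The algebra above says that one pooled regression with an interaction reproduces the two stratum-specific regressions. A minimal sketch with simulated data (all names and parameter values are hypothetical): in the slide's notation, the pooled coefficient on `$X$` plays the role of `$\beta_{2} - \beta_{0}$` and the interaction coefficient plays the role of `$\beta_{3} - \beta_{1}$`.

``` r
set.seed(406)
n <- 2000
x <- rbinom(n, 1, 0.4)                       # dichotomous confounder
d <- rbinom(n, 1, ifelse(x == 1, 0.7, 0.3))  # treatment depends on x
y <- 1 + 2 * d + 0.5 * x + 1.5 * x * d + rnorm(n)

# Separate bivariate regressions within each stratum of x
fit_x0 <- lm(y ~ d, subset = x == 0)
fit_x1 <- lm(y ~ d, subset = x == 1)

# One pooled regression with the interaction
fit_int <- lm(y ~ d * x)
b <- coef(fit_int)

# Pooled coefficients recombined into stratum-specific intercepts and slopes
pooled <- rbind(
  x0 = c(intercept = unname(b["(Intercept)"]), slope = unname(b["d"])),
  x1 = c(intercept = unname(b["(Intercept)"] + b["x"]),
         slope     = unname(b["d"] + b["d:x"]))
)
stratified <- rbind(x0 = unname(coef(fit_x0)), x1 = unname(coef(fit_x1)))

pooled
stratified
```

The two matrices agree up to floating-point error, because the interacted model is saturated in `d` and `x`.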

---
### Regression for Causal Inference

-   A collection of control variables `$\mathbb{X}$` will allow regression
    to produce an unbiased estimate of `$(\mu_{1} - \mu_{0})$` when:

1.  `$D$` is uncorrelated with
        `$\{\nu_{0} + D_{i} (\nu_{1} - \nu_{0})\}$` within each group
        defined by `$\mathbb{X}$`, and

2.  the residual causal effect is not correlated with `$\mathbb{X}$`,
        and

3.  a fully flexible parameterization of `$\mathbb{X}$` and `$D$` is used.

---
### Regression for Causal Inference

-   These conditions imply that:

1.  no element of `$\mathbb{X}$` is on any causal path from
        `$\mathbf{D}$` to `$\mathbf{y}$`,

2.  no element of `$\mathbb{X}$` is a collider, that is, caused both by
        `$\mathbf{D}$` (or one of its unmeasured causes) *and* by
        `$\mathbf{y}$` (or some other cause of `$\mathbf{y}$`),

3.  all causes of `$\mathbf{D}$` that are also causes of `$\mathbf{y}$`
        have some element of `$\mathbb{X}$` somewhere on the causal path
        from the unmeasured initial cause to either `$\mathbf{D}$` or
        `$\mathbf{y}$`, and

---
### Regression for Causal Inference

-   These conditions imply that:

4\.  no element of `$\mathbb{X}$` causes `$\mathbf{D}$` without also having an independent causal pathway to `$\mathbf{y}$`.
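
---
### Regression for Causal Inference

A small simulation can make these conditions concrete. This is a sketch under assumed linear relationships, with hypothetical variables: `conf` is a common cause of treatment and outcome, `med` sits on the causal path, and `coll` is caused by both treatment and outcome. The total effect of `d` on `y` is set to 1.5.

``` r
set.seed(406)
n <- 50000

conf <- rnorm(n)                      # common cause of d and y
d    <- 0.8 * conf + rnorm(n)
med  <- 0.5 * d + rnorm(n)            # on the causal path from d to y
y    <- 1 * d + 1 * med + 1 * conf + rnorm(n)  # total effect: 1 + 0.5 = 1.5
coll <- d + y + rnorm(n)              # caused by both d and y

# Helper: coefficient on d from a given specification
slope_d <- function(formula) unname(coef(lm(formula))["d"])

estimates <- c(
  no_controls   = slope_d(y ~ d),
  confounder    = slope_d(y ~ d + conf),
  plus_mediator = slope_d(y ~ d + conf + med),
  plus_collider = slope_d(y ~ d + conf + coll)
)
round(estimates, 2)
```

In this setup only the specification that controls for `conf` alone should land near 1.5. Adding the mediator removes the indirect part of the effect, and adding the collider distorts the estimate badly.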

---
<img src="images/goodcontrol.jpg" width="90%" />

---
<img src="images/irrelevantcontrol1.jpg" width="90%" />

---
<img src="images/harmfulcontrol1.jpg" width="90%" />

---
<img src="images/harmfulcontrol2.jpg" width="90%" />

---
<img src="images/harmfulcontrol3.jpg" width="90%" />

---
<img src="images/luputitleauthor.JPG" width="90%" />

---
<img src="images/lupu1.JPG" width="90%" />

---
<img src="images/lupu2.JPG" width="90%" />

---
<img src="images/lupu3.JPG" width="90%" />

---
<img src="images/lupu4.JPG" width="70%" />

---

``` r
library(dagitty)
```

---

``` r
LupuPeisahkinDAG1 <- dagitty( "dag {Dekulakized -> Victimization PreSovietWealth -> Victimization SovietOpposition -> Victimization PreSovietReligiosity -> Victimization PriorRegion -> Victimization DeportationRegion -> Victimization DeportationRegion -> Religiosity PriorRegion -> Religiosity PreSovietReligiosity -> Religiosity SovietOpposition -> Religiosity PreSovietWealth -> Religiosity Victimization -> Religiosity Dekulakized -> Religiosity}" )
```

---

``` r
plot( LupuPeisahkinDAG1 )
```

```
## Plot coordinates for graph not supplied! Generating coordinates, see ?coordinates for how to set your own.
```

---

``` r
LupuPeisahkinDAG2 <- dagitty( "dag {Dekulakized <- Victimization PreSovietWealth -> Victimization SovietOpposition -> Victimization PreSovietReligiosity -> Victimization PriorRegion -> Victimization DeportationRegion <- Victimization DeportationRegion -> Religiosity PriorRegion -> Religiosity PreSovietReligiosity -> Religiosity SovietOpposition -> Religiosity PreSovietWealth -> Religiosity Victimization -> Religiosity Dekulakized -> Religiosity}" )
```

---

``` r
plot( LupuPeisahkinDAG2 )
```

```
## Plot coordinates for graph not supplied! Generating coordinates, see ?coordinates for how to set your own.
```

---

``` r
adjustmentSets(LupuPeisahkinDAG1, exposure = "Victimization", outcome = "Religiosity")
```

```
## { Dekulakized, DeportationRegion, PreSovietReligiosity,
##   PreSovietWealth, PriorRegion, SovietOpposition }
```

``` r
adjustmentSets(LupuPeisahkinDAG2, exposure = "Victimization", outcome = "Religiosity")
```

```
## { PreSovietReligiosity, PreSovietWealth, PriorRegion, SovietOpposition
##   }
```

---

``` r
instrumentalVariables(LupuPeisahkinDAG1, exposure = "Victimization", outcome = "Religiosity")

instrumentalVariables(LupuPeisahkinDAG2, exposure = "Victimization", outcome = "Religiosity")
```

---

``` r
impliedConditionalIndependencies(LupuPeisahkinDAG1)
```

```
## Dklk _||_ DprR
## Dklk _||_ PrSR
## Dklk _||_ PrSW
## Dklk _||_ PrrR
## Dklk _||_ SvtO
## DprR _||_ PrSR
## DprR _||_ PrSW
## DprR _||_ PrrR
## DprR _||_ SvtO
## PrSR _||_ PrSW
## PrSR _||_ PrrR
## PrSR _||_ SvtO
## PrSW _||_ PrrR
## PrSW _||_ SvtO
## PrrR _||_ SvtO
```

---

``` r
library(tidyverse)
```

```
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ lubridate 1.9.3     ✔ tibble    3.2.1
## ✔ purrr     1.0.2     ✔ tidyr     1.3.1
## ✔ readr     2.1.5     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ nlme::collapse() masks dplyr::collapse()
## ✖ dplyr::filter()  masks stats::filter()
## ✖ dplyr::lag()     masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
```

---

``` r
qog_std_ts_jan22 <- read_csv("https://github.com/jnseawright/PS406/raw/main/data/qog_std_ts_jan22.csv")
```

```
## Warning: One or more parsing issues, call `problems()` on your data frame for details,
## e.g.:
##   dat <- vroom(...)
##   problems(dat)
```

```
## Rows: 15168 Columns: 1915
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr    (7): cname, cname_qog, ccodealp, version, cname_year, ccodealp_year, ...
## dbl (1905): ccode, year, ccode_qog, ccodecow, aid_cpnc, aid_cpsc, aid_crnc, ...
## lgl    (3): psi_cpsi2, psi_edate2, psi_psi2
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
```

``` r
qogdems2000 <- qog_std_ts_jan22 %>% 
  filter(year==2000 & vdem_libdem > 0.5)

qog2000demsin2020 <- qog_std_ts_jan22 %>% 
  filter(year==2020 & cname %in% qogdems2000$cname)

preslm <- lm(vdem_libdem ~ br_pres, data=qog2000demsin2020)
```

---

``` r
library(jtools)  # provides summ()
summ(preslm)
```

---

``` r
preslm2 <- lm(vdem_libdem ~ br_pres + as.factor(ht_colonial) +  ccp_systyear + br_pvote, data=qog2000demsin2020)
```

---

``` r
summ(preslm2)
```

<table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;">
<tbody>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Observations </td>
   <td style="text-align:right;"> 58 (4 missing obs. deleted) </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Dependent variable </td>
   <td style="text-align:right;"> vdem_libdem </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Type </td>
   <td style="text-align:right;"> OLS linear regression </td>
  </tr>
</tbody>
</table> <table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;">
<tbody>
  <tr>
   <td style="text-align:left;font-weight: bold;"> F(9,48) </td>
   <td style="text-align:right;"> 3.30 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> R² </td>
   <td style="text-align:right;"> 0.38 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Adj. R² </td>
   <td style="text-align:right;"> 0.27 </td>
  </tr>
</tbody>
</table> <table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;border-bottom: 0;">
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Est. </th>
   <th style="text-align:right;"> S.E. </th>
   <th style="text-align:right;"> t val. </th>
   <th style="text-align:right;"> p </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;font-weight: bold;"> (Intercept) </td>
   <td style="text-align:right;"> 2.50 </td>
   <td style="text-align:right;"> 0.76 </td>
   <td style="text-align:right;"> 3.29 </td>
   <td style="text-align:right;"> 0.00 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> br_pres </td>
   <td style="text-align:right;"> 0.03 </td>
   <td style="text-align:right;"> 0.04 </td>
   <td style="text-align:right;"> 0.76 </td>
   <td style="text-align:right;"> 0.45 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> as.factor(ht_colonial)1 </td>
   <td style="text-align:right;"> -0.31 </td>
   <td style="text-align:right;"> 0.14 </td>
   <td style="text-align:right;"> -2.26 </td>
   <td style="text-align:right;"> 0.03 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> as.factor(ht_colonial)2 </td>
   <td style="text-align:right;"> -0.09 </td>
   <td style="text-align:right;"> 0.07 </td>
   <td style="text-align:right;"> -1.42 </td>
   <td style="text-align:right;"> 0.16 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> as.factor(ht_colonial)5 </td>
   <td style="text-align:right;"> -0.12 </td>
   <td style="text-align:right;"> 0.05 </td>
   <td style="text-align:right;"> -2.28 </td>
   <td style="text-align:right;"> 0.03 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> as.factor(ht_colonial)6 </td>
   <td style="text-align:right;"> -0.24 </td>
   <td style="text-align:right;"> 0.11 </td>
   <td style="text-align:right;"> -2.26 </td>
   <td style="text-align:right;"> 0.03 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> as.factor(ht_colonial)7 </td>
   <td style="text-align:right;"> -0.13 </td>
   <td style="text-align:right;"> 0.08 </td>
   <td style="text-align:right;"> -1.59 </td>
   <td style="text-align:right;"> 0.12 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> as.factor(ht_colonial)9 </td>
   <td style="text-align:right;"> -0.07 </td>
   <td style="text-align:right;"> 0.13 </td>
   <td style="text-align:right;"> -0.49 </td>
   <td style="text-align:right;"> 0.63 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> ccp_systyear </td>
   <td style="text-align:right;"> -0.00 </td>
   <td style="text-align:right;"> 0.00 </td>
   <td style="text-align:right;"> -2.36 </td>
   <td style="text-align:right;"> 0.02 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> br_pvote </td>
   <td style="text-align:right;"> 0.04 </td>
   <td style="text-align:right;"> 0.05 </td>
   <td style="text-align:right;"> 0.95 </td>
   <td style="text-align:right;"> 0.35 </td>
  </tr>
</tbody>
<tfoot><tr><td style="padding: 0; " colspan="100%">
<sup></sup> Standard errors: OLS</td></tr></tfoot>
</table>

---
### How Much Does It Hurt to Be Wrong?

Suppose we know we can get a good causal inference from:

`$$Y_{i}=\beta_{1}+\beta_{2}D_{i}+\beta_{3}X_{i}+\beta_{4}W_{i}+\epsilon_{i}$$`
---

But instead we estimate:

`$$Y_{i}=\beta^{*}_{1}+\beta^{*}_{2}D_{i}+\beta^{*}_{3}X_{i}+\epsilon^{*}_{i}$$`
---

`$$E[\hat{\beta^*}_{2}]=\beta_{2}+\beta_{4} b_{4,2}$$`
`$$b_{4,2} = \frac{r_{4,2}-r_{3,2}r_{4,3}}{1-r_{3,2}^2}\sqrt{\frac{V_{4}}{V_{2}}}$$`
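
---
### How Much Does It Hurt to Be Wrong?

The expression for `$b_{4,2}$` can be verified in a sample, where the relationship between the short and long regressions holds exactly for the estimated coefficients. A minimal sketch with simulated data; the data-generating values are arbitrary assumptions, and subscripts follow the slide (2 = `$D$`, 3 = `$X$`, 4 = `$W$`).

``` r
set.seed(406)
n <- 5000
x <- rnorm(n)
w <- 0.6 * x + rnorm(n)
d <- 0.5 * x + 0.4 * w + rnorm(n)
y <- 1 + 2 * d + 1 * x + 1.5 * w + rnorm(n)

long  <- coef(lm(y ~ d + x + w))  # the regression assumed to be correct
short <- coef(lm(y ~ d + x))      # the regression that leaves out w

# Sample correlations and variances entering b_{4,2}
r42 <- cor(w, d)
r32 <- cor(x, d)
r43 <- cor(w, x)
b42 <- (r42 - r32 * r43) / (1 - r32^2) * sqrt(var(w) / var(d))

c(short_estimate = unname(short["d"]),
  long_plus_bias = unname(long["d"] + long["w"] * b42))
```

Here `b42` is the coefficient on `d` from regressing `w` on `d` and `x`, written in terms of correlations and variances.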

---

What if we estimate:

`$$Y_{i}=\beta^{+}_{1}+\beta^{+}_{2}D_{i}+\epsilon^{+}_{i}$$`
---

`$$E[\hat{\beta^+}_{2}]=\beta_{2}+\beta_{3} r_{2,3}\sqrt{\frac{V_{3}}{V_{2}}} + \beta_{4} r_{2,4}\sqrt{\frac{V_{4}}{V_{2}}}$$`
---

Is `$E[\hat{\beta^+}_{2}]$` closer to zero or further away than `$E[\hat{\beta^*}_{2}]$`?
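
---
### How Much Does It Hurt to Be Wrong?

The answer depends on the signs and sizes of the omitted-variable terms. The sketch below builds one hypothetical case in which the biases from omitting `$X$` and `$W$` have opposite signs and offset each other, so that adding `$X$` alone moves the estimate away from the truth. The coefficients are chosen purely for illustration, and the true coefficient on `d` is 1.

``` r
set.seed(406)
n <- 100000
x <- rnorm(n)
w <- rnorm(n)
d <- 0.5 * x + 0.5 * w + rnorm(n)
y <- 1 * d + 1 * x - 1 * w + rnorm(n)  # x and w push y in opposite directions

est <- c(
  full     = unname(coef(lm(y ~ d + x + w))["d"]),
  omit_w   = unname(coef(lm(y ~ d + x))["d"]),
  omit_x_w = unname(coef(lm(y ~ d))["d"])
)
round(est, 2)
```

In this design the bivariate regression is roughly unbiased while the regression that controls for `x` only is not. Other parameter choices reverse the ranking, which is the point: a partial set of controls is not guaranteed to reduce bias.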

---
<img src="images/Clarke1.png" width="90%" />

---
<img src="images/Clarke2.png" width="90%" />

---
### Regression for Causal Inference

-   If the causal effect is not constant across all cases, regression
    will not give a consistent estimate of the average treatment effect.

-   Instead, it estimates a weighted average of cases' treatment
    effects, where the weights reflect how much variation in treatment
    remains after adjusting for the covariates.
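
---
### Regression for Causal Inference

With a single dichotomous control entered as a stratum indicator, the weighting can be computed by hand: each stratum's difference in means receives weight proportional to `$n_{x} \hat{p}_{x} (1 - \hat{p}_{x})$`, where `$\hat{p}_{x}$` is the treated share in the stratum. The sketch below uses simulated data with hypothetical values and compares that weighted average with the sample-size-weighted average that corresponds to the ATE.

``` r
set.seed(406)
n <- 4000
x <- rbinom(n, 1, 0.5)
d <- rbinom(n, 1, ifelse(x == 1, 0.9, 0.5))       # treated share differs by stratum
y <- 1 + x + d * ifelse(x == 1, 4, 1) + rnorm(n)  # effect is 1 if x = 0, 4 if x = 1

ols <- unname(coef(lm(y ~ d + x))["d"])

# Stratum-level ingredients
strata <- split(data.frame(y, d), x)
tau_x  <- sapply(strata, function(s) mean(s$y[s$d == 1]) - mean(s$y[s$d == 0]))
n_x    <- sapply(strata, nrow)
p_x    <- sapply(strata, function(s) mean(s$d))

ate_weights <- n_x / sum(n_x)
ols_weights <- n_x * p_x * (1 - p_x) / sum(n_x * p_x * (1 - p_x))

c(ols                  = ols,
  variance_weighted    = sum(ols_weights * tau_x),
  sample_size_weighted = sum(ate_weights * tau_x))
```

The first two numbers coincide. The third is larger here because the stratum with the bigger effect has little variation in treatment and is therefore down-weighted by regression.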

---
### Aronow and Samii 2016

---
### Chattopadhyay and Zubizarreta 2023

`$$\hat{\tau}_{OLS} = \sum_{i:D_{i}=1} w_{i} Y_{i} - \sum_{i:D_{i}=0} w_{i} Y_{i}$$`
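
---
### Chattopadhyay and Zubizarreta 2023

One way to see where such weights come from is the Frisch-Waugh-Lovell theorem: residualize the treatment on the covariates and rescale. The sketch below is a base-R illustration with simulated data and hypothetical names. It does not use the `lmw` package and is not claimed to reproduce its internals.

``` r
set.seed(406)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
d  <- rbinom(n, 1, plogis(0.5 * x1 - 0.5 * x2))
y  <- 1 + 2 * d + x1 + x2 + rnorm(n)

tau_ols <- unname(coef(lm(y ~ d + x1 + x2))["d"])

# Residualize the treatment on the covariates
d_res <- resid(lm(d ~ x1 + x2))

# Implied weights, signed so that each group's weights sum to one
w <- (2 * d - 1) * d_res / sum(d_res^2)

tau_weighted <- sum(w[d == 1] * y[d == 1]) - sum(w[d == 0] * y[d == 0])

c(tau_ols = tau_ols, tau_weighted = tau_weighted)
c(sum_treated = sum(w[d == 1]), sum_control = sum(w[d == 0]))
range(w)  # individual weights are not guaranteed to be positive
```

The weights depend only on the treatment and the covariates, not on the outcome, so they can be inspected before any outcome data are analyzed.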

---
### Chattopadhyay and Zubizarreta 2023

A different set of weights comes from "multi-regression imputation" (MRI):

`$$\hat{\tau}_{MRI} = \sum_{i:D_{i}=1} w^{MRI}_{i}(\bar{X}) Y_{i} - \sum_{i:D_{i}=0} w^{MRI}_{i}(\bar{X}) Y_{i}$$`

`$$w^{MRI}_{i}(x) = n^{-1}_{T} + (X_{i} - \bar{X}_{T})^{\top}S^{-1}_{T}(x - \bar{X}_{T})$$`
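
---
### Chattopadhyay and Zubizarreta 2023

The weight formula can be checked against the imputation it is named for: fit a separate regression in each treatment group and predict both at `$\bar{X}$`. The sketch below rests on two assumptions that are not spelled out on the slide: `$S_{T}$` is taken to be the within-group scatter matrix (sum of outer products of centered covariates), and the control group uses the same formula with its own mean and scatter matrix. The helper `mri_weights()` is hypothetical.

``` r
set.seed(406)
n <- 500
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
d <- rbinom(n, 1, plogis(0.5 * X[, "x1"]))
y <- 1 + d * (2 + X[, "x1"]) + X[, "x2"] + rnorm(n)

# Weights for one group, evaluated at covariate profile `target`
mri_weights <- function(Xg, target) {
  centered <- sweep(Xg, 2, colMeans(Xg))  # X_i minus the group mean
  S <- crossprod(centered)                # assumed scatter matrix
  drop(1 / nrow(Xg) + centered %*% solve(S, target - colMeans(Xg)))
}

x_bar <- colMeans(X)
w_t <- mri_weights(X[d == 1, ], x_bar)
w_c <- mri_weights(X[d == 0, ], x_bar)
tau_mri <- sum(w_t * y[d == 1]) - sum(w_c * y[d == 0])

# Same quantity from two group-specific regressions, predicted at x_bar
dat <- data.frame(y, d, X)
new <- data.frame(x1 = x_bar[["x1"]], x2 = x_bar[["x2"]])
tau_imputed <- predict(lm(y ~ x1 + x2, data = dat[dat$d == 1, ]), new) -
  predict(lm(y ~ x1 + x2, data = dat[dat$d == 0, ]), new)

c(tau_mri = tau_mri, tau_imputed = unname(tau_imputed))
```

Under these assumptions the two numbers agree, and the weights in each group sum to one.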

---
### Chattopadhyay and Zubizarreta 2023

If there is causal heterogeneity, the estimator built from the OLS weights is biased for the ATE, but the estimator built from the MRI weights can be unbiased.
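
---
### Chattopadhyay and Zubizarreta 2023

A simulated comparison, offered as a sketch rather than a general result: effects vary linearly with a covariate, and treatment is more common where effects are large. The additive regression corresponds to the single-regression weights. The regression that interacts treatment with the centered covariate is equivalent to imputing from separate group regressions at the covariate mean. All values are hypothetical.

``` r
set.seed(406)
n <- 20000
x <- rnorm(n)
d <- rbinom(n, 1, plogis(1 + 1.5 * x))  # treatment is more common at high x
tau_i <- 1 + 2 * x                      # unit-level effects vary with x
y <- x + d * tau_i + rnorm(n)

xc <- x - mean(x)  # centering makes the d coefficient refer to the mean of x

est <- c(
  sample_ate     = mean(tau_i),
  ols_additive   = unname(coef(lm(y ~ d + x))["d"]),
  ols_interacted = unname(coef(lm(y ~ d * xc))["d"])
)
round(est, 2)
```

The interacted estimate is close to the sample ATE here because the outcome model is linear within each group by construction. With a misspecified outcome model that guarantee is lost, which is why the MRI estimator "can be" rather than "is" unbiased.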

---

``` r
library(lmw)

qog2000demsin2020clean <- qog2000demsin2020 %>% filter(!is.na(br_pres) & !is.na(ccp_systyear) & 
                                                         !is.na(ht_colonial) & !is.na(br_pvote))
```

---

``` r
preslmw.uri <- lmw(~ br_pres + as.factor(ht_colonial) + br_pvote, data = qog2000demsin2020clean,
                estimand = "ATT", method = "URI", treat = "br_pres")

preslmw.urifit <- lmw_est(preslmw.uri, outcome = "vdem_libdem")
```

```
## Warning in meatHC(x, type = type, omega = omega): HC3 covariances are
## numerically unstable for hat values close to 1 (and undefined if exactly 1) as
## for observation(s) 29, 44
```

``` r
summary(preslmw.urifit)
```

```
## 
## Effect estimates:
##              Estimate Std. Error 95% CI L 95% CI U t value Pr(>|t|)
## E[Y₁-Y₀|A=1]  0.02045    0.03772 -0.05535  0.09625   0.542     0.59
## 
## Residual standard error: 0.1371 on 49 degrees of freedom
```

---

``` r
preslmw.mri <- lmw(~ br_pres + as.factor(ht_colonial) + br_pvote, data = qog2000demsin2020clean,
                   estimand = "ATT", method = "MRI", treat = "br_pres")

preslmw.mrifit <- lmw_est(preslmw.mri, outcome = "vdem_libdem")
```

```
## Warning in meatHC(x, type = type, omega = omega): HC3 covariances are
## numerically unstable for hat values close to 1 (and undefined if exactly 1) as
## for observation(s) 11, 29, 44
```

``` r
summary(preslmw.mrifit)
```

```
## 
## Effect estimates:
##              Estimate Std. Error 95% CI L 95% CI U t value Pr(>|t|)
## E[Y₁-Y₀|A=1]  0.02144    0.03880 -0.05667  0.09955   0.552    0.583
## 
## Residual standard error: 0.1396 on 46 degrees of freedom
## 
## Potential outcome means:
##           Estimate Std. Error 95% CI L 95% CI U t value Pr(>|t|)    
## E[Y₀|A=1]  0.62856    0.03964  0.54877  0.70835   15.86   <2e-16 ***
## E[Y₁|A=1]  0.65000    0.03184  0.58590  0.71410   20.41   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

---

``` r
plot(preslmw.mri)
```

---

``` r
plot(preslmw.mri, type="extrapolation", variables=~ht_colonial)
```

---

``` r
plot(preslmw.mri, type="influence", outcome="vdem_libdem")
```

---

``` r
qog2000demsin2020clean$cname[47]
```

```
## [1] "Poland"
```

``` r
qog2000demsin2020clean$br_pres[47]
```

```
## [1] 1
```

``` r
qog2000demsin2020clean$ht_colonial[47]
```

```
## [1] 0
```

``` r
qog2000demsin2020clean$ccp_systyear[47]
```

```
## [1] 1997
```

``` r
qog2000demsin2020clean$br_pvote[47]
```

```
## [1] 1
```

---

``` r
qog2000demsin2020clean$vdem_libdem[47]
```

```
## [1] 0.487
```

---

``` r
hist(qog2000demsin2020clean$vdem_libdem[qog2000demsin2020clean$ht_colonial==0])
```

---

``` r
with(qog2000demsin2020clean, table(ht_colonial, br_pres))
```

```
##            br_pres
## ht_colonial  0  1
##           0 24 11
##           1  0  1
##           2  0  6
##           5  5  5
##           6  0  2
##           7  1  2
##           9  1  0
```

---
<img src="images/htcolonial.png" width="90%" />

---

``` r
qog2000demsin2020nopoland <- qog2000demsin2020clean %>% filter(cname != "Poland")

preslmw.mrinopoland <- lmw(~ br_pres + as.factor(ht_colonial) +  
                           br_pvote, 
                         data = qog2000demsin2020nopoland,
                   estimand = "ATT", method = "MRI", treat = "br_pres")

preslmw.mrifitnopoland <- lmw_est(preslmw.mrinopoland, outcome = "vdem_libdem")
```

```
## Warning in meatHC(x, type = type, omega = omega): HC3 covariances are
## numerically unstable for hat values close to 1 (and undefined if exactly 1) as
## for observation(s) 11, 29, 44
```

``` r
summary(preslmw.mrifitnopoland)
```

```
## 
## Effect estimates:
##      Estimate Std. Error 95% CI L 95% CI U t value Pr(>|t|)
## 
## Residual standard error: 0.1336 on 45 degrees of freedom
## 
## Potential outcome means:
##      Estimate Std. Error 95% CI L 95% CI U t value Pr(>|t|)
```