1: Potential Outcomes and Experiments

class: center, middle, inverse, title-slide

.title[
# 1: Potential Outcomes and Experiments
]
.subtitle[
## Quantitative Causal Inference
]
.author[
### <large>J. Seawright</large>
]
.institute[
### <small>Northwestern Political Science</small>
]
.date[
### April 3, 2025
]

---

class: center, middle

pre[class] {
  max-height: 200px;
}
</style>

---
### Business

-   Presenting an estimator

-   Weekly lab assignments

-   Final grant proposal

---
### What makes an experiment work?

---
### Potential Outcomes

-   Let's build a mathematical model of one person participating in an experiment.

---
### Potential Outcomes

-   Suppose we name that person `$i$`.

-   The person we're interested in has an outcome of 1 when in the control group, and
    0 when in the treatment group.

---
### Potential Outcomes

-   `$y_{i,c} = 0$`

-   `$y_{i,t} = 1$`

---
### The Potential Outcomes Framework

-   We are interested in the effects of a dichotomous treatment (i.e.,
    independent variable): whether person got the treatment (t) or the
    control (c).

-   This variable can be written as `$W_{i} = (t,c)$`.

---
### The Potential Outcomes Framework

-   For a given case, `$i$`, we either observe `$W_{i} = t$` or `$W_{i} = c$`.
    If `$W_{i} = t$`, let us denote the value of the dependent variable as
    `$y_{i,t}$`. If `$W_{i} = c$`, let us denote the value of the dependent
    variable as `$y_{i,c}$`

---
### The Potential Outcomes Framework

-   The causal effect of `$W$` on `$y$` is:

-   `$y_{i,t} - y_{i,c}$`
    
    For hypothetical person `$i$` above, this effect is `$1 - 0 = 1$`.

---
### The Average Treatment Effect

-   Sometimes, we are interested in developing an estimate of the effect
    of `$W$` on `$y$` in some population `$\Pi$`, from which we have a random
    sample (or even the whole population) split randomly into treatment
    and control cases.

-   Here, interest focuses on the "average treatment effect":

-   `$E(y_{i,t}) - E(y_{i,c})$`

---
### The Average Treatment Effect

What helps us estimate the average treatment effect well?

---
### Assignment Mechanisms

-   `$\mathbb{W}$` has a probability distribution.

-   `$Pr(\mathbb{W} | \mathbb{X}, \mathbb{Y}_{0}, \mathbb{Y}_{1})$`

`$$p_{i}(\mathbb{X}, \mathbb{Y}_{0}, \mathbb{Y}_{1}) = \sum_{\mathbb{W}: W_{i} = 1} Pr(\mathbb{W} | \mathbb{X}, \mathbb{Y}_{0}, \mathbb{Y}_{1})$$`

`$\mathbb{W}$`, `$\mathbb{X}$`, `$\mathbb{Y}_{0}$`, and `$\mathbb{Y}_{1}$` take
on joint values that are drawn from some describable set of
possibilities.

---
### Individualistic Assignment

1.  There exists a function `$q()$` that is bounded between 0 and 1, such
    that
    `$p_{i}(\mathbb{X}, \mathbb{Y}_{0}, \mathbb{Y}_{1}) = q(X_{i}, Y_{0,i}, Y_{1,i})$`

2.  `$Pr(\mathbb{W} | \mathbb{X}, \mathbb{Y}_{0}, \mathbb{Y}_{1})$` is the
    product of those individual probabilities.

---
### Probabilistic Assignment

1.  For all permissible values of `$\mathbb{X}$`, `$\mathbb{Y}_{0}$`, and
    `$\mathbb{Y}_{1}$`,
    `$0 < p_{i}(\mathbb{X}, \mathbb{Y}_{0}, \mathbb{Y}_{1}) < 1$`.

---
### Unconfounded Assignment

1.  `$Pr(\mathbb{W} | \mathbb{X}, \mathbb{Y}_{0}, \mathbb{Y}_{1}) = Pr(\mathbb{W} | \mathbb{X}, \mathbb{Y}^{'}_{0}, \mathbb{Y}^{'}_{1})$`

---
### Experiments and Causal Inference

-   Under individualistic, probabilistic, and unconfounded assignment, the set of cases
    where `$W_{i} = t$` produces a random sample from the population of
    `$y_{t}$`. Likewise, the set of cases where `$W_{i} = c$` produces a
    random sample from the population of `$y_{c}$`. Thus:

-   `$E_{t}(y_{i,t}) = E(y_{i,t})$`

-   `$E_{c}(y_{i,c}) = E(y_{i,c})$`

-   `$E(y_{i,t}) - E(y_{i,c}) = E_{t}(y_{i,t}) - E_{c}(y_{i,c})$`

---

``` r
mean(peruemotions$outsidervote[peruemotions$simpletreat==1])
```

```
## [1] 0.6092715
```

``` r
mean(peruemotions$outsidervote[peruemotions$simpletreat==0])
```

```
## [1] 0.4916388
```

``` r
mean(peruemotions$outsidervote[peruemotions$simpletreat==1]) - mean(peruemotions$outsidervote[peruemotions$simpletreat==0])
```

```
## [1] 0.1176327
```

---

``` r
summ(lm(outsidervote ~ simpletreat, data=peruemotions))
```

<table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;">
<tbody>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Observations </td>
   <td style="text-align:right;"> 450 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Dependent variable </td>
   <td style="text-align:right;"> outsidervote </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Type </td>
   <td style="text-align:right;"> OLS linear regression </td>
  </tr>
</tbody>
</table> <table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;">
<tbody>
  <tr>
   <td style="text-align:left;font-weight: bold;"> F(1,448) </td>
   <td style="text-align:right;"> 5.62 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> R² </td>
   <td style="text-align:right;"> 0.01 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Adj. R² </td>
   <td style="text-align:right;"> 0.01 </td>
  </tr>
</tbody>
</table> <table class="table table-striped table-hover table-condensed table-responsive" style="width: auto !important; margin-left: auto; margin-right: auto;border-bottom: 0;">
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Est. </th>
   <th style="text-align:right;"> S.E. </th>
   <th style="text-align:right;"> t val. </th>
   <th style="text-align:right;"> p </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;font-weight: bold;"> (Intercept) </td>
   <td style="text-align:right;"> 0.49 </td>
   <td style="text-align:right;"> 0.03 </td>
   <td style="text-align:right;"> 17.10 </td>
   <td style="text-align:right;"> 0.00 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> simpletreat </td>
   <td style="text-align:right;"> 0.12 </td>
   <td style="text-align:right;"> 0.05 </td>
   <td style="text-align:right;"> 2.37 </td>
   <td style="text-align:right;"> 0.02 </td>
  </tr>
</tbody>
<tfoot><tr><td style="padding: 0; " colspan="100%">
<sup></sup> Standard errors: OLS</td></tr></tfoot>
</table>

---
### Assumptions

What do we need to assume in order to get a causal inference here?

---
### Randomization Inference

-   It is clearly important to test the null hypothesis that
    `$E(y_{i,t}) = E(y_{i,c})$` for all `$i$`.

-   If this hypothesis is true, then every case's treatment assignment
    is unrelated to its outcome.

-   Thus, under the null hypothesis, it is fine for us to reassign
    treatment at random --- the outcome won't change.

---
### Randomization Inference

-   Randomization inference involves:

1.  Randomly reordering the treatment condition vector

2.  Calculating the difference between the treatment and control
        group for the new (artificial) treatment condition vector

3.  Storing the result somewhere

4.  Repeating the whole process hundreds or thousands of times.

---
### Randomization Inference

-   If the null hypothesis is true, then the distribution of simulated
    differences in means is the sampling distribution from which the
    real difference in means was drawn.

-   Therefore, a good `$P$` value for our observed difference in means is
    the proportion of simulated differences in means that are at least
    as far from 0 as the real number.

---
### Assumptions

When analyzing an experiment using randomization inference, we do *not*
need to assume that:

-   we know and can measure all (or even any!) of the confounding
    variables for the relationship of interest.

-   causal effects are additive or linear.

-   causal effects are constant across cases.

-   errors are normally distributed or heteroskedastic.

---
### Assumptions

When analyzing an experiment using randomization inference, we *do* need
to assume that:

-   SUTVA (stable unit treatment value assumption) holds.

-   Experimental/psychological realism holds.

---

``` r
library(ri2)
```

```
## Loading required package: randomizr
```

```
## Loading required package: estimatr
```

``` r
emotions_declaration <- declare_ra(N = 450, m = 151)

emotions_table <- data_frame(Z = peruemotions$simpletreat,
                             Y = peruemotions$outsidervote)
```

```
## Warning: `data_frame()` was deprecated in tibble 1.1.0.
## ℹ Please use `tibble()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
```

---

``` r
ri2_emotionsresult <- conduct_ri(
  formula = Y ~ Z,
  declaration = emotions_declaration,
  sharp_hypothesis = 0,
  data = emotions_table
)
```

---

``` r
ri2_emotionsresult
```

```
##   term  estimate two_tailed_p_value
## 1    Z 0.1176327              0.016
```

---
### Balance Testing

---

``` r
library(cobalt)
```

```
##  cobalt (Version 4.5.5, Build Date: 2024-04-02)
```

``` r
peruemotionscovs <- subset(peruemotions, select = c(Cuzco, age))
bal.tab(peruemotionscovs, treat=peruemotions$simpletreat, 
        thresholds = c(m = .1, v = 2))
```

```
## Warning: Missing values exist in the covariates. Displayed values omit these
## observations.
```

```
## Note: `s.d.denom` not specified; assuming "pooled".
```

```
## Balance Measures
##             Type Diff.Un M.Threshold.Un V.Ratio.Un V.Threshold.Un
## Cuzco     Binary  0.0598 Balanced, <0.1          .               
## age      Contin.  0.0476 Balanced, <0.1     1.0738   Balanced, <2
## age:<NA>  Binary  0.0097 Balanced, <0.1          .               
## 
## Balance tally for mean differences
##                    count
## Balanced, <0.1         3
## Not Balanced, >0.1     0
## 
## Variable with the greatest mean difference
##  Variable Diff.Un M.Threshold.Un
##     Cuzco  0.0598 Balanced, <0.1
## 
## Balance tally for variance ratios
##                  count
## Balanced, <2         1
## Not Balanced, >2     0
## 
## Variable with the greatest variance ratio
##  Variable V.Ratio.Un V.Threshold.Un
##       age     1.0738   Balanced, <2
## 
## Sample sizes
##     Control Treated
## All     299     151
```
---
### Encouragement Designs

---
### Regression in Experiments

Recall that `$W_{i}$` as an indicator of treatment assignment in an
experiment. A bivariate regression predicting the dependent variable based on `$W$`
is algebraically the same as the difference in means we've looked at above, so it 
is necessarily an okay causal inference.

---
### Regression in Experiments

Multivariate regression for experiments is *not* guaranteed to be
unbiased in the way that bivariate regression is. Multivariate
regression can be (even very badly) biased if:

1.  Some control variable included in the model is caused by the
    treatment, or

2.  The causal effect of interest is very heterogeneous across cases and
    the total number of cases is small
    
---

``` r
summary(lm(outsidervote ~ simpletreat, data=peruemotions))
```

```
## 
## Call:
## lm(formula = outsidervote ~ simpletreat, data = peruemotions)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.6093 -0.4916  0.3907  0.5084  0.5084 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.49164    0.02874  17.104   <2e-16 ***
## simpletreat  0.11763    0.04962   2.371   0.0182 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.497 on 448 degrees of freedom
## Multiple R-squared:  0.01239,	Adjusted R-squared:  0.01018 
## F-statistic:  5.62 on 1 and 448 DF,  p-value: 0.01818
```

---

``` r
summary(lm(outsidervote ~ simpletreat + risk, data=peruemotions))
```

```
## 
## Call:
## lm(formula = outsidervote ~ simpletreat + risk, data = peruemotions)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.04217 -0.34282  0.07292  0.34842  1.03717 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.0224427  0.0511549  19.987   <2e-16 ***
## simpletreat  0.0197234  0.0456917   0.432    0.666    
## risk        -0.0105961  0.0009036 -11.726   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4294 on 402 degrees of freedom
##   (45 observations deleted due to missingness)
## Multiple R-squared:  0.2624,	Adjusted R-squared:  0.2587 
## F-statistic: 71.49 on 2 and 402 DF,  p-value: < 2.2e-16
```
---
### Causal Mediation

Suppose we want to know the causal steps by which treatment affects `$Y$`.

Let `$M$` be a hypothesized mediator, i.e., a variable caused by treatment
that causes `$Y$`. Because `$M$` is affected by `$W$`, `$M$` is a sort of
dependent variable. Let us denote two potential outcomes: `$M_{i,t}$` and
`$M_{i,c}$`.

`$Y$` depends on `$W$` and `$M$`, so there are now four potential outcomes on
`$Y$`: `$Y_{i}(t,M_{i,t})$`, `$Y_{i}(t,M_{i,c})$`, `$Y_{i}(c,M_{i,t})$`, and
`$Y_{i}(c,M_{i,c})$`

---
### Causal Mediation

The two causal mediation effects for each case are:

`$$\begin{aligned}
\delta_{i,t} = Y_{i}(t,M_{i,t}) - Y_{i}(t,M_{i,c})\end{aligned}$$`

`$$\begin{aligned}
\delta_{i,c} = Y_{i}(c,M_{i,t}) - Y_{i}(c,M_{i,c})\end{aligned}$$`

---
### Causal Mediation

To make inferences, we need the assumptions:

`$$\begin{aligned}
Y_{i}(t,M_{i,t}) \perp\!\!\!\perp W_{i} | X_{i} = x\end{aligned}$$`

`$$\begin{aligned}
M_{i,t} \perp\!\!\!\perp W_{i} | X_{i} = x\end{aligned}$$`

`$$\begin{aligned}
Y_{i}(t^{'},m) \perp\!\!\!\perp M_{i,t} | (W_{i} = t, X_{i} = x) \end{aligned}$$`

---

``` r
library(mediation)
```

```
## Loading required package: MASS
```

```
## 
## Attaching package: 'MASS'
```

```
## The following object is masked from 'package:dplyr':
## 
##     select
```

```
## Loading required package: Matrix
```

```
## Loading required package: mvtnorm
```

```
## 
## Attaching package: 'mvtnorm'
```

```
## The following object is masked from 'package:jtools':
## 
##     standardize
```

```
## Loading required package: sandwich
```

```
## mediation: Causal Mediation Analysis
## Version: 4.5.0
```

---

``` r
peruemotionsmed <- with(peruemotions, na.omit(data.frame(risk=risk,
                                                       outsidervote = outsidervote,
                                                       simpletreat=simpletreat,
                                                       Cuzco=Cuzco,
                                                       age=age)))

perumed.lm1 <- lm(risk ~ simpletreat, data=peruemotionsmed)
perumed.lm2 <- lm(outsidervote ~ risk + simpletreat + Cuzco + age, data=peruemotionsmed)
perumed.out <- mediate(perumed.lm1, perumed.lm2, treat="simpletreat", mediator = "risk")
```

---

``` r
summary(perumed.out)
```

```
## 
## Causal Mediation Analysis 
## 
## Quasi-Bayesian Confidence Intervals
## 
##                Estimate 95% CI Lower 95% CI Upper p-value   
## ACME             0.0813       0.0299         0.13   0.002 **
## ADE              0.0172      -0.0761         0.11   0.716   
## Total Effect     0.0986      -0.0116         0.20   0.088 . 
## Prop. Mediated   0.7696      -2.9748         4.26   0.090 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Sample Size Used: 398 
## 
## 
## Simulations: 1000
```
---

``` r
plot(perumed.out)
```

---

``` r
perumedsens.out <- medsens(perumed.out, rho.by = 0.1, effect.type = "indirect", sims = 100)
```

---

``` r
plot(perumedsens.out)
```

---
### Heterogeneity

---

``` r
summary(lm(outsidervote ~ simpletreat + Cuzco + simpletreat:Cuzco + age + simpletreat:age, data=peruemotions))
```

```
## 
## Call:
## lm(formula = outsidervote ~ simpletreat + Cuzco + simpletreat:Cuzco + 
##     age + simpletreat:age, data = peruemotions)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.6325 -0.5234  0.3705  0.4750  0.5903 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.5045928  0.0875077   5.766 1.55e-08 ***
## simpletreat        0.1957848  0.1525120   1.284   0.1999    
## Cuzco             -0.1121455  0.0639902  -1.753   0.0804 .  
## age                0.0007842  0.0027246   0.288   0.7736    
## simpletreat:Cuzco  0.0924688  0.1065091   0.868   0.3858    
## simpletreat:age   -0.0038673  0.0045507  -0.850   0.3959    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4981 on 432 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.01819,	Adjusted R-squared:  0.006829 
## F-statistic: 1.601 on 5 and 432 DF,  p-value: 0.1585
```

---
### Population Inference

-   Unrepresentative samples

-   Let `$Z_{i}$` be a dichotomous variable that represents whether a case
    is in the experimental sample

---
### Population Inference

Assume:

`$$\begin{aligned}
f(Y_{i,1} - Y_{i,0} | Z_{i}, X_{i}) = f(Y_{i,1} - Y_{i,0} | X_{i})\end{aligned}$$`

---
### Population Inference

Assume, for all possible `$X^{*}$`:

`$$\begin{aligned}
P(Z_{i} = 1 | X_{i} = X^{*}) > 0\end{aligned}$$`

---
### Population Inference

O'Muircheartaigh and Hedges propose:

1.  Let `$\mathbf{x}$` be the collection of all observed combination of
    values `$X^{*}$`.

2.  Let `$T(x)$` be the sample average of `$Y_{i,1} - Y_{i,0}$` across all
    `$i$` such that `$X_{i} = x$`.

3.  Let `$p(x)$` be the proportion of the population with `$X_{i} = x$`.

---
### Population Inference

O'Muircheartaigh and Hedges propose:

4.  `$PATE \approx \sum_{x \in \mathbf{x}}(p(x) T(x))$`

---

``` r
library(devtools)
```

```
## Loading required package: usethis
```

``` r
if (!require("generalize")) devtools::install_github('benjamin-ackerman/generalize')
```

```
## Loading required package: generalize
```

``` r
library(generalize)
?assess
```

```
## starting httpd help server ...
```

```
##  done
```

---

``` r
?generalize
```

---
<img src="images/Bush1.png" width="90%" />

---
<img src="images/Bush2.png" width="100%" />

---
<img src="images/Bush3.png" width="100%" />

---
<img src="images/Cheema1.png" width="100%" />

---
<img src="images/Cheema2.png" width="100%" />

---
<img src="images/Cheema3.png" width="100%" />

---
<img src="images/Cheema4.png" width="100%" />