L4: Effect Modification and Interaction

class: center, middle, inverse, title-slide

# L4: Effect Modification and Interaction
### Jean Morrison
### University of Michigan
### Lecture on 2025-02-03 (updated: 2025-02-02)

---

`$\newcommand{\ci}{\perp\!\!\!\perp}$`

# Lecture Outline

1. Effect Modification
1. Interaction
1. Collapsibility

---

# 1. Effect Modification

---

## Effect Modification Example

- Suppose we have a binary treatment `$A$` and a binary outcome `$Y$`.

- Suppose `$V$` is a binary variable representing a pre-existing co-morbidity.

- We want to know if treatment `$A$` has the same effect in patients with `$V=1$` and patients with `$V = 0$`.

---

## Effect Modification Example

`$$E[Y(1) \vert V = 0] - E[Y(0) \vert V = 0] = 0.48 - 0.6 = -0.12$$`
`$$E[Y(1) \vert V = 1] - E[Y(0) \vert V = 1] = 0.71 - 0.4 = 0.31$$`

---

## Effect Modification Definition

- Variable `$V$` is a *modifier* of the effect of `$A$` on `$Y$` if the causal effect differs over strata of `$V$`.

- Effect modification does not care about mechanism. 
  
    + `$V$` does not need to mechanistically alter the effect of `$A$`.
    + `$V$` does not even need to be causally related to `$Y$`.

- Effect modification depends on the choice of effect measurement.

---

## Additive and Multiplicative Modification

- Additive effect modification:

$$
E[Y(1) \vert V = 1] - E[Y(0) \vert V = 1] \neq \\
E[Y(1) \vert V = 0] - E[Y(0) \vert V = 0] 
$$

- Multiplicative effect modification:

$$
\frac{E[Y(1) \vert V = 1]}{E[Y(0) \vert V = 1]} \neq \frac{E[Y(1) \vert V = 0]}{ E[Y(0) \vert V = 0]}
$$
--

- Does additive modification imply multiplicative modification? What about vice-versa?

---
## Additive vs Multiplicative Modification

<br>
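
One way to explore the question is with made-up numbers. The sketch below uses hypothetical stratum-specific counterfactual risks (not taken from the lecture data), and the helpers `effect_measures` and `modification` are names introduced here for illustration. It checks one scenario with equal risk differences and one with equal risk ratios.

```python
import math

def effect_measures(risk0, risk1):
    """Return (risk difference, risk ratio) for E[Y(0)|V=v]=risk0, E[Y(1)|V=v]=risk1."""
    return risk1 - risk0, risk1 / risk0

# Hypothetical risks, keyed by stratum v: (E[Y(0)|V=v], E[Y(1)|V=v]).
scenario_a = {0: (0.1, 0.2), 1: (0.4, 0.5)}  # same difference, different ratio
scenario_b = {0: (0.1, 0.2), 1: (0.3, 0.6)}  # same ratio, different difference

def modification(scenario):
    """Report whether the effect differs across strata on each scale."""
    rd0, rr0 = effect_measures(*scenario[0])
    rd1, rr1 = effect_measures(*scenario[1])
    return {
        "additive": not math.isclose(rd0, rd1),
        "multiplicative": not math.isclose(rr0, rr1),
    }

print(modification(scenario_a))
print(modification(scenario_b))
```

In scenario A there is modification on the multiplicative scale only; in scenario B there is modification on the additive scale only. Neither type implies the other.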

---
## Types of Effect Modification

1. The causal effect has the same direction in all levels of `$V$`.
    + There may be effect modification only on the additive scale or only on the multiplicative scale.

1. The causal effect is exactly zero in at least one stratum of `$V$`.
    + If this type of effect modification is present on one scale, it will be present on the other.

1. The causal effect has different signs in different strata of `$V$` (*qualitative modification*).
    + If this type of effect modification is present on one scale, it will be present on the other.

---

## Effect Modification in DAGs

- Effect modification is hard to represent in DAGs.

- There is no DAG feature that always corresponds to effect modification.

- Effect modifiers are always connected to the outcome by an open path.

---

## Effect Modification in DAGs

- In all of the DAGs below, `$V$` could be a modifier of the effect of `$A$` on `$Y$`.

<center>
<div id="htmlwidget-14e75633904f254e322c" style="width:576px;height:216px;" class="grViz html-widget"></div>
<script type="application/json" data-for="htmlwidget-14e75633904f254e322c">{"x":{"diagram":"digraph {\n\ngraph [layout = \"neato\",\n       outputorder = \"edgesfirst\",\n       bgcolor = \"white\"]\n\nnode [fontname = \"Helvetica\",\n      fontsize = \"10\",\n      shape = \"circle\",\n      fixedsize = \"true\",\n      width = \"0.5\",\n      style = \"filled\",\n      fillcolor = \"aliceblue\",\n      color = \"gray70\",\n      fontcolor = \"gray50\"]\n\nedge [fontname = \"Helvetica\",\n     fontsize = \"8\",\n     len = \"1.5\",\n     color = \"gray80\",\n     arrowsize = \"0.5\"]\n\n  \"1\" [label = \"V\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"-0.5,0.5!\"] \n  \"2\" [label = \"A\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"0,0!\"] \n  \"3\" [label = \"Y\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"1,0!\"] \n  \"4\" [label = \"U\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"1.5,0.5!\"] \n  \"5\" [label = \"A\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"2,0!\"] \n  \"6\" [label = \"Y\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"3,0!\"] \n  \"7\" [label = \"V\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"2,1!\"] \n  \"8\" [label = \"U\", fontname = \"Helvetica\", fontsize 
= \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"3.5,0.5!\"] \n  \"9\" [label = \"A\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"4,0!\"] \n  \"10\" [label = \"Y\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"5,0!\"] \n  \"11\" [label = \"V\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"circle\", fillcolor = \"#FFFFFF\", pos = \"3.5,1.2!\"] \n  \"12\" [label = \"W\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", shape = \"square\", fillcolor = \"#FFFFFF\", pos = \"4,0.7!\"] \n\"1\"->\"3\" [color = \"black\"] \n\"2\"->\"3\" [color = \"black\"] \n\"4\"->\"6\" [color = \"black\"] \n\"5\"->\"6\" [color = \"black\"] \n\"4\"->\"7\" [color = \"black\"] \n\"8\"->\"10\" [color = \"black\"] \n\"9\"->\"10\" [color = \"black\"] \n\"8\"->\"12\" [color = \"black\"] \n\"11\"->\"12\" [color = \"black\"] \n}","config":{"engine":"dot","options":null}},"evals":[],"jsHooks":[]}</script>

</center>

- In the third graph, `$W$` is in the conditioning set.
  + Why is `$V$` associated with `$Y$`?

- In graphs 2 and 3, `$V$` is a *surrogate effect modifier* because `$V$` does not directly cause `$Y$`.

---

## More on Effect Measures

- If there is a non-zero effect of `$A$` on `$Y$` in at least one stratum of `$V$` and `$E[Y(a) \vert V]$` varies with `$V$` for some value of `$a$`, then `$V$` <u>always</u> modifies the effect of `$A$` on `$Y$` on either the additive or multiplicative scale (or both).
  
- Hernán and Robins argue that the additive scale is preferable.

---

## Transportability

+ The ATE is the average effect in the population being sampled.

+ An effect estimate is *transportable* if it is a good estimate for the effect in other populations.

+ Differences in effect modifiers between populations could lead to lack of transportability.

+ Example: 
  - In our population `$P[V = 1] = 0.5$`. 
  - Average risk difference among those with `$V = 0$` is -0.1.
  - Average risk difference among those with `$V = 1$` is 0.3.
  - What would be a good effect estimate for a population in which everyone has `$V = 0$`?
  - How about a population in which `$P[V = 1] = 0.25$`?

+ There is no guarantee that the stratum-specific effects are transportable either.
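
The two questions in the example can be worked by reweighting the stratum-specific risk differences. The sketch below makes the strong assumption that the stratum-specific effects are identical in the target population and that `$V$` is the only modifier whose distribution differs; `transported_rd` is a name introduced here for illustration.

```python
def transported_rd(rd_by_stratum, p_v1):
    """Average risk difference in a population with P[V = 1] = p_v1.

    Assumes the stratum-specific risk differences carry over unchanged.
    """
    return (1 - p_v1) * rd_by_stratum[0] + p_v1 * rd_by_stratum[1]

rd = {0: -0.1, 1: 0.3}  # stratum-specific risk differences from the example

print(round(transported_rd(rd, 0.50), 3))  # our population
print(round(transported_rd(rd, 0.00), 3))  # everyone has V = 0
print(round(transported_rd(rd, 0.25), 3))  # P[V = 1] = 0.25
```

Under these assumptions the average risk difference is 0.1 in our population, -0.1 when everyone has `$V = 0$`, and 0 when `$P[V = 1] = 0.25$`, even though the treatment has a non-zero effect in both strata.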

---
## Example

- There are genetic variants that increase susceptibility to nicotine addiction.

- In populations with easy access to smoking tobacco, these variants increase risk of lung cancer.

- Tobacco access is an effect modifier.

- Without accounting for tobacco access, our causal effect estimate is not transportable.

<center>
<div id="htmlwidget-458e57a9129a6af9ca02" style="width:720px;height:180px;" class="grViz html-widget"></div>
<script type="application/json" data-for="htmlwidget-458e57a9129a6af9ca02">{"x":{"diagram":"digraph {\n\ngraph [layout = \"neato\",\n       outputorder = \"edgesfirst\",\n       bgcolor = \"white\"]\n\nnode [fontname = \"Helvetica\",\n      fontsize = \"10\",\n      shape = \"circle\",\n      fixedsize = \"true\",\n      width = \"0.5\",\n      style = \"filled\",\n      fillcolor = \"aliceblue\",\n      color = \"gray70\",\n      fontcolor = \"gray50\"]\n\nedge [fontname = \"Helvetica\",\n     fontsize = \"8\",\n     len = \"1.5\",\n     color = \"gray80\",\n     arrowsize = \"0.5\"]\n\n  \"1\" [label = \"Genotype\", fontname = \"Helvetica\", fontsize = \"20\", width = \"0.3\", fixedsize = \"FALSE\", fontcolor = \"black\", color = \"black\", shape = \"ellipse\", fillcolor = \"#FFFFFF\", pos = \"0,0!\"] \n  \"2\" [label = \"Addiction\n Susceptibility\", fontname = \"Helvetica\", fontsize = \"20\", width = \"0.3\", fixedsize = \"FALSE\", fontcolor = \"black\", color = \"black\", shape = \"ellipse\", fillcolor = \"#FFFFFF\", pos = \"3,0!\"] \n  \"3\" [label = \"Tobacco\n Access\", fontname = \"Helvetica\", fontsize = \"20\", width = \"0.3\", fixedsize = \"FALSE\", fontcolor = \"black\", color = \"black\", shape = \"ellipse\", fillcolor = \"#FFFFFF\", pos = \"6,1.3!\"] \n  \"4\" [label = \"Smoking\", fontname = \"Helvetica\", fontsize = \"20\", width = \"0.3\", fixedsize = \"FALSE\", fontcolor = \"black\", color = \"black\", shape = \"ellipse\", fillcolor = \"#FFFFFF\", pos = \"6,0!\"] \n  \"5\" [label = \"Lung Cancer\", fontname = \"Helvetica\", fontsize = \"20\", width = \"0.3\", fixedsize = \"FALSE\", fontcolor = \"black\", color = \"black\", shape = \"ellipse\", fillcolor = \"#FFFFFF\", pos = \"9,0!\"] \n\"1\"->\"2\" [color = \"black\"] \n\"2\"->\"4\" [color = \"black\"] \n\"4\"->\"5\" [color = \"black\"] \n\"3\"->\"4\" [color = \"black\"] \n}","config":{"engine":"dot","options":null}},"evals":[],"jsHooks":[]}</script>

</center>
---
## Other Reasons to Care About Effect Modification

-  We might be interested in identifying subpopulations with the most to gain from an intervention. 
  + Should we only treat patients with `$V = 1$`?
  
- In some cases, identifying effect modifiers can provide information about the 
mechanism of the causal effect. 
  - In the genetic example, knowing that tobacco access modifies the effect may help us conclude that the genetic variant affects lung cancer because it affects smoking behavior.

---

## Identifying Counterfactual Means in Subgroups

- To characterize effect modification, we need to estimate `$E[Y(a) \vert V = v]$`.

- Under what conditions is `$E[Y(a) \vert V = v]$` identifiable?

- If `$E[Y(a)]$` is identifiable then `$E[Y(a) \vert V = v]$` is identifiable.

- Recall the four conditions for identifying `$E[Y(a)]$`:
--

  + Consistency
  + No interference
  + Positivity - must hold within strata of `$V$`
  + Conditional exchangeability, conditional on observed variables `$L$`

---

## Estimating Counterfactual Means in Subgroups

We can use a two step estimation procedure:

1. Stratify the data by `$V$`.

2. Use standardization or IP weighting with `$L$` to estimate the expected counterfactual within each level of `$V$`.

---

## Standardization for Subgroup Effects

- The standardized mean of `$Y(a) \vert V = v$` is

$$
E[Y(a) \vert V = v] = \sum_{l}E[Y(a) \vert  V = v, L = l]P[L = l \vert V = v]
$$
$$
 = \sum_l E[Y \vert A = a, L = l, V = v]P[L = l \vert V = v]
$$ 
---

## IPW for Subgroup Effects 
- Or equivalently, the IPW mean:

$$
E[Y(a) \vert V = v] = E \left[\frac{I(A = a)Y}{f_{A \vert L, V =v}(A, L)} \Big\vert V = v\right ]
$$
- We simply compute the IPW mean within strata defined by `$V$`.
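
A minimal sketch of the within-stratum IPW mean, assuming discrete `$L$` and `$V$` and a propensity estimated nonparametrically by cell proportions. The function name `ipw_mean` and the toy records are hypothetical and not part of the lecture data.

```python
from collections import Counter

def ipw_mean(records, a, v):
    """IPW estimate of E[Y(a) | V = v] from (v, l, a, y) records.

    The propensity P[A = a | L = l, V = v] is estimated by the observed
    proportion in each (l, v) cell, so positivity must hold in every cell.
    """
    stratum = [r for r in records if r[0] == v]
    n_cell = Counter(l for _, l, _, _ in stratum)                 # units per level of L
    n_treat = Counter(l for _, l, a_i, _ in stratum if a_i == a)  # units with A = a per level of L
    total = 0.0
    for _, l, a_i, y in stratum:
        if a_i == a:
            propensity = n_treat[l] / n_cell[l]
            total += y / propensity
    return total / len(stratum)

# Hypothetical toy data: (v, l, a, y)
records = [
    (0, 0, 1, 1), (0, 0, 1, 0), (0, 0, 0, 1), (0, 0, 0, 1),
    (0, 1, 1, 1), (0, 1, 0, 0), (0, 1, 0, 1), (0, 1, 0, 1),
    (1, 0, 1, 0), (1, 0, 0, 1),
]

print(ipw_mean(records, a=1, v=0))
print(ipw_mean(records, a=0, v=0))
```

With cell-proportion propensities, this estimate coincides with the standardized mean computed in the same stratum.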

---
## Example

I simulated 1000 units from the model

<center>
<div id="htmlwidget-1dce438bf14852d4e1d9" style="width:576px;height:216px;" class="grViz html-widget"></div>
<script type="application/json" data-for="htmlwidget-1dce438bf14852d4e1d9">{"x":{"diagram":"digraph {\n\ngraph [layout = \"neato\",\n       outputorder = \"edgesfirst\",\n       bgcolor = \"white\"]\n\nnode [fontname = \"Helvetica\",\n      fontsize = \"10\",\n      shape = \"circle\",\n      fixedsize = \"true\",\n      width = \"0.5\",\n      style = \"filled\",\n      fillcolor = \"aliceblue\",\n      color = \"gray70\",\n      fontcolor = \"gray50\"]\n\nedge [fontname = \"Helvetica\",\n     fontsize = \"8\",\n     len = \"1.5\",\n     color = \"gray80\",\n     arrowsize = \"0.5\"]\n\n  \"1\" [label = \"V\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", fillcolor = \"#FFFFFF\", pos = \"1,1!\"] \n  \"2\" [label = \"A\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", fillcolor = \"#FFFFFF\", pos = \"0,0!\"] \n  \"3\" [label = \"Y\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", fillcolor = \"#FFFFFF\", pos = \"1,0!\"] \n  \"4\" [label = \"L\", fontname = \"Helvetica\", fontsize = \"10\", width = \"0.3\", fontcolor = \"black\", color = \"black\", fillcolor = \"#FFFFFF\", pos = \"0.5,0.8!\"] \n\"1\"->\"3\" [color = \"black\"] \n\"4\"->\"2\" [color = \"black\"] \n\"4\"->\"3\" [color = \"black\"] \n\"2\"->\"3\" [color = \"black\"] \n}","config":{"engine":"dot","options":null}},"evals":[],"jsHooks":[]}</script>
</center>

$$
V \sim Bern(0.5)\qquad L \sim Bern(0.35)\\\
A \sim Bern(0.6-0.3L)\\\
Y(0) \sim Bern(0.5 + 0.2L)\\\
Z \sim Bern(0.2 + 0.3V)\\\
Y(1) = \begin{cases} 0 &  \ \ Y(0) = 0 \\\ 
1- Z & \ \  Y(0) =1 
\end{cases}\\\
$$
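
The model above could be simulated along the following lines. This is a sketch, not the code used to produce the slides: the seed, the sample size, and the function names `simulate` and `conditional_effect` are assumptions, so the numbers will not match the 1000-unit draw reported in the lecture.

```python
import random

def simulate(n, seed=0):
    """Draw n units (v, l, a, y0, y1) from the model on this slide."""
    rng = random.Random(seed)

    def bern(p):
        return int(rng.random() < p)

    units = []
    for _ in range(n):
        v = bern(0.5)
        l = bern(0.35)
        a = bern(0.6 - 0.3 * l)
        y0 = bern(0.5 + 0.2 * l)
        z = bern(0.2 + 0.3 * v)
        y1 = 0 if y0 == 0 else 1 - z
        units.append((v, l, a, y0, y1))
    return units

def conditional_effect(units, v):
    """Mean of Y(1) - Y(0) among units with V = v, using both counterfactuals."""
    diffs = [y1 - y0 for (v_i, _, _, y0, y1) in units if v_i == v]
    return sum(diffs) / len(diffs)

units = simulate(200_000)
print(round(conditional_effect(units, 0), 3))
print(round(conditional_effect(units, 1), 3))
```

Under this model `$E[Y(1) - Y(0) \vert V = v] = -P[Y(0) = 1]P[Z = 1 \vert V = v]$`, which is `$-0.57 \cdot 0.2 = -0.114$` for `$V = 0$` and `$-0.57 \cdot 0.5 = -0.285$` for `$V = 1$`; a large simulated sample should land close to these values.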

---

## Example

If we knew the full counterfactuals, we could compute the conditional effects directly:

`$$E[Y(1) \vert V = 0]- E[Y(0) \vert V = 0] = -0.12$$`
`$$E[Y(1) \vert V = 1]- E[Y(0) \vert V = 1] = -0.29$$`

---

## Example

To estimate conditional causal effects in the observed data, we make use of the fact that `$Y(a) \ci A \vert L$` to compute the standardized means.

`$$P[L = 0 \vert V = 0] = \frac{335}{520} = 0.64$$`

$$
`\begin{split}
\hat{E}[Y(1) \vert V = 0] = & E[Y \vert A = 1, V = 0, L = 0 ] P[L = 0 \vert V = 0] + \\
& E[Y \vert A = 1, V =0, L = 1] P[L = 1 \vert V = 0]\\
 = & 0.4\cdot 0.64 + 0.62\cdot 0.36 = 0.48
\end{split}`
$$

---

## Example 
<table class="table" style="width: auto !important; float: left; margin-right: 10px;">
 <thead>
  <tr>
   <th style="text-align:right;"> $L$ </th>
   <th style="text-align:right;"> $V$ </th>
   <th style="text-align:right;"> $A$ </th>
   <th style="text-align:right;"> $N$ </th>
   <th style="text-align:right;"> $\bar{Y}$ </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 133 </td>
   <td style="text-align:right;"> 0.48 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 122 </td>
   <td style="text-align:right;"> 0.55 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 129 </td>
   <td style="text-align:right;"> 0.78 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 101 </td>
   <td style="text-align:right;"> 0.76 </td>
  </tr>
</tbody>
</table>

$$ P[L = 0 \vert V = 0] = \frac{335}{520} = 0.64$$

$$ \hat{E}[Y(1) \vert V = 0] = 0.48 $$
$$
`\begin{split}
\hat{E}[Y(0) \vert V = 0] = & E[Y \vert A = 0, V = 0, L = 0 ] P[L = 0 \vert V = 0] + \\
& E[Y \vert A = 0, V =0, L = 1] P[L = 1 \vert V = 0]\\
 = & 0.48\cdot 0.64 + 0.78\cdot 0.36 = 0.59
\end{split}`
$$

`$$\hat{E}[Y(1) \vert V = 0]-\hat{E}[Y(0) \vert V = 0] = -0.11$$`
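
The arithmetic for the `$V = 0$` stratum can be checked with a few lines of code. The sketch below uses the rounded cell means shown on the slides, so small rounding differences from the original analysis are possible; `standardized_mean` is a name introduced here.

```python
def standardized_mean(cell_means, p_l):
    """Sum over l of E[Y | A = a, V = v, L = l] * P[L = l | V = v]."""
    return sum(cell_means[l] * p_l[l] for l in p_l)

# Values read off the slides for the V = 0 stratum.
p_l = {0: 335 / 520, 1: 1 - 335 / 520}   # P[L = l | V = 0]
mean_a1 = {0: 0.40, 1: 0.62}             # E[Y | A = 1, V = 0, L = l]
mean_a0 = {0: 0.48, 1: 0.78}             # E[Y | A = 0, V = 0, L = l]

e_y1 = standardized_mean(mean_a1, p_l)
e_y0 = standardized_mean(mean_a0, p_l)
print(round(e_y1, 2), round(e_y0, 2), round(e_y1 - e_y0, 2))
```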

---

## Special Case: V = L

- If `$Y(a) \ci A \vert L$` and we are also interested in effect modification by `$L$`, we can skip the step of computing the standardized mean.

- Instead, we simply stratify by `$L$` and compute `$E[Y(a) \vert L = l] = E[Y \vert A = a, L = l]$`.

- These are the *stratified* means.

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:right;"> $L$ </th>
   <th style="text-align:right;"> $A$ </th>
   <th style="text-align:right;"> $N$ </th>
   <th style="text-align:right;"> $\bar{Y}$ </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 255 </td>
   <td style="text-align:right;"> 0.51 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 396 </td>
   <td style="text-align:right;"> 0.32 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 230 </td>
   <td style="text-align:right;"> 0.77 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 119 </td>
   <td style="text-align:right;"> 0.45 </td>
  </tr>
</tbody>
</table>
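
Reading the stratified means off the table gives the estimated effect within each level of `$L$` (computed from the rounded table entries):

`$$\hat{E}[Y(1) \vert L = 0] - \hat{E}[Y(0) \vert L = 0] = 0.32 - 0.51 = -0.19$$`
`$$\hat{E}[Y(1) \vert L = 1] - \hat{E}[Y(0) \vert L = 1] = 0.45 - 0.77 = -0.32$$`
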
---

## Special Case: V = A

- If there is effect modification by treatment status, the causal effect among those who received treatment `$A=1$` is different from the causal effect among those who received treatment `$A = 0$`.

- The *average treatment effect among the treated* (ATT) is 
$$
E[Y(1) \vert A = 1] - E[Y(0) \vert A = 1]
$$
- Note that if the ATT is different from the ATE, this implies that unconditional exchangeability ( `$Y(a) \ci A$` ) does not hold.

  + Why?
---

## Average Effect Among the Treated

---

## Identifying the ATT

- To identify the ATT, we don't need full conditional exchangeability which says that 
$$ Y(a) \ci A \vert L$$ for all `$a$`.

- We only need *partial exchangeability*: `$Y(0) \ci A \vert L$`

- Similarly, to identify the average effect among the non-treated, we need `$Y(1) \ci A \vert L$`.

- Generally, to identify `$E[Y(a) \vert A = a^\prime]$` we need $$ Y(a) \ci A \vert L$$ for all `$a \neq a^\prime$`.

---

## Standardization to Estimate the ATT

- From our previous expression for the conditional standardized mean

$$
E[Y(a) \vert A = 1] = \sum_l E[Y(a) \vert L=l, A = 1 ]P[L = l \vert A = 1]
$$

- From consistency we know
`$$E[Y(1) \vert L = l, A = 1] = E[Y \vert L = l, A = 1]$$`
--

- From partial exchangeability, we know that
$$
E[Y(0) \vert L = l, A = 1] = E[Y(0) \vert L = l]
$$
--

- Partial exchangeability and consistency again:

$$ E[Y(0) \vert L = l] = E[Y(0) \vert L = l, A = 0] = E[Y \vert L = l, A = 0]$$
--

- So

$$
E[Y(a) \vert A = 1] = \sum_l E[ Y \vert L=l, A = a ]P[L = l \vert A = 1]
$$

---
## ATT Example

In the simulated data example, the average effect among the treated is

$$
E[Y(1) \vert A = 1] = 0.35\\\
E[Y(0) \vert A = 1] = 0.53\\\
E[Y(1)-Y(0) \vert A = 1] = -0.18
$$
To estimate these values using standardization we compute

$$
P[L = 0 \vert A = 1] = 0.77\\\
P[L = 1 \vert A = 1] = 0.23
$$

$$
\hat{E}[Y(1) \vert A = 1] = 0.32\cdot 0.77 + 
0.45\cdot 0.23 = 0.35\\\
\hat{E}[Y(0) \vert A = 1] = 0.51\cdot 0.77 + 
0.77\cdot 0.23 = 0.57
$$
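
The same calculation can be sketched in code using the cell sizes and rounded means from the stratified table. Variable and function names are introduced here for illustration.

```python
# Cell sizes and means from the stratified table (A = 1 and A = 0 rows).
n_treated = {0: 396, 1: 119}      # number with A = 1 in each level of L
mean_y_a1 = {0: 0.32, 1: 0.45}    # E[Y | A = 1, L = l]
mean_y_a0 = {0: 0.51, 1: 0.77}    # E[Y | A = 0, L = l]

total_treated = sum(n_treated.values())
p_l_given_a1 = {l: n / total_treated for l, n in n_treated.items()}  # P[L = l | A = 1]

def standardize_to_treated(cell_means):
    """Sum over l of E[Y | A = a, L = l] * P[L = l | A = 1]."""
    return sum(cell_means[l] * p_l_given_a1[l] for l in p_l_given_a1)

e_y1 = standardize_to_treated(mean_y_a1)
e_y0 = standardize_to_treated(mean_y_a0)
att_hat = e_y1 - e_y0
print(round(e_y1, 2), round(e_y0, 2), round(att_hat, 2))
```

The resulting ATT estimate is about -0.22, compared with -0.18 computed from the full counterfactuals.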

---

## Matching

- Matching is an alternative method to adjust for confounders `$L$`.

- For each person receiving `$A = 1$`, we identify a "match" with the same value of `$L$` who received `$A = 0$`. 
  + We leave aside the rest of the data. 
  
- In the resulting matched population, the distribution of `$L$` is the same in the `$A=1$` and `$A =0$` cohorts.

- Since we matched to the `$A = 1$` cohort, we are now able to estimate the ATT as `$E_m[Y \vert A = 1] - E_m[Y \vert A = 0]$` where the expectation is with respect to the matched population.

- Alternatively, we could have matched to the `$A = 0$` cohort, or to a different distribution of `$L$`.
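
A minimal sketch of exact matching to the `$A = 1$` cohort on a discrete `$L$`. Matching with replacement, the random choice of match, the function names, and the toy records are all assumptions made for illustration; the lecture does not specify a particular matching algorithm.

```python
import random

def match_to_treated(records, seed=0):
    """Exact 1:1 matching on L, with replacement, for (l, a, y) records.

    Returns (treated, matched_controls); each treated unit is paired with a
    randomly chosen control sharing its value of L.
    """
    rng = random.Random(seed)
    controls_by_l = {}
    for l, a, y in records:
        if a == 0:
            controls_by_l.setdefault(l, []).append((l, a, y))
    treated = [r for r in records if r[1] == 1]
    # Positivity: every treated unit needs at least one control with the same L,
    # otherwise the lookup below raises a KeyError.
    matched = [rng.choice(controls_by_l[l]) for l, _, _ in treated]
    return treated, matched

def mean_y(units):
    """Average outcome in a list of (l, a, y) records."""
    return sum(y for _, _, y in units) / len(units)

# Hypothetical toy data: (l, a, y)
records = [
    (0, 1, 0), (0, 1, 1), (0, 1, 0), (1, 1, 1),
    (0, 0, 1), (0, 0, 1), (1, 0, 1), (1, 0, 1), (1, 0, 0),
]
treated, matched = match_to_treated(records)
att_hat = mean_y(treated) - mean_y(matched)  # E_m[Y | A = 1] - E_m[Y | A = 0]
print(att_hat)
```

By construction, the matched controls have the same distribution of `$L$` as the treated cohort.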

---

# 2. Interactions

---

## Interactions Describe Joint Interventions

- An *interaction* between variables refers to relationships between joint counterfactuals.

- We say that there is an additive interaction between `$A$` and `$E$` if

`$$E[Y(A = 1, E = 0)] - E[Y(A = 0, E = 0)] \neq \\E[Y(A = 1, E = 1)] - E[Y(A = 0, E = 1)].$$`

- Like effect modification, interaction depends on the effect measure. 
  + There may be an additive but not a multiplicative interaction or vice versa.

---
## Interactions in DAGs

- Like effect modification, interactions are hard to clearly indicate in DAGs.

- In order for an interaction between `$A$` and `$E$` to occur, `$Y$` must be 
a descendant of both `$A$` and `$E$`.

---
## Example

- `$Y$` indicates whether or not the lamp is on.

- `$A$` indicates if there is a bulb in the lamp.

- `$E$` indicates if the lamp is plugged in.

- The lamp is on if it is plugged in and has a bulb:  `$Y(1, 1) = 1$`

- Otherwise it is off: `$Y(1, 0) = Y(0, 1) = Y(0, 0) = 0$`.

- There is an interaction because `$Y(1, 0) - Y(0, 0) = 0$` and `$Y(1, 1) - Y(0, 1) = 1$`. 
  + Adding a bulb is effective if the lamp is plugged in but otherwise is ineffective.
  
<center>
<img src="img/3_swig_ix.png" width="35%" />
</center>
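
The lamp example is small enough to enumerate. In the sketch below, `lamp_on` is a name introduced here for the joint counterfactual `$Y(a, e)$`.

```python
def lamp_on(a, e):
    """Y(a, e): the lamp is on only with a bulb (a = 1) and power (e = 1)."""
    return int(a == 1 and e == 1)

effect_unplugged = lamp_on(1, 0) - lamp_on(0, 0)  # effect of adding a bulb when E = 0
effect_plugged = lamp_on(1, 1) - lamp_on(0, 1)    # effect of adding a bulb when E = 1
additive_interaction = effect_plugged != effect_unplugged
print(effect_unplugged, effect_plugged, additive_interaction)  # prints: 0 1 True
```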
---

## Identifying Interactions

- In order to identify an interaction, we must be able to identify `$E[Y(a, e)]$` for all values of `$a$` and `$e$`.

- We need our usual four identification criteria, but they need to apply to the joint counterfactual.

- Consistency: `$A_i = a$` and `$E_i = e$` `$\Rightarrow$` `$Y_i = Y_i(a, e)$`.

- Positivity: `$P[A = a, E=e \vert L = l] > 0$` for all `$a$`, `$e$`, `$l$`.

- Exchangeability: `$Y(a, e) \ci A, E \vert L$`

- If these hold, we can estimate `$E[Y(a, e)]$` using the same standardization or IPW strategy we used for single interventions. 
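
For discrete `$L$`, the standardization and IPW formulas for the joint intervention take the same form as before, with the pair `$(A, E)$` playing the role of the treatment:

$$
E[Y(a, e)] = \sum_l E[Y \vert A = a, E = e, L = l]P[L = l]
$$

$$
E[Y(a, e)] = E \left[\frac{I(A = a, E = e)Y}{P[A = a, E = e \vert L]}\right]
$$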
---

## Effect Modification vs Interaction

- When `$Y(a, e) \ci A, E$`, we have 
$$
E[Y(a, e)] = E[Y(a) \vert E = e] = E[Y(e) \vert A = a]= E[Y \vert A=a, E=e]
$$

- So interaction between `$A$` and `$E$` implies that `$E$` is a modifier of the effect of `$A$` on `$Y$`.

- Or equivalently, `$A$` is a modifier of the effect of `$E$` on `$Y$`.

- If we are only willing to assume `$Y(a) \ci A \vert L$`, then we can identify modification but not interaction.

---

# 3. Collapsibility

---
## Collapsibility

- An effect measure is *collapsible* with respect to a variable `$Z$` if the measure in the entire population is equal to a weighted average of the measures within strata of `$Z$`.

- The average treatment effect and risk ratio are collapsible: 
$$
E[Y(1)] - E[Y(0)] = \sum_{z}(E[Y(1) \vert Z=z]-E[Y(0) \vert Z=z])P[Z=z]
$$

$$
\frac{E[Y(1)]}{E[Y(0)]} = \sum_z \frac{E[Y(1) \vert Z=z]}{E[Y(0) \vert Z=z]}w_z
$$
$$
w_z = \frac{E[Y(0)\vert Z=z]P[Z=z]}{E[Y(0)]}
$$
- For the ATE and RR, if we know the stratum-specific effects, we know the possible range of the population effect.

- This is not the case for the odds ratio.

---
## Collapsibility of Association Measures

- Effect measures are functions of the counterfactual distribution.

- Association measures are functions of the distribution of the observed data.

- `$E[Y(1)] - E[Y(0)]$` is an effect measure. `$E[Y \vert A = 1] - E[Y \vert A = 0]$` is an association measure.

- The same definition of collapsibility applies to association measures.

- If `$g(P(a, y))$` is an association measure, then `$g$` is collapsible over `$Z$` if
  $$g(P(a, y)) = \sum_z g(P(a, y \vert z))w(z) $$
  where `$w(z) \geq 0$` and `$\sum_z w(z) = 1.$`

---

## Strict Collapsibility

- An association measure is strictly collapsible over `$Z$` if `$g(P(a, y \vert z)) = g(P(a, y))$` for all values of `$z$`.

- Strict collapsibility says that the association measure is the same within strata as in the population.

---

## Collapsibility and Confounding

- One definition of confounding says that if `$g(P(a, y \vert z)) \neq g(P(a, y))$`, then `$Z$` is a confounder and we should adjust for it.

- Is this definition correct if `$g$` is the risk difference?

- What about risk ratio?

- What about odds ratio?

---

## Collapsibility and Confounding

**Theorem:** If `$g$` is the risk difference, faithfulness holds, and `$g$` is strictly collapsible over `$Z$` then `$A$` and `$Y$` are unconfounded by `$Z$`.

- The reverse is not true: if `$g$` is not strictly collapsible over `$Z$`, we can't conclude that `$Z$` is a confounder.

- This theorem does not hold for the odds ratio.

---
## Non-Collapsibility of the Odds Ratio

<table>
 <thead>
<tr>
<th style="empty-cells: hide;border-bottom:hidden;" colspan="1"></th>
<th style="border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Male</div></th>
<th style="border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Female</div></th>
<th style="border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Combined</div></th>
</tr>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:left;"> $A = 0$ </th>
   <th style="text-align:left;"> $A = 1$ </th>
   <th style="text-align:left;"> $A = 0$ </th>
   <th style="text-align:left;"> $A = 1$ </th>
   <th style="text-align:left;"> $A = 0$ </th>
   <th style="text-align:left;"> $A = 1$ </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> $Y = 0$ </td>
   <td style="text-align:left;"> 100 </td>
   <td style="text-align:left;border-right: solid;"> 50 </td>
   <td style="text-align:left;"> 200 </td>
   <td style="text-align:left;border-right: solid;"> 150 </td>
   <td style="text-align:left;"> 300 </td>
   <td style="text-align:left;"> 200 </td>
  </tr>
  <tr>
   <td style="text-align:left;border-bottom: solid;"> $Y = 1$ </td>
   <td style="text-align:left;border-bottom: solid;"> 150 </td>
   <td style="text-align:left;border-bottom: solid;border-right: solid;"> 200 </td>
   <td style="text-align:left;border-bottom: solid;"> 50 </td>
   <td style="text-align:left;border-bottom: solid;border-right: solid;"> 100 </td>
   <td style="text-align:left;border-bottom: solid;"> 200 </td>
   <td style="text-align:left;border-bottom: solid;"> 300 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Risk </td>
   <td style="text-align:left;"> 0.6 </td>
   <td style="text-align:left;border-right: solid;"> 0.8 </td>
   <td style="text-align:left;"> 0.2 </td>
   <td style="text-align:left;border-right: solid;"> 0.4 </td>
   <td style="text-align:left;"> 0.4 </td>
   <td style="text-align:left;"> 0.6 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Risk Difference </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;border-right: solid;"> 0.2 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;border-right: solid;"> 0.2 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;"> 0.2 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Risk Ratio </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;border-right: solid;"> 1.33 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;border-right: solid;"> 2 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;"> 1.5 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Odds Ratio </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;border-right: solid;"> 2.67 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;border-right: solid;"> 2.67 </td>
   <td style="text-align:left;">  </td>
   <td style="text-align:left;"> 2.25 </td>
  </tr>
</tbody>
</table>
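
The entries in this table can be recomputed from the counts. The sketch below also checks that the combined risk difference and risk ratio are weighted averages of the stratum-specific values, while the combined odds ratio is not. In this table `$A$` is independent of sex, so `$P[Z = z \vert A = 0] = P[Z = z]$` and the risk ratio weights have the same form as `$w_z$` above; the function and variable names are introduced here for illustration.

```python
def measures(n_y0_a0, n_y1_a0, n_y0_a1, n_y1_a1):
    """Baseline risk, risk difference, risk ratio and odds ratio from 2x2 counts."""
    risk0 = n_y1_a0 / (n_y0_a0 + n_y1_a0)
    risk1 = n_y1_a1 / (n_y0_a1 + n_y1_a1)
    rd = risk1 - risk0
    rr = risk1 / risk0
    odds_ratio = (risk1 / (1 - risk1)) / (risk0 / (1 - risk0))
    return risk0, rd, rr, odds_ratio

# Counts from the table: (Y=0 & A=0, Y=1 & A=0, Y=0 & A=1, Y=1 & A=1)
male = (100, 150, 50, 200)
female = (200, 50, 150, 100)
combined = tuple(m + f for m, f in zip(male, female))

risk0_m, rd_m, rr_m, or_m = measures(*male)
risk0_f, rd_f, rr_f, or_f = measures(*female)
risk0_c, rd_c, rr_c, or_c = measures(*combined)

p_m = sum(male) / sum(combined)   # P[Z = male]
p_f = 1 - p_m
# Risk-ratio weights: w_z = risk0_z * P[Z = z] / risk0
w_m = risk0_m * p_m / risk0_c
w_f = risk0_f * p_f / risk0_c

print(round(rd_m * p_m + rd_f * p_f, 2), round(rd_c, 2))   # weighted RD vs combined RD
print(round(rr_m * w_m + rr_f * w_f, 2), round(rr_c, 2))   # weighted RR vs combined RR
print(round(or_m, 2), round(or_f, 2), round(or_c, 2))      # stratum ORs vs combined OR
```

The combined odds ratio (2.25) lies below both stratum-specific odds ratios (2.67), so no set of non-negative weights summing to one can reproduce it.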