Regression: How to use it for causal inference?
A key question in regression theory is: given $Y=X'\beta + \varepsilon$ and the estimated coefficients $\widehat{\beta} =\operatorname{argmin}_{b} E\left[(Y-X'b)^2\right]$, when does $\widehat{\beta}=\beta$?
We first discuss the properties of $\widehat{\beta}$ when we do not require $Y=X'\beta+\varepsilon$ to be well behaved (e.g., $X$ could be incomplete and hence introduce omitted-variable bias, OVB).
Then we discuss under what conditions $\widehat{\beta}=\beta$.
1 The properties of $\widehat{\beta}$
The following is always true: $$ \widehat{\beta}_{k} =\frac{\operatorname{Cov}(Y,\hat{x}_k)}{\operatorname{Var}(\hat{x}_{k})} $$ where $\hat{x}_k$ is the residual of $x_{k}$ regressed on all the other covariates.
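As a numerical check of this partialling-out (Frisch-Waugh-Lovell) identity, here is a minimal sketch comparing the coefficient on $x_1$ from a full least-squares fit against $\operatorname{Cov}(Y,\hat{x}_1)/\operatorname{Var}(\hat{x}_1)$; the simulated data-generating process, coefficients, and variable names are illustrative assumptions, not part of the derivation above.

```python
import numpy as np

# Numerical check of the partialling-out identity on simulated data
# (all names and coefficients below are illustrative assumptions).
rng = np.random.default_rng(0)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)       # x2 is correlated with x1
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

# Full regression of y on [1, x1, x2].
X = np.column_stack([np.ones(n), x1, x2])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# Residualize x1 on the remaining covariates (constant and x2).
X_other = np.column_stack([np.ones(n), x2])
gamma = np.linalg.lstsq(X_other, x1, rcond=None)[0]
x1_tilde = x1 - X_other @ gamma          # this is \hat{x}_1 in the text

# Cov(Y, x1_tilde) / Var(x1_tilde) matches the full-regression coefficient.
fwl = np.cov(y, x1_tilde)[0, 1] / np.var(x1_tilde, ddof=1)
print(beta_hat[1], fwl)                  # both approximately 2.0
```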
In particular, for univariate regression $Y=\alpha+\beta x+\varepsilon$, we have $$ \widehat{\beta}=\frac{\operatorname{Cov}(Y,x)}{\operatorname{Var}(x)} $$ However, we cannot guarantee $\widehat{\beta}=\beta$ in this general case.
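To see how $\widehat{\beta}\neq\beta$ arises, the sketch below omits a relevant regressor $z$ that is correlated with $x$: the univariate ratio $\operatorname{Cov}(Y,x)/\operatorname{Var}(x)$ then recovers $\beta$ plus an omitted-variable term. The data-generating process is an illustrative assumption.

```python
import numpy as np

# Omitted-variable bias: z belongs in the model but is left out, so the
# univariate Cov/Var formula does not recover the structural beta.
# The data-generating process below is an illustrative assumption.
rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(size=n)
z = 0.8 * x + rng.normal(size=n)    # omitted variable, correlated with x
y = 2.0 * x + 3.0 * z + rng.normal(size=n)

beta_hat = np.cov(y, x)[0, 1] / np.var(x, ddof=1)
print(beta_hat)   # approximately 2 + 3 * 0.8 = 4.4, not the structural 2.0
```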
2 When does $\widehat{\beta}=\beta$?
The key condition is $E(X\varepsilon)=0$, that is, $\varepsilon$ and $X$ are orthogonal (equivalently, uncorrelated whenever the regression includes an intercept, so that $E(\varepsilon)=0$).
When $E(X\varepsilon)=0$ holds, not only do we have $\widehat{\beta}=\beta$, we can also guarantee that the fitted values are orthogonal to the error:
$$ E(\widehat{Y}\varepsilon)=0 \hspace{2em}\text{where $\widehat{Y}=X'\beta$} $$
Under the stronger condition of mean independence, $E(\varepsilon\mid X)=0$, the law of iterated expectations further gives
$$ E(\varepsilon\cdot f(X))=0 \hspace{2em}\text{where $f(X)$ is an arbitrary function of $X$} $$
To satisfy $E(X\varepsilon)=0$, we have several choices. If any of the following conditions is met, then $E(X\varepsilon)=0$ is guaranteed (a simulation sketch follows the list):
- No omitted variables
- Even if we have omitted variables, show that they are uncorrelated with $X$ (they are then absorbed into $\varepsilon$ without violating $E(X\varepsilon)=0$).
- $X'\beta=E(Y\mid X)$, i.e., the conditional expectation function (CEF) is linear in $X$; then $\varepsilon=Y-E(Y\mid X)$ satisfies $E(\varepsilon\mid X)=0$.
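As a sketch of the last condition, the simulation below assumes a nonlinear CEF, $E(Y\mid x)=x^2$ (an illustrative choice), and fits a linear model. The sample residual is orthogonal to $x$ by construction (the normal equations), yet clearly correlated with the nonlinear function $f(x)=x^2$; a linear CEF would rule this out.

```python
import numpy as np

# Nonlinear CEF fitted by a linear model: the sample residual is orthogonal
# to x by construction, but correlated with f(x) = x**2.
# The data-generating process is an illustrative assumption.
rng = np.random.default_rng(2)
n = 100_000
x = rng.normal(size=n)
y = x**2 + rng.normal(size=n)       # E(Y | x) = x**2, not linear in x

X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
eps_hat = y - X @ beta_hat

print(np.mean(eps_hat * x))         # ~0: E(X * eps) = 0 holds mechanically
print(np.mean(eps_hat * x**2))      # ~2: E(eps * f(X)) != 0 for f(x) = x**2
```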