Regression: How to use it for causal inference?
A key question in regression theory is: given $Y=X'\beta + \varepsilon$ and the estimated coefficient $\widehat{\beta} =\arg\min_b E(Y-X'b)^2$, when does $\widehat{\beta}=\beta$?
We first discuss the properties of $\widehat{\beta}$ when we do not require the model $Y=X'\beta+\varepsilon$ to be well behaved; e.g., $X$ could be incomplete and hence introduce omitted-variable bias (OVB).
Then we discuss under what conditions $\widehat{\beta}=\beta$ holds.
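As a quick numerical illustration of the $\arg\min$ characterization above, here is a minimal sketch (assuming numpy and scipy are available; the two-regressor design and all coefficients are illustrative) that minimizes the sample analogue of $E(Y-X'b)^2$ directly and checks that it agrees with the closed-form least-squares solution:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 10_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(size=n)

# Closed-form OLS solution
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Directly minimize the sample analogue of E(Y - X'b)^2
def mse(b):
    return np.mean((y - X @ b) ** 2)

beta_min = minimize(mse, x0=np.zeros(2)).x

print(beta_ols)  # ~ [1.0, 2.0]
print(beta_min)  # agrees with beta_ols up to optimizer tolerance
```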
1 The properties of $\widehat{\beta}$
The following is always true (the Frisch–Waugh–Lovell theorem): $$ \widehat{\beta}_{k} =\frac{Cov(Y,\tilde{x}_k)}{Var(\tilde{x}_{k})} $$ where $\tilde{x}_k$ is the residual from regressing $x_{k}$ on all the other covariates.
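A minimal simulation sketch of this identity (assuming numpy; the data-generating process and coefficients are illustrative): we partial $x_1$ out of the other covariates and check that $Cov(Y,\tilde{x}_1)/Var(\tilde{x}_1)$ reproduces the multiple-regression coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)                 # x2 correlated with x1
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(size=n)

# Coefficient on x1 from the full multiple regression
X = np.column_stack([np.ones(n), x1, x2])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Residual of x1 regressed on the other covariates (intercept and x2)
Z = np.column_stack([np.ones(n), x2])
x1_tilde = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]

# FWL identity: Cov(Y, x1_tilde) / Var(x1_tilde)
beta_fwl = np.cov(y, x1_tilde)[0, 1] / np.var(x1_tilde, ddof=1)

print(beta_full[1], beta_fwl)  # identical up to floating-point error
```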
In particular, for the univariate regression $Y=\alpha+\beta x+\varepsilon$, we have $$ \widehat{\beta}=\frac{Cov(Y,x)}{Var(x)} $$ However, we can't guarantee $\widehat{\beta}=\beta$ in this general case.
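The univariate special case can be checked the same way (a sketch assuming numpy; the data are simulated for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000)
y = 3.0 + 1.5 * x + rng.normal(size=1_000)

slope_cov = np.cov(y, x)[0, 1] / np.var(x, ddof=1)  # Cov(Y, x) / Var(x)
slope_ols = np.polyfit(x, y, deg=1)[0]              # OLS slope

print(slope_cov, slope_ols)  # identical up to floating-point error
```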
2 When does $\widehat{\beta}=\beta$ ?
The key condition is $E(X\varepsilon)=0$, that is, $\varepsilon$ and $X$ are uncorrelated (together with $E(\varepsilon)=0$, which holds automatically when the regression includes an intercept).
When $E(X\varepsilon)=0$ is met, not only do we have $\widehat{\beta}=\beta$, we can also guarantee that the fitted value is uncorrelated with the error: $$ E(X'\beta\cdot\varepsilon)=0 $$ Under the stronger condition $E(\varepsilon|X)=0$ (mean independence), we additionally have $$ E(\varepsilon\cdot f(X))=0 \hspace{2em}\text{for any function } f \text{ of } X $$ To satisfy $E(X\varepsilon)=0$, we have several choices. If any of the following conditions is met, then $E(X\varepsilon)=0$ is guaranteed (a simulation sketch illustrating the omitted-variable case follows the list):
- No omitted variables
- Omitted variables exist, but they are uncorrelated with $X$.
- $X'\beta=E(Y|X)$, i.e., the linear model coincides with the conditional expectation function; this is equivalent to $E(\varepsilon|X)=0$, the strongest condition of the three.
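To see how omitting a variable breaks $E(X\varepsilon)=0$, recall the classic omitted-variable-bias formula: if the true model is $Y=\beta x+\gamma z+\varepsilon$ but we regress $Y$ on $x$ alone, then $\widehat{\beta}\to\beta+\gamma\frac{Cov(x,z)}{Var(x)}$, which reduces to $\beta$ exactly when $Cov(x,z)=0$. A minimal simulation sketch (assuming numpy; the confounded design and all coefficients are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
z = rng.normal(size=n)                        # omitted variable
x = 0.8 * z + rng.normal(size=n)              # x correlated with z
y = 2.0 * x + 1.0 * z + rng.normal(size=n)    # true beta = 2, gamma = 1

# Short regression omits z, so z is absorbed into the error term and
# E(x * error) != 0: the estimate is biased.
beta_short = np.cov(y, x)[0, 1] / np.var(x, ddof=1)

# The bias matches gamma * Cov(x, z) / Var(x)
beta_predicted = 2.0 + 1.0 * np.cov(x, z)[0, 1] / np.var(x, ddof=1)

# Long regression includes z and recovers beta = 2
X = np.column_stack([np.ones(n), x, z])
beta_long = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(beta_short, beta_predicted)  # both ~ 2.49
print(beta_long)                   # ~ 2.0
```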