Synthetic Control

Advantages of synthetic control

The weights are transparent. In synthetic control, the contribution of each untreated group is clearly shown, while in regression the contribution is implicit unless you do a Bacon decomposition.

The choice of synthetic control requires no post‑treatment data.

The following explanation is adapted from Abadie et al. (2010).

$Y_{it}$ is the outcome for unit $i=1,\dots,J+1$ in period $t=1,\dots,T$. $Y_{it}^N$ and $Y_{it}^I$ are the potential outcomes without and with the treatment, respectively. The treated group is denoted by $i=1$ and the control groups (the donor pool) by $i=2,\dots,J+1$. Periods $1,\dots,T_{0}$ are pre‑treatment, where $1\leq T_{0}<T$, so the treatment takes effect from period $T_{0}+1$ onward.

$W$ is a $(J\times 1)$ vector $(w_{2},w_{3},\dots,w_{J+1})'$ with nonnegative elements that sum to one. The synthetic control at time $t$ is $\sum_{j=2}^{J+1} w_{j}Y_{jt}^N$. The estimated treatment effect at $t>T_{0}$ is $Y_{1t}^I - \sum_{j=2}^{J+1} w_{j}Y_{jt}^N$.
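As a small illustration of these two formulas, the sketch below builds a synthetic control and the resulting gap from made-up numbers. The arrays `Y_donors` and `Y_treated`, the weights `w`, and the value of `T0` are all hypothetical; in practice the weights come out of the estimation step rather than being typed in.

```python
import numpy as np

# Hypothetical toy panel: rows are donor units j = 2..J+1, columns are periods t = 1..T.
Y_donors = np.array([
    [10.0, 11.0, 12.0, 13.0],
    [20.0, 21.0, 22.0, 23.0],
    [30.0, 31.0, 32.0, 33.0],
])
Y_treated = np.array([18.0, 19.0, 17.0, 16.0])  # observed outcomes of unit 1
T0 = 2                                          # number of pre-treatment periods (assumed)

# Assumed weights: nonnegative and summing to one.
w = np.array([0.2, 0.8, 0.0])

# Synthetic control: sum_j w_j * Y_jt for every period t.
synthetic = w @ Y_donors

# Gap between the treated unit and its synthetic control.
gap = Y_treated - synthetic

# Estimated treatment effect in the post-treatment periods t > T0.
effect_post = gap[T0:]
print(effect_post)
```

With a good fit, the pre-treatment entries of `gap` are close to zero, so the post-treatment entries can be read as the estimated effect.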

The key task is to estimate $W$. Consider a $(k\times 1)$ vector $X_{1}=(Z_{1}, Y_{1}^{(1)}, Y_{1}^{(2)},\dots,Y_{1}^{(m)})$. $Z_{1}$ is a vector of covariates for the treated unit, typically predictors of $Y_{1}$. Each $Y_{1}^{(m)}$ is a linear combination of the pre‑treatment outcomes $Y_{1t}$. The most obvious choice is $Y_{1}^{(1)}=Y_{11}, \dots, Y_{1}^{(m)}=Y_{1T_{0}}$, i.e., the $Y_{1t}$ in every pre‑treatment year. Conceptually, $X_{1}$ captures the characteristics of the treated unit. Note there is no $t$ subscript in $X_{1}$: each covariate is summarized over the pre‑treatment periods (e.g., average per‑capita GDP).

Now consider a $(k\times J)$ matrix $X_{0}$. $X_{0}$ is built like $X_{1}$ but holds the same characteristics for every unit in the donor pool, one column per unit. We estimate $W$ by minimizing:

$$\min_{W} (X_{1}-X_{0}W)'V(X_{1}-X_{0}W) \hspace{3em}(1)$$

i.e., minimizing the distance in characteristics between the treated unit and the synthetic control. $V$ is typically a diagonal positive semidefinite matrix whose elements determine the importance of each characteristic in $X$. $V$ is chosen by:

$$V^*=\arg\min_{V} (Z_{1}-Z_{0}W^*(V))'(Z_{1}-Z_{0}W^*(V)) \hspace{3em}(2)$$

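where $Z$ is the pre‑treatment trajectory of $Y$, i.e., $Z_{1}=(Y_{11},\dots,Y_{1T_{0}})'$ for the treated unit, $Z_{0}$ is the corresponding $(T_{0}\times J)$ matrix for the donor pool, and $W^*(V)$ is the solution of Eq. (1) for a given $V$. Note that $Z$ here denotes outcomes, not the covariates that enter $X_{1}$. Eq. (2) says $V$ is determined by minimizing the pre‑treatment prediction error of the outcome.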

Once $V$ is given, compute $W$ with Eq. (1).
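The nested procedure can be sketched as follows. This is a minimal illustration, not the algorithm used by Abadie et al. (2010) or by any particular package: the inner step finds the weights that bring $X_{0}W$ as close as possible to $X_{1}$ for a given diagonal $V$, using SciPy's SLSQP solver with nonnegative weights constrained to sum to one; the outer step is a crude random search over candidate diagonals, which stands in for a proper optimizer. The function names `solve_weights` and `choose_v` and all of the data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def solve_weights(X1, X0, v):
    """Inner step: weights w >= 0 with sum(w) = 1 that minimize
    (X1 - X0 w)' diag(v) (X1 - X0 w) for a given importance vector v."""
    J = X0.shape[1]

    def loss(w):
        d = X1 - X0 @ w
        return d @ (v * d)

    result = minimize(
        loss,
        x0=np.full(J, 1.0 / J),       # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * J,      # nonnegative weights
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
        options={"ftol": 1e-12, "maxiter": 500},
    )
    return result.x


def choose_v(X1, X0, Z1, Z0, n_candidates=50, seed=0):
    """Outer step: random search over diagonal V (a simplification assumed
    here for brevity), scored by the pre-treatment outcome fit."""
    rng = np.random.default_rng(seed)
    k = X0.shape[0]
    # Candidates: equal importance plus random draws from the unit simplex.
    candidates = np.vstack([np.full(k, 1.0 / k),
                            rng.dirichlet(np.ones(k), size=n_candidates)])
    best_score, best_v, best_w = np.inf, None, None
    for v in candidates:
        w = solve_weights(X1, X0, v)
        err = Z1 - Z0 @ w
        score = err @ err             # squared pre-treatment prediction error
        if score < best_score:
            best_score, best_v, best_w = score, v, w
    return best_v, best_w, best_score


# Hypothetical toy data: 3 characteristics, 3 donors, 4 pre-treatment periods.
X0 = np.array([[1.0, 4.0, 9.0],
               [2.0, 1.0, 7.0],
               [3.0, 5.0, 2.0]])
Z0 = np.array([[10.0, 20.0, 30.0],
               [11.0, 22.0, 29.0],
               [12.0, 21.0, 31.0],
               [13.0, 23.0, 28.0]])
w_true = np.array([0.3, 0.7, 0.0])    # treated unit built as an exact mix of donors
X1 = X0 @ w_true
Z1 = Z0 @ w_true

v_star, w_star, fit = choose_v(X1, X0, Z1, Z0)
print(np.round(w_star, 3))
```

Because the toy treated unit is constructed as an exact convex combination of the donors, the recovered weights should sit close to `w_true`. With real data the fit is only approximate, and the choice of outer optimizer matters more.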

Only pre‑treatment data is needed

$X_{0}$, $X_{1}$, $Z_{1}$, and $Z_{0}$ are all from pre‑treatment periods. That’s an advantage of synthetic control: it lets researchers design studies without relying on post‑treatment outcomes.

Fix the treatment date and reassign the treatment across states. In the California Proposition 99 case (Abadie et al., 2010), the treatment date is 1988, the treated state is California, and the remaining control states form the donor pool. For a placebo test, the authors write:

“In each iteration we reassign in our data the tobacco control intervention to one of the 38 control states, shifting California to the donor pool. That is, we proceed as if one of the states in the donor pool would have passed a large-scale tobacco control program in 1988, instead of California.”

Then re‑estimate the model and plot the real and synthetic cigarette consumption before and after the treatment date:

As seen in the figure, California lies at the extreme negative end of the placebo distribution.

For a numerical test, compute:

  1. Pre‑treatment prediction error, measured as MSE (e.g., $\frac{1}{T_{0}}\sum_{t=1}^{T_{0}} (Y_{1t}-\sum_{j=2}^{J+1} w_{j}Y_{jt})^2$)
  2. Post‑treatment prediction error
  3. The ratio of post‑ to pre‑treatment MSE
  4. Use the distribution of ratios to compute the p‑value for the treated state
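The four steps above can be sketched as follows, assuming the gaps (actual minus synthetic outcome) have already been computed for the treated state and for every placebo run. The helper names `post_pre_mse_ratio` and `placebo_p_value` and the toy numbers are hypothetical. The p-value here is the share of units, the treated one included, whose ratio is at least as large as the treated unit's.

```python
import numpy as np


def post_pre_mse_ratio(gaps, T0):
    """Ratio of post- to pre-treatment MSE for one unit.
    `gaps` holds Y_it minus its synthetic control for t = 1..T;
    the first T0 entries are pre-treatment."""
    pre_mse = np.mean(gaps[:T0] ** 2)
    post_mse = np.mean(gaps[T0:] ** 2)
    return post_mse / pre_mse


def placebo_p_value(gaps_by_unit, treated_index, T0):
    """Share of units whose ratio is at least as extreme as the treated unit's."""
    ratios = np.array([post_pre_mse_ratio(g, T0) for g in gaps_by_unit])
    p_value = np.mean(ratios >= ratios[treated_index])
    return p_value, ratios


# Hypothetical gaps: row 0 is the treated unit, rows 1-3 are placebo runs.
gaps_by_unit = np.array([
    [1.0, -1.0, -6.0, -8.0],
    [2.0, -2.0, 2.0, -2.0],
    [1.0, 1.0, 2.0, 2.0],
    [0.5, -0.5, 1.0, 0.0],
])

p_value, ratios = placebo_p_value(gaps_by_unit, treated_index=0, T0=2)
print(ratios, p_value)
```

Dividing by the pre-treatment MSE keeps placebo units with a poor pre-treatment fit from looking extreme simply because their synthetic control never tracked them well.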

Keep the treated and untreated states fixed, but assign the treatment to a different date (typically a date before the actual treatment). Example: Abadie, Diamond, and Hainmueller (2015).
