Synthetic Control

The weights are transparent. In synthetic control, the contribution of each untreated group is shown explicitly as a weight, while in regression the contribution is implicit unless you do a (Goodman-)Bacon decomposition.
The synthetic control is chosen using pre-treatment data only; no post-treatment data are required.
1 Constructing a synthetic control
The following explanation is adapted from Abadie et al. (2010).
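$Y_{jt}$ is the outcome for unit $j$ at time $t$, where $j = 1, \dots, J+1$ and $t = 1, \dots, T$. $Y_{jt}^N$ and $Y_{jt}^I$ are the potential outcomes without and with the treatment. The treated group is denoted by $j = 1$ and the control groups (the donor pool) by $j = 2, \dots, J+1$. The treatment is applied at $T_0 + 1$, where $1 \le T_0 < T$.
$W = (w_2, \dots, w_{J+1})'$ is a $J \times 1$ vector with nonnegative elements that sum to one. The synthetic control at time $t$ is $\sum_{j=2}^{J+1} w_j Y_{jt}$. The treatment effect at $t > T_0$ is $\hat{\alpha}_{1t} = Y_{1t} - \sum_{j=2}^{J+1} w_j^* Y_{jt}$.
The key task is to estimate $W$. Consider a $k \times 1$ vector $X_1 = (Z_1', \bar{Y}_1^{K_1}, \dots, \bar{Y}_1^{K_M})'$. $Z_1$ is a vector of covariates for the treated, typically predictors of $Y_{1t}$. $\bar{Y}_1^{K_m}$ is a linear combination of pre-treatment $Y_{1t}$. The most obvious choice is $\bar{Y}_1^{K_1} = Y_{11}, \dots, \bar{Y}_1^{K_{T_0}} = Y_{1T_0}$, i.e., the $Y_{1t}$ in every pre-treatment year. Conceptually, $X_1$ captures the characteristics of the treated. Note there is no $t$ subscript in $Z_1$, indicating it is an average over pre-treatment periods (e.g., average per-capita GDP).
Now consider a $k \times J$ matrix $X_0$. $X_0$ is similar to $X_1$ but captures the characteristics of all units in the donor pool. We estimate $W$ by minimizing:
$$\|X_1 - X_0 W\|_V = \sqrt{(X_1 - X_0 W)' \, V \, (X_1 - X_0 W)} \tag{1}$$
i.e., minimizing the distance in characteristics between the treated and the synthetic control. $V$ is typically a positive semidefinite diagonal $k \times k$ matrix whose elements determine the importance of each covariate in $X_1$ and $X_0$. $V$ is chosen by:
$$V^* = \arg\min_{V} \left(Y_1^{\text{pre}} - Y_0^{\text{pre}} W^*(V)\right)' \left(Y_1^{\text{pre}} - Y_0^{\text{pre}} W^*(V)\right) \tag{2}$$
where $W^*(V)$ is the minimizer of Eq. (1) for a given $V$, $Y_1^{\text{pre}}$ is the pre-treatment trajectory of $Y_{1t}$, i.e., $Y_1^{\text{pre}} = (Y_{11}, \dots, Y_{1T_0})'$, and $Y_0^{\text{pre}}$ is the corresponding $T_0 \times J$ matrix for the donor pool. Eq. (2) says $V$ is determined by minimizing the pre-treatment prediction error of the outcome.
Once $V^*$ is given, compute $W^* = W^*(V^*)$ by minimizing Eq. (1).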
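The inner step, computing the donor weights for a fixed importance matrix, can be sketched as follows. This is a minimal illustration, not the procedure used by Abadie et al. (2010) or by any particular package: it assumes the diagonal of the importance matrix is already given, and it swaps in plain projected gradient descent where a dedicated quadratic-programming solver would normally be used. The function names (`project_to_simplex`, `synth_weights`) and the toy data are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum_j w_j = 1}."""
    theta, running_sum = 0.0, 0.0
    for i, u in enumerate(sorted(v, reverse=True), start=1):
        running_sum += u
        candidate = (running_sum - 1.0) / i
        if u - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def synth_weights(x1, x0, v_diag, n_iter=2000):
    """Minimise sum_r v_r * (x1[r] - sum_j x0[r][j] * w[j])**2 over the simplex.

    x1     : k predictor values for the treated unit
    x0     : k rows, each holding the J donor values of one predictor
    v_diag : the k diagonal entries of the importance matrix (assumed given)
    """
    k, n_donors = len(x1), len(x0[0])
    # Conservative step size: 2 * trace(X0' V X0) bounds the curvature.
    lipschitz = 2.0 * sum(v_diag[r] * x0[r][j] ** 2
                          for r in range(k) for j in range(n_donors))
    step = 1.0 / lipschitz
    w = [1.0 / n_donors] * n_donors  # start from equal weights
    for _ in range(n_iter):
        resid = [x1[r] - sum(x0[r][j] * w[j] for j in range(n_donors))
                 for r in range(k)]
        grad = [-2.0 * sum(v_diag[r] * x0[r][j] * resid[r] for r in range(k))
                for j in range(n_donors)]
        w = project_to_simplex([w[j] - step * grad[j] for j in range(n_donors)])
    return w


# Toy data (made up): three predictors, three donors. The treated unit is an
# exact 30/70 mix of donors 1 and 2, so the weights approach (0.3, 0.7, 0).
x0 = [[1.0, 0.0, 0.0],
      [0.0, 1.0, 0.0],
      [0.0, 0.0, 1.0]]
x1 = [0.3, 0.7, 0.0]
weights = synth_weights(x1, x0, v_diag=[1.0, 1.0, 1.0])
```

The projection step enforces the two constraints on the weights (nonnegative, summing to one) at every iteration. Choosing the importance matrix itself would wrap this function in an outer search that scores each candidate by its pre-treatment outcome fit; that outer loop is omitted here.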
2 Placebo falsification
2.1 Placebo states
Fix the treatment date and reassign the treatment to each state in the donor pool in turn. In the California Proposition 99 case (Abadie et al., 2010), the treatment date is 1988, the treated state is California, and the donor pool consists of 38 control states. The authors describe the placebo test as follows:
“In each iteration we reassign in our data the tobacco control intervention to one of the 38 control states, shifting California to the donor pool. That is, we proceed as if one of the states in the donor pool would have passed a large-scale tobacco control program in 1988, instead of California.”
Then re-estimate the model for each placebo state and plot the gap between actual and synthetic cigarette consumption before and after the treatment date:

As seen in the figure, California lies at the extreme negative end of the distribution of post-treatment gaps.
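For a numerical test, compute for every state (the treated state and each placebo state):
- Pre-treatment prediction error (e.g., the gap between the actual and synthetic outcome for $t \le T_0$, measured as MSE)
- Post-treatment prediction error (the same gap for $t > T_0$)
- The ratio of post- to pre-treatment MSE
- Use the distribution of ratios to compute the p-value for the treated state, i.e., the share of states whose ratio is at least as large as the treated state's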
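The steps above can be sketched in a few lines. This is an illustrative sketch only: the function names (`mse`, `placebo_p_value`), the state labels, and the gap values are made up, and it assumes the gaps (actual minus synthetic outcome) have already been computed for the treated state and for every placebo state. The treated state is counted in the distribution, so the smallest attainable p-value is one over the number of states.

```python
def mse(gaps):
    """Mean squared gap between actual and synthetic outcomes."""
    return sum(g * g for g in gaps) / len(gaps)


def placebo_p_value(gaps_by_unit, n_pre, treated):
    """Permutation-style p-value from post/pre MSE ratios.

    gaps_by_unit : dict mapping unit name -> list of gaps ordered in time
    n_pre        : number of pre-treatment periods
    treated      : name of the unit that actually received the treatment
    """
    ratios = {unit: mse(gaps[n_pre:]) / mse(gaps[:n_pre])
              for unit, gaps in gaps_by_unit.items()}
    # Share of units whose ratio is at least as large as the treated unit's.
    n_extreme = sum(r >= ratios[treated] for r in ratios.values())
    return n_extreme / len(ratios), ratios


# Hypothetical gaps: two pre-treatment and two post-treatment periods.
gaps_by_unit = {
    "treated": [1.0, -1.0, -10.0, -10.0],
    "placebo_a": [1.0, -1.0, 1.0, -1.0],
    "placebo_b": [2.0, -2.0, 1.0, 1.0],
    "placebo_c": [1.0, 1.0, 2.0, -2.0],
}
p_value, ratios = placebo_p_value(gaps_by_unit, n_pre=2, treated="treated")
print(p_value)  # 0.25: only the treated unit reaches its own ratio
```

Dividing by the pre-treatment MSE keeps a placebo state with a poor pre-treatment fit from looking extreme merely because its synthetic control never tracked it well.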
2.2 Placebo dates
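Keep treated and untreated states fixed, but choose different treatment dates (typically pre-treatment dates). A sizable estimated effect at a date when no treatment occurred would cast doubt on the main result. Example: Abadie, Diamond, and Hainmueller (2015).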