σ StatsDoge Causal inference workflows
Workflow·4 steps·branched

Synthetic control, the tidy way — weights, gaps and placebo inference (tidysynth)

Source tidysynth — Eric Dunford
Summary by StatsDoge

Build the treated unit's counterfactual as a convex blend of donors that matches its pre-treatment trajectory; read the gap as the effect. Inference is non-parametric: re-run the procedure treating each donor in turn as the placebo-treated unit, then rank the real unit's post/pre RMSPE ratio against that placebo distribution.

1

Input · what goes in

A balanced panel (unit × time) with one treated unit switching on at a known time, a donor pool, and a few pre-treatment predictors.


Format — long panel: unit, time, outcome, predictors; one treated unit after T0.

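library(tidysynth)

out <- panel %>%
  # declare the panel structure, the treated unit, and the intervention time
  synthetic_control(outcome = cigsale, unit = state, time = year,
                    i_unit = 'California', i_time = 1988) %>%
  generate_predictor(...) %>%  # pre-treatment predictors (arguments elided in the source)
  generate_weights() %>%       # fit donor and predictor weights
  generate_control()           # build the synthetic series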
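As a hedged illustration of the long format described above, the base R sketch below builds a toy panel and checks that it is balanced. The data frame `toy_panel`, its column names, and every value in it are invented for illustration; none of it comes from tidysynth or its bundled data.

```r
# Toy long panel: 3 units x 4 years (all values are invented).
toy_panel <- expand.grid(
  unit = c("A", "B", "C"),
  year = 2001:2004,
  stringsAsFactors = FALSE
)
toy_panel$outcome <- c(10, 12, 11, 11, 13, 12, 9, 14, 13, 8, 15, 14)

# Balanced means every unit-year cell appears exactly once.
cell_counts <- table(toy_panel$unit, toy_panel$year)
is_balanced <- all(cell_counts == 1)

# One treated unit, switched on from T0 onward (hypothetical choice).
treated_unit <- "A"
t0 <- 2003
toy_panel$treated <- toy_panel$unit == treated_unit & toy_panel$year >= t0

print(is_balanced)
print(toy_panel[toy_panel$treated, ])
```

Running a check like this before fitting catches missing or duplicated unit-year rows early.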
2

Pipeline · the recipe ⑂ has parallel branches


1
Data prep

Panel, treated unit, donor pool

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Pick the treated unit and intervention time; the rest of the units form the donor pool. Identification rests on a good pre-treatment fit.

Formula
\hat Y^{N}_{1t}=\textstyle\sum_{j\ge 2}\hat w_j\,Y_{jt}
Key code
synthetic_control(outcome = cigsale, unit = state,
                  time = year, i_unit = 'California', i_time = 1988)
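To make the treated/donor split and the counterfactual formula above concrete, here is a minimal base R sketch. It is not what tidysynth does internally. The panel, the names `toy_panel`, `treated_unit` and `t0`, the rule that years before `t0` count as pre-treatment, and the donor weights `w` are all assumptions made for this example.

```r
# Toy long panel (invented values): unit A is treated from 2003.
toy_panel <- data.frame(
  unit    = rep(c("A", "B", "C"), times = 4),
  year    = rep(2001:2004, each = 3),
  outcome = c(10, 12, 11, 11, 13, 12, 9, 14, 13, 8, 15, 14)
)
treated_unit <- "A"
t0 <- 2003

# Outcome matrix with one row per year and one column per unit.
y_wide <- tapply(toy_panel$outcome, list(toy_panel$year, toy_panel$unit), sum)

# Everything except the treated unit is the donor pool.
donors <- setdiff(colnames(y_wide), treated_unit)
pre    <- as.integer(rownames(y_wide)) < t0

y1_pre <- y_wide[pre, treated_unit]           # treated unit, pre-period
y0_pre <- y_wide[pre, donors, drop = FALSE]   # donors, pre-period

# Given donor weights (hypothetical here), the synthetic outcome is a
# weighted sum of donor outcomes in each year.
w <- c(B = 0.5, C = 0.5)
y_synth <- as.vector(y_wide[, donors, drop = FALSE] %*% w[donors])
names(y_synth) <- rownames(y_wide)
print(y_synth)
```

In a real analysis the weights are estimated in the next step rather than chosen by hand.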

2
Estimation

Solve for donor & predictor weights

The core estimate — where the causal quantity itself is computed.

What happens here

Convex weights make the synthetic unit match the treated unit's pre-period predictors as closely as possible.

Formula
\hat W=\arg\min_{W}\ \lVert X_1-X_0W\rVert_V\ \ \text{s.t.}\ w_j\ge 0,\ \textstyle\sum_j w_j=1
Key code
generate_weights() %>% generate_control()
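The constrained problem above can be sketched in base R. This is an illustration under stated assumptions, not tidysynth's optimizer: it sets V to the identity, enforces the simplex constraint with a softmax reparameterization, and calls `optim()` on invented data. The names `X0`, `X1`, `w_true` and `to_weights` are hypothetical.

```r
# Toy predictor data (invented): 4 predictors (rows), 3 donors (columns).
X0 <- matrix(c(1.0, 2.0, 0.5,
               0.3, 1.5, 2.2,
               2.1, 0.4, 1.0,
               0.7, 0.9, 1.8),
             nrow = 4, byrow = TRUE,
             dimnames = list(NULL, c("B", "C", "D")))
w_true <- c(0.6, 0.3, 0.1)
X1 <- as.vector(X0 %*% w_true)   # treated unit is an exact convex blend

# Softmax maps unconstrained parameters onto the simplex, so the weights
# are non-negative and sum to one by construction.
to_weights <- function(theta) {
  z <- exp(c(0, theta))          # first coordinate fixed for identifiability
  z / sum(z)
}

# Squared distance between treated and synthetic predictors, with V = I.
loss <- function(theta) sum((X1 - X0 %*% to_weights(theta))^2)

fit <- optim(par = c(0, 0), fn = loss, method = "BFGS")
w_hat <- setNames(to_weights(fit$par), colnames(X0))
print(round(w_hat, 3))
```

Because the toy treated unit is built as an exact blend of the donors, the fitted weights should land close to `w_true`. With real data the pre-period fit is only approximate, and a softmax cannot return weights of exactly zero, so a dedicated quadratic-programming solver is the better tool there.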

3
Reporting

Read the gap: observed − synthetic

Reporting — turn the numbers into a figure or table a reader can act on.

What happens here

The post-treatment gap between the treated unit and its synthetic twin is the estimated effect over time.

Formula
\hat\tau_t=Y_{1t}-\hat Y^{N}_{1t}
Key code
plot_trends(out); plot_differences(out)
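The gap formula above is a plain subtraction, which the following base R sketch spells out on invented series. The vectors `y_obs` and `y_syn`, the year range, and the convention that `t0` is the first treated year are assumptions for illustration. tidysynth appears to expose the observed and synthetic series through `grab_synthetic_control()`; check the package documentation for the exact column names before relying on it.

```r
# Invented series: observed outcome for the treated unit and its synthetic twin.
years <- 2001:2006
y_obs <- c(10.0, 11.0, 12.0, 10.5, 9.5, 9.0)
y_syn <- c(10.1, 10.9, 12.0, 12.5, 13.0, 13.5)
t0    <- 2004                       # first treated year (assumed convention)

gap  <- y_obs - y_syn               # tau_hat_t = Y_1t - Y^N_1t
post <- years >= t0

pre_fit_rmse    <- sqrt(mean(gap[!post]^2))  # how well the twin tracks pre-T0
avg_post_effect <- mean(gap[post])           # average estimated effect after T0

print(data.frame(year = years, gap = gap, post = post))
print(avg_post_effect)
```

A small pre-period error next to a large post-period gap is the pattern that makes the estimate credible; a poor pre-period fit undermines it.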

4
Inference

Placebo permutation across donors

Uncertainty quantification — standard errors, intervals, and aggregation.

What happens here

Re-run the whole procedure pretending each donor was treated; rank the real gap against that placebo distribution via the RMSPE ratio.

Formula
r_1=\dfrac{\mathrm{RMSPE}_{\text{post}}}{\mathrm{RMSPE}_{\text{pre}}},\qquad p=\frac{\#\{j: r_j\ge r_1\}}{J+1}
Key code
plot_placebos(out); plot_mspe_ratio(out)
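The ratio and the permutation p-value above can be computed by hand once the gaps are available. The base R sketch below does this on an invented gap matrix; the unit names, the values, and the `rmspe` helper are hypothetical. tidysynth appears to report comparable quantities through `grab_significance()`, apparently based on MSPE rather than RMSPE, which leaves the ranking unchanged because the square root is monotone.

```r
# Invented gaps (observed minus synthetic): rows are years, columns are units.
# Column "treated" is the real treated unit; the rest are donor placebos.
years <- 2001:2006
gaps <- cbind(
  treated = c(-0.1,  0.1,  0.0, -2.0, -3.5, -4.5),
  donor_b = c( 0.2, -0.1,  0.1,  0.3, -0.2,  0.4),
  donor_c = c(-0.3,  0.2, -0.2, -0.5,  0.6, -0.4),
  donor_d = c( 0.1,  0.1, -0.1,  0.2,  0.1, -0.3)
)
post <- years >= 2004

rmspe <- function(x) sqrt(mean(x^2))

# r_j = post-period RMSPE divided by pre-period RMSPE, per unit.
ratio <- apply(gaps, 2, function(g) rmspe(g[post]) / rmspe(g[!post]))

# Share of units (treated included) whose ratio is at least the treated one.
p_value <- mean(ratio >= ratio["treated"])

print(round(ratio, 2))
print(p_value)
```

With J donors the smallest attainable p-value is 1/(J+1), so a small donor pool limits how strong the evidence can be.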

3

Output · what you get 2 figures

Fig 1. Placebo gaps (Abadie's California Prop-99): the treated unit's gap (magenta) plunges past nearly every donor-pool placebo after the 1988 intervention.
Fig 2. How the synthetic control is built: convex weights on donor states (Utah, Nevada, …) and on the predictor variables.

Figures reproduced from tidysynth — Eric Dunford — unofficial community showcase; all credit to the original authors.

⚠️ Unofficial community showcase of tidysynth. Not affiliated with the authors; all credit to them.

Build a synthetic version of the treated unit from a convex blend of donors, read the treated-minus-synthetic gap, and test it against placebos run on every donor.

Discussion (2)

  • 2

    The pipe-friendly API finally makes SC readable end to end. plot_placebos() as the inference step is the part people skip.

  • 0

    Showing the donor and predictor weights side by side is underrated — it's the honest answer to 'where does the counterfactual come from?'.

    1

    Agreed. If one donor carries 0.8 of the weight, the 'synthetic' control is basically that one unit and you should say so.