Workflow·4 steps

Covariate balance for matching & weighting (cobalt)

Source cobalt — Noah Greifer
Summary by StatsDoge

Before you trust an observational estimate, prove balance: SMDs, overlap, and a Love plot before vs after adjustment.

1

Input · what goes in

A binary treatment and the covariates you'll adjust for (e.g. the Lalonde job-training data).

Format: one row per unit, with a binary treatment W ∈ {0,1} (the treat column in the Lalonde data) and covariates X.

  W   age  educ   race   re74    re75
  1    37    11   black     0       0
  0    30    12   white  4100    3800
2

Pipeline · the recipe

1
Data prep

Treatment W + covariates X

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Binary treatment plus the covariates to balance (Lalonde: age, educ, race, re74, re75).

Reads from the input data · Feeds into #2
Key code
data("lalonde", package = "cobalt")
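Before estimating anything, it can help to confirm the inputs match the format described above. The following is a minimal base-R sketch; check_inputs is a hypothetical helper written for this example, not a cobalt function, and the column names assume the lalonde data shipped with cobalt.

```r
# Hypothetical helper (not part of cobalt): basic sanity checks on the inputs.
check_inputs <- function(data, treat, covs) {
  stopifnot(all(c(treat, covs) %in% names(data)))
  w <- data[[treat]]
  list(
    # TRUE only if the treatment takes exactly the values 0 and 1
    binary_treatment = all(w %in% c(0, 1)) && length(unique(w)) == 2,
    # number of control (0) and treated (1) units
    group_sizes      = table(w),
    # count of missing values per covariate
    missing_by_cov   = colSums(is.na(data[covs]))
  )
}

# Usage on the Lalonde data, if cobalt is installed
if (requireNamespace("cobalt", quietly = TRUE)) {
  data("lalonde", package = "cobalt")
  print(check_inputs(lalonde, "treat", c("age", "educ", "race", "re74", "re75")))
}
```

A non-binary treatment or missing covariate values would need handling before the weighting step.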
2
Data prep

Estimate weights / matches (WeightIt / MatchIt)

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Fit a propensity model; produce IPW weights or matched sets.

Reads from #1 · Feeds into #3
Key code
library(WeightIt)
w <- weightit(treat ~ age + educ + race + re74 + re75, lalonde, estimand = "ATT")
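The step mentions matched sets as an alternative to weights, but only the weighting call is shown. A rough sketch of the matching route with MatchIt follows. The choice of 1:1 nearest-neighbor matching is an assumption made for illustration, not something the source prescribes.

```r
library(MatchIt)
library(cobalt)

data("lalonde", package = "cobalt")

# Nearest-neighbor propensity score matching for the ATT
# (the matching method here is an illustrative assumption)
m <- matchit(treat ~ age + educ + race + re74 + re75,
             data = lalonde, method = "nearest", estimand = "ATT")

# cobalt accepts the matchit object directly, just like a weightit object
bal.tab(m, un = TRUE)
```

Because cobalt reads both object types, the balance steps that follow look the same whether the upstream step produced weights or matches.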
3
Diagnostic / pre-tests

[cobalt] Balance tables & Love plots — bal.tab()

A pre-flight check — run this before trusting any estimate downstream.

What happens here

bal.tab(): standardized mean differences, adjusted vs unadjusted.

Formula
\mathrm{SMD}=\dfrac{\bar X_{\text{treat}}-\bar X_{\text{ctrl}}}{\sqrt{(s^2_{\text{treat}}+s^2_{\text{ctrl}})/2}}
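To make the formula concrete, here is a minimal base-R sketch that computes the pooled-SD SMD for a single covariate. The helper name smd_pooled is hypothetical, not a cobalt function. Note that cobalt picks its standardization factor from the estimand (the s.d.denom argument), so with estimand = "ATT" it defaults to the treated-group SD and bal.tab(w) need not reproduce this pooled version exactly.

```r
# Hypothetical helper: SMD using the pooled-SD denominator from the formula.
smd_pooled <- function(x, treat) {
  x1 <- x[treat == 1]  # treated units
  x0 <- x[treat == 0]  # control units
  (mean(x1) - mean(x0)) / sqrt((var(x1) + var(x0)) / 2)
}

# Unadjusted SMD for age in the Lalonde data, if cobalt is installed.
# One way to cross-check:
#   cobalt::bal.tab(treat ~ age, data = lalonde, s.d.denom = "pooled")
if (requireNamespace("cobalt", quietly = TRUE)) {
  data("lalonde", package = "cobalt")
  print(smd_pooled(lalonde$age, lalonde$treat))
}
```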
The estimator

bal.tab() assesses covariate balance before and after matching or weighting, reporting standardized mean differences and KS statistics; love.plot() turns the same results into the publication-ready Love plot.

Reads from #2 · Feeds into #4
Key code
library(cobalt)
bal.tab(w, un = TRUE)  # un = TRUE also reports the unadjusted (pre-weighting) balance
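The description above also mentions KS statistics and a balance threshold, which the one-line call does not request. A sketch of a fuller call, assuming the same weighting model as step 2:

```r
library(WeightIt)
library(cobalt)

data("lalonde", package = "cobalt")
w <- weightit(treat ~ age + educ + race + re74 + re75,
              data = lalonde, estimand = "ATT")

# "m" = mean differences, "ks" = Kolmogorov-Smirnov statistics;
# thresholds flags covariates whose adjusted mean difference exceeds 0.1
b <- bal.tab(w, un = TRUE, stats = c("m", "ks"), thresholds = c(m = 0.1))
print(b)

# The balance table itself is a data frame, handy for custom reporting
head(b$Balance)
```

Keeping the result in an object rather than only printing it makes it easier to export the table to a report.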
4
Reporting

love.plot()

Reporting — turn the numbers into a figure or table a reader can act on.

What happens here

Love plot of |SMD| with the 0.1 balance threshold; show before vs after.

Reads from #3 · Feeds into the final output
Key code
# abs = TRUE plots |SMD| so the 0.1 threshold is a single reference line
love.plot(w, abs = TRUE, thresholds = c(m = 0.1))
3

Output · what you get · 4 figures

Fig 1. A Love plot of |standardized mean difference| per covariate, adjusted vs unadjusted, against the 0.1 threshold.
Fig 2. Distributional balance for a covariate before vs after adjustment.
Fig 3. Covariate balance across the sample, adjusted vs unadjusted.
Fig 4. Overlap / propensity distribution between treatment groups.
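The pipeline's key code covers only the Love plot. Distributional balance and overlap plots of the kind listed above can be drawn with cobalt's bal.plot(). This is a sketch under the same weighting model as step 2; whether the showcased figures were produced with exactly these arguments is an assumption.

```r
library(WeightIt)
library(cobalt)

data("lalonde", package = "cobalt")
w <- weightit(treat ~ age + educ + race + re74 + re75,
              data = lalonde, estimand = "ATT")

# Density of one covariate by treatment group, before and after weighting
p_age <- bal.plot(w, var.name = "age", which = "both")

# Mirrored histograms of the propensity score to inspect overlap;
# "prop.score" is the name cobalt gives the score from a weightit object
p_ps <- bal.plot(w, var.name = "prop.score", which = "both",
                 type = "histogram", mirror = TRUE)

print(p_age)
print(p_ps)
```

Both calls return ggplot objects, so they can be restyled or saved with the usual ggplot2 tools.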

Figures reproduced from cobalt — Noah Greifer — unofficial community showcase; all credit to the original authors.

From the cobalt vignette (Lalonde). Estimate weights/matches, then check that adjustment actually balanced the covariates. Unofficial summary.

Discussion (2)

  • A causal estimate from observational data without a Love plot is a vibe, not evidence. cobalt makes the balance check trivial.

      100%. I put love.plot() in every appendix now. Reviewers love it.

  • Works with MatchIt, WeightIt, raw weights… one balance API for everything. Underrated.