σ StatsDoge Causal inference workflows
Workflow·4 steps·branched

Bayesian regression discontinuity with credible intervals (CausalPy)

Source CausalPy — PyMC Labs
Summary by StatsDoge

Fit a trend in the running variable on each side of the cutoff together with a treatment indicator; the posterior of the indicator's coefficient is the estimated discontinuity. The workflow reports a 94% credible interval and refits on a narrow window to show how the estimate moves with bandwidth: the whole RD ballgame.

1

Input · what goes in

A running variable with a known threshold, a binary treatment that switches on at the cutoff, and an outcome.


Format: x (running variable), y (outcome), treated = 1{x ≥ cutoff}.

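import causalpy as cp
# df: pandas DataFrame with columns x, y, treated
result = cp.pymc_experiments.RegressionDiscontinuity(
    df, formula='y ~ 1 + x + treated', running_variable_name='x',
    model=cp.pymc_models.LinearRegression(),  # the experiment needs a model to fit
    treatment_threshold=0.5)
result.plot()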
2

Pipeline · the recipe ⑂ has parallel branches


1
Data prep

Running variable, threshold, outcome

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

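Treatment switches on at the cutoff c. Identification rests on everything else that affects the outcome varying smoothly through c, so that any jump in E[Y | X = x] at c can be attributed to treatment.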

Formula
\tau=\lim_{x\downarrow c}\mathbb E[Y\mid X{=}x]-\lim_{x\uparrow c}\mathbb E[Y\mid X{=}x]
Reads from the input data Feeds into the final output
Key code
import causalpy as cp
# x: running variable, c: cutoff, treated: indicator for x >= c
c = 0.5
df['treated'] = df['x'] >= c
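To make the expected shape concrete, here is a minimal sketch that simulates a sharp design with the three columns described above. The effect size, trend and noise level are made-up values used only for illustration, and the rows are plain dictionaries; passing `rows` to `pandas.DataFrame` would give a frame with the same three columns.

```python
import random

random.seed(0)
c = 0.5          # cutoff, matching treatment_threshold in the example
true_jump = 1.0  # hypothetical effect size, used only to simulate data

rows = []
for _ in range(200):
    x = random.random()   # running variable in [0, 1)
    treated = x >= c      # sharp design: treatment is a deterministic function of x
    y = 2.0 * x + true_jump * treated + random.gauss(0.0, 0.2)
    rows.append({"x": x, "y": y, "treated": treated})

n_treated = sum(r["treated"] for r in rows)
print(f"{n_treated} of {len(rows)} rows are at or above the cutoff")
```

Because `treated` is derived from `x` and `c`, it is worth checking that a real dataset's treatment column agrees with the threshold before fitting.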

2
Estimation

A Bayesian model on each side of the cutoff

The core estimate — where the causal quantity itself is computed.

What happens here

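Fit a trend in the running variable plus a treatment indicator. The coefficient on the indicator is the discontinuity; it gets a prior and, after fitting, a full posterior. With the formula shown here the two sides share one slope; adding an interaction term such as x:treated lets the slope differ on each side.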

Formula
Y\sim\mathcal N\big(f(X)+\tau\,\mathbb 1\{X\ge c\},\ \sigma^2\big)
Reads from the input data Feeds into the final output
Key code
result = cp.pymc_experiments.RegressionDiscontinuity(
    df, formula='y ~ 1 + x + treated',
    model=cp.pymc_models.LinearRegression(),  # the experiment needs a model to fit
    running_variable_name='x', treatment_threshold=0.5)
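To see why the coefficient on the indicator is the jump, the sketch below fits the same mean structure, y = b0 + b1*x + tau*1{x >= c}, by ordinary least squares on simulated data. This is a non-Bayesian stand-in written with the standard library only, not what CausalPy does internally: there is no prior and no posterior here, just the point estimate. The data-generating values and helper names are assumptions made for the illustration.

```python
import random

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_rd_ols(xs, ys, cutoff):
    """Least-squares fit of y = b0 + b1*x + tau*1{x >= cutoff}; returns [b0, b1, tau]."""
    design = [[1.0, x, 1.0 if x >= cutoff else 0.0] for x in xs]
    p = 3
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)

random.seed(1)
cutoff, true_tau = 0.5, 1.0  # hypothetical values for the simulation
xs = [random.random() for _ in range(500)]
ys = [0.3 + 2.0 * x + true_tau * (x >= cutoff) + random.gauss(0.0, 0.1) for x in xs]

b0, b1, tau_hat = fit_rd_ols(xs, ys, cutoff)
print(f"estimated jump at the cutoff: {tau_hat:.3f}")
```

The Bayesian version replaces this single number with a distribution over tau, which is what the next step summarises.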

3
Inference

Posterior & 94% credible interval for the jump

Uncertainty quantification — standard errors, intervals, and aggregation.

What happens here

Instead of a single number ± SE, you get the whole posterior of the discontinuity — summarised as a credible interval.

Formula
p(\tau\mid\text{data})\propto p(\text{data}\mid\tau)\,p(\tau);\quad\text{report }CI_{94\%}
Reads from the input data Feeds into the final output
Key code
result.summary()   # discontinuity, CI_94%
result.plot()
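As a rough illustration of what a 94% interval summarises, the sketch below computes a highest-density interval from a list of draws. The draws are simulated from a normal distribution purely as a stand-in for posterior samples of the discontinuity; in practice they would come from the fitted model. The `hdi` helper is written for this example and is not a CausalPy function.

```python
import random

def hdi(draws, prob=0.94):
    """Narrowest interval that contains about `prob` of the draws."""
    s = sorted(draws)
    n = len(s)
    span = int(prob * n)   # number of steps between the interval's endpoints
    n_windows = n - span   # candidate intervals to compare
    best = min(range(n_windows), key=lambda i: s[i + span] - s[i])
    return s[best], s[best + span]

random.seed(2)
# Stand-in for posterior draws of the jump (assumed mean 1.0, sd 0.15).
draws = [random.gauss(1.0, 0.15) for _ in range(4000)]

lo, hi = hdi(draws, prob=0.94)
inside = sum(lo <= d <= hi for d in draws) / len(draws)
prob_positive = sum(d > 0 for d in draws) / len(draws)

print(f"94% interval: [{lo:.2f}, {hi:.2f}]")
print(f"share of draws inside: {inside:.3f}")
print(f"posterior probability the jump is positive: {prob_positive:.3f}")
```

With the full set of draws in hand, statements such as "the probability the jump is positive" come for free, which is the practical advantage over a point estimate with a standard error.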

4
Robustness check

Bandwidth & functional-form sensitivity

A robustness check — does the headline result survive a different lens?

What happens here

Refit on a narrow window and with different trends; a discontinuity that survives is one you can defend.

Formula
\hat\tau(h)\ \text{stable across bandwidths }h\ \Rightarrow\ \text{credible}
Reads from the input data Feeds into the final output
Key code
cp.pymc_experiments.RegressionDiscontinuity(
    df, ..., bandwidth=0.2)   # fit only on data within 0.2 of the cutoff
# epsilon only sets how far from the threshold the jump is evaluated
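The bandwidth check can be sketched without any modelling library: refit on progressively narrower windows around the cutoff and compare the estimated jumps. The example below uses a separate straight-line fit on each side as a simple stand-in for the Bayesian model, and a deliberately curved simulated trend so that wide windows are visibly biased. All data-generating values and function names are assumptions made for the illustration.

```python
import math
import random

def line_fit(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def jump_at_cutoff(xs, ys, cutoff, h):
    """Fit a line on each side using only |x - cutoff| <= h, then difference the fits at the cutoff."""
    left = [(x, y) for x, y in zip(xs, ys) if cutoff - h <= x < cutoff]
    right = [(x, y) for x, y in zip(xs, ys) if cutoff <= x <= cutoff + h]
    a_l, b_l = line_fit(*zip(*left))
    a_r, b_r = line_fit(*zip(*right))
    return (a_r + b_r * cutoff) - (a_l + b_l * cutoff)

random.seed(3)
cutoff, true_tau = 0.5, 1.0  # hypothetical values for the simulation
xs = [random.random() for _ in range(2000)]
# Curved trend: a straight line per side is misspecified on wide windows.
ys = [math.exp(2.0 * x) + true_tau * (x >= cutoff) + random.gauss(0.0, 0.1) for x in xs]

estimates = {h: jump_at_cutoff(xs, ys, cutoff, h) for h in (0.5, 0.3, 0.2, 0.1)}
for h, est in estimates.items():
    print(f"bandwidth {h:.1f}: estimated jump {est:.3f}")
```

Narrow windows reduce the bias from a misspecified trend but use fewer points, so their estimates are noisier; reporting several bandwidths side by side shows both effects.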

3

Output · what you get 2 figures

Fig 1. Bayesian RD: the posterior-mean fit with a 94% credible band on each side of the cutoff. The jump at the red threshold is the estimated effect.
Fig 2. A bandwidth-restricted refit using only points near the cutoff: the discontinuity estimate shifts, which is exactly why bandwidth is a robustness check, not a footnote.

Figures reproduced from CausalPy — PyMC Labs — unofficial community showcase; all credit to the original authors.

⚠️ Unofficial community showcase of causalpy. Not affiliated with the authors; all credit to them.

Fit a model on each side of the cutoff, put a posterior on the jump, and report a credible interval for the discontinuity — plus an honest look at how it moves with the bandwidth.

Discussion (2)

  • A full posterior on the discontinuity beats a point estimate ± SE for communicating uncertainty to non-stats stakeholders.

  • Nice that the bandwidth refit is built in as a robustness step. The estimate moving with the window is the whole RD ballgame.

      ◦ Exactly. I always show the narrow-window and full-sample fits next to each other so nobody thinks the number is bandwidth-free.