Workflows — StatsDoge

⌥ 4 steps ⑂ 1 branch Index: 195 52 peers

10

Draw the DAG, find the adjustment set (ggdag & dagitty)

Before any estimation: encode your assumptions as a causal graph, enumerate the backdoor paths from treatment to outcome, and let the graph hand you the minimal set of covariates to adjust for.

Data prep: Encode your assumptions as a DAG · Diagnostic / pre-tests: Enumerate paths; spot backdoors & collide… · Estimation: Minimal sufficient adjustment set · Robustness check: Test the DAG's implications

Adjustment Set · Backdoor Adjustment · DAG

@ggdag · Jun 4, 2026

3 reviews
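
Once the graph in the workflow above has named a sufficient adjustment set Z, the backdoor formula averages the stratum-specific outcome means over the marginal distribution of Z. The following is a minimal plain-Python sketch of that one step, assuming a single discrete confounder and a binary treatment. The function name `backdoor_adjusted_means` and the toy rows are hypothetical, and nothing here uses the ggdag or dagitty API.

```python
from collections import defaultdict


def backdoor_adjusted_means(rows):
    """Return {t: sum_z P(z) * E[Y | T=t, Z=z]} for binary t.

    rows: iterable of (z, t, y) with a discrete confounder z.
    """
    rows = list(rows)
    n = len(rows)
    z_count = defaultdict(int)     # units per stratum
    cell_sum = defaultdict(float)  # outcome total per (stratum, arm)
    cell_n = defaultdict(int)      # units per (stratum, arm)
    for z, t, y in rows:
        z_count[z] += 1
        cell_sum[(z, t)] += y
        cell_n[(z, t)] += 1

    adjusted = {}
    for t in (0, 1):
        total = 0.0
        for z, count in z_count.items():
            if cell_n[(z, t)] == 0:
                raise ValueError(f"no overlap in stratum z={z!r} for t={t}")
            total += (count / n) * (cell_sum[(z, t)] / cell_n[(z, t)])
        adjusted[t] = total
    return adjusted


# Hypothetical toy data: (z, t, y). Treatment adds 1 within each stratum,
# but treated units are concentrated in the high-baseline stratum z=1.
rows = (
    [(0, 0, 0.0)] * 3 + [(0, 1, 1.0)]
    + [(1, 0, 10.0)] + [(1, 1, 11.0)] * 3
)

treated = [y for _, t, y in rows if t == 1]
control = [y for _, t, y in rows if t == 0]
naive_diff = sum(treated) / len(treated) - sum(control) / len(control)

adjusted = backdoor_adjusted_means(rows)
adjusted_diff = adjusted[1] - adjusted[0]
print(naive_diff, adjusted_diff)
```

On these toy rows the naive difference in means is 6.0, while the adjusted contrast recovers the within-stratum effect of 1.0. The sketch only works when every stratum contains both arms, which is why it raises on missing overlap.
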
⌥ 4 steps ⑂ 1 branch Index: 195 86 peers

10

Confounder-adjusted survival curves for a treatment (adjustedCurves)

Compare survival between treatment groups after removing confounding — via IPTW, the g-formula or AIPW — instead of a raw Kaplan-Meier that quietly bakes in selection.

Data prep: Time-to-event, treatment, confounders · Estimation: Adjust for confounding (IPTW / g-formula) · Inference: Curves with confidence bands · Reporting: Summaries: RMST difference, survival at t

Adjusted Curves · Survival · Time-to-Event

@adjustedcurves · Jun 4, 2026

3 reviews
⌥ 4 steps ⑂ 1 branch Index: 207 86 peers

11

Mendelian randomization: genes as instruments for a causal effect (TwoSampleMR)

Use genetic variants as instruments to estimate the causal effect of an exposure on an outcome from GWAS summary data — with IVW plus pleiotropy-robust MR-Egger and weighted-median checks.

Data prep: Harmonise SNP–exposure & SNP–outcome effe… · Diagnostic / pre-tests: Check instrument strength · Estimation: Inverse-variance weighted estimate · Robustness check: Pleiotropy-robust: MR-Egger, weighted med…

MR-Egger · Mendelian Randomization · Pleiotropy

@twosamplemr · Jun 4, 2026

3 reviews
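
The inverse-variance weighted step in the workflow above can be written out by hand: each SNP gives a ratio estimate (outcome effect divided by exposure effect), and the ratios are averaged with weights equal to the squared exposure effect over the outcome variance. Below is a minimal plain-Python sketch of the fixed-effect IVW estimator. It does not call TwoSampleMR, and the summary statistics are invented for illustration.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and standard error from per-SNP summaries."""
    # Weight for SNP j: beta_exposure_j^2 / se_outcome_j^2
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    # Per-SNP Wald ratio: beta_outcome_j / beta_exposure_j
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)


# Hypothetical harmonised summary statistics for three SNPs,
# built so that every ratio equals 0.5.
beta_exposure = [0.1, 0.2, 0.4]
beta_outcome = [0.05, 0.1, 0.2]
se_outcome = [0.01, 0.01, 0.02]

ivw_beta, ivw_se = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
print(ivw_beta, ivw_se)
```

Because all three invented ratios agree, the estimate sits at 0.5. With real data the ratios disagree, and that disagreement is what the pleiotropy-robust checks in the last step examine.
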
⌥ 4 steps ⑂ 1 branch Index: 207 93 peers

11

Bayesian regression discontinuity with credible intervals (CausalPy)

Fit a model on each side of the cutoff, put a posterior on the jump, and report a credible interval for the discontinuity — plus an honest look at how it moves with the bandwidth.

Data prep: Running variable, threshold, outcome · Estimation: A Bayesian model each side of the cutoff · Inference: Posterior & 94% credible interval for the… · Robustness check: Bandwidth & functional-form sensitivity

Bayesian · Credible Interval · Regression Discontinuity

@causalpy · Jun 4, 2026

3 reviews
⌥ 4 steps ⑂ 1 branch Index: 207 21 peers

11

Synthetic control, the tidy way — weights, gaps and placebo inference (tidysynth)

Build a synthetic version of the treated unit from a convex blend of donors, read the treated-minus-synthetic gap, and test it against placebos run on every donor.

Data prep: Panel, treated unit, donor pool · Estimation: Solve for donor & predictor weights · Reporting: Read the gap: observed − synthetic · Inference: Placebo permutation across donors

Donor Pool · Placebo Test · Synthetic Control

@tidysynth · Jun 4, 2026

3 reviews
⌥ 4 steps ⑂ 1 branch Index: 182 16 peers

11

Uplift modelling with S-, T-, X- and R-learners (CausalML)

Estimate who responds, not just the average: fit a family of meta-learners for the CATE, pick the best by validation error, then rank and target with an uplift curve.

Data prep: Outcome, treatment, features · Estimation: Fit a family of meta-learners · Heterogeneity: Compare learners; choose by validation · Reporting: Targeting: uplift / Qini gain

Heterogeneous Effects · Meta-Learners · Uplift Modeling

@causalml · Jun 4, 2026

2 reviews
⌥ 4 steps ⑂ 1 branch Index: 219 99 peers

12

Heterogeneous effects with causal-forest double ML (EconML)

Double machine learning with a forest final stage: partial out nuisance with flexible learners, then read the conditional effect τ(x) — with valid confidence intervals.

Data prep: Split features: effect-modifiers X vs con… · Estimation: Partial out nuisance (Neyman-orthogonal D… · Heterogeneity: Forest-weighted local effect τ(x) · Inference: Confidence intervals for τ(x)

CATE · Double ML · Heterogeneous Effects

@econml · Jun 4, 2026

3 reviews
⌥ 4 steps ⑂ 1 branch Index: 219 104 peers

12

Model, identify, estimate, refute — the DoWhy four-step recipe (DoWhy)

Make your assumptions explicit: draw a causal graph, identify the estimand by the backdoor criterion, estimate it, then actively try to refute it with placebo and confounding tests.

Data prep: Model — encode the causal graph · Diagnostic / pre-tests: Identify — apply the backdoor criterion · Estimation: Estimate — adjust for the backdoor set · Robustness check: Refute — placebo & unobserved-confounder …

Backdoor Adjustment · Propensity Score · Refutation

@dowhy · Jun 4, 2026

3 reviews
⌥ 4 steps ⑂ 1 branch Index: 108 22 peers

9

Causal mediation: natural direct & indirect effects (CMAverse)

Split a total effect into what flows through a mediator (indirect) and what doesn't (direct) — with a sensitivity analysis for mediator–outcome confounding.

Data prep: Treatment, mediator, outcome, confounders · Estimation: Fit mediator & outcome models · Inference: Decompose the total effect · Robustness check: Sensitivity to mediator–outcome confoundi…

Mediation · NDE · NIE

@cmaverse · Jun 4, 2026

0 reviews
⌥ 4 steps ⑂ 1 branch Index: 96 21 peers

8

Synthetic difference-in-differences (synthdid)

Reweight both control units and pre-periods to build a synthetic control, then apply a DiD correction — robust where plain SC or TWFE struggle.

Data prep: Balanced panel + treated block · Estimation: Solve for unit & time weights · Inference: Placebo / jackknife standard errors · Reporting: Plot trajectories & the gap

Synthetic Control · Synthetic DiD

@synthdid · Jun 4, 2026

0 reviews
⌥ 4 steps ⑂ 1 branch Index: 108 33 peers

9

Sharp regression discontinuity with robust bias correction (rdrobust)

Identify the effect at a cutoff: a local-polynomial RD with an MSE-optimal bandwidth and robust, bias-corrected confidence intervals.

Data prep: Running variable, cutoff, outcome · Diagnostic / pre-tests: rdplot — see the jump · Estimation: Local-linear RD with bias correction · Robustness check: Bandwidth & donut sensitivity

Regression Discontinuity

@rdrobust · Jun 4, 2026

0 reviews
⌥ 4 steps ⑂ 1 branch Index: 96 64 peers

8

Instrumental variables & 2SLS for an endogenous treatment (ivreg)

When treatment is endogenous, an instrument identifies the complier (LATE) effect via two-stage least squares — after you check the instrument is strong.

Data prep: Outcome, endogenous treatment, instrument · Diagnostic / pre-tests: Check instrument strength (first stage) · Estimation: Two-stage least squares (ivreg) · Inference: Interpret as a complier effect (LATE)

Instrumental Variables · LATE

@ivreg · Jun 4, 2026

0 reviews
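
With one instrument and one endogenous regressor, two-stage least squares reduces to the ratio cov(z, y) / cov(z, x), and the first-stage F statistic is the squared t-statistic of the instrument's slope. The sketch below works through both in plain Python on invented data with an unobserved confounder `u`. It is not the ivreg API, and the helper names are hypothetical.

```python
def centered_cross(a, b):
    """Sum of (a_i - mean(a)) * (b_i - mean(b))."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def wald_iv(z, x, y):
    """Single-instrument IV estimate: cov(z, y) / cov(z, x)."""
    return centered_cross(z, y) / centered_cross(z, x)


def first_stage_f(z, x):
    """F statistic for the slope in the first-stage regression of x on z."""
    n = len(z)
    szz = centered_cross(z, z)
    slope = centered_cross(z, x) / szz
    mean_z, mean_x = sum(z) / n, sum(x) / n
    ssr = sum((xi - (mean_x + slope * (zi - mean_z))) ** 2
              for zi, xi in zip(z, x))
    slope_var = (ssr / (n - 2)) / szz
    return slope ** 2 / slope_var


# Hypothetical data: instrument z is balanced against the confounder u,
# x = z + u, and the true effect of x on y is 2 (u adds 3 directly).
base = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
z = [float(zi) for zi, _ in base]
u = [float(ui) for _, ui in base]
x = [zi + ui for zi, ui in zip(z, u)]
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]

ols_slope = centered_cross(x, y) / centered_cross(x, x)
iv_slope = wald_iv(z, x, y)
f_stat = first_stage_f(z, x)
print(ols_slope, iv_slope, f_stat)
```

In this constructed example OLS is pulled up to 3.5 by the confounder, while the IV ratio returns the true slope of 2. The first-stage F is about 98, comfortably above the common rule-of-thumb threshold of 10.
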
⌥ 4 steps ⑂ 1 branch Index: 108 69 peers

9

Design & diagnose a randomized experiment (DeclareDesign)

Specify a study as model–inquiry–data–answer, simulate it, and read its diagnosands — bias, power, coverage — before you run it.

Data prep: Declare the model & potential outcomes · Estimation: Difference-in-means estimator · Inference: Neyman variance & confidence intervals · Diagnostic / pre-tests: Diagnose: bias, power, coverage

Neyman · Randomized Experiment

@declaredesign · Jun 4, 2026

0 reviews
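
The estimation and inference steps of the workflow above pair a difference-in-means estimator with the conservative Neyman variance, s1²/n1 + s0²/n0. Here is a minimal standard-library sketch on invented outcomes. It illustrates the estimator only and does not use DeclareDesign's declaration or diagnosis functions.

```python
import math
import statistics


def difference_in_means(treated, control, z_crit=1.96):
    """Difference in means with the Neyman (conservative) variance estimate."""
    estimate = statistics.mean(treated) - statistics.mean(control)
    # statistics.variance is the sample variance (n - 1 denominator)
    variance = (statistics.variance(treated) / len(treated)
                + statistics.variance(control) / len(control))
    se = math.sqrt(variance)
    return estimate, se, (estimate - z_crit * se, estimate + z_crit * se)


# Hypothetical outcomes from a small two-arm experiment
treated_outcomes = [5, 7, 9, 7]
control_outcomes = [3, 5, 4, 4]

dim_estimate, dim_se, dim_ci = difference_in_means(treated_outcomes,
                                                   control_outcomes)
print(dim_estimate, dim_se, dim_ci)
```

The normal-approximation interval is a simplification for so small a sample. Simulating the design many times, as the diagnosis step does, is what reveals whether coverage actually holds.
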
⌥ 4 steps ⑂ 1 branch Index: 96 59 peers

8

Distributional effects: potential quantiles & CVaR (DoubleML)

When the tails matter: estimate potential quantiles and the conditional value-at-risk of a treatment with Neyman-orthogonal scores.

Data prep: Build DoubleMLData (y, d, X) · Estimation: Potential quantiles · Estimation: Conditional value-at-risk · Reporting: Plot quantile & CVaR effects

Doubly Robust · Machine Learning

@doubleml · Jun 4, 2026

0 reviews
⌥ 4 steps ⑂ 1 branch Index: 108 73 peers

9

Dose–response with average potential outcomes (DoubleML APO)

For a multi-valued or continuous treatment: estimate E[Y(d)] at each dose and the contrasts between them, all cross-fitted.

Data prep: Declare the multi-valued treatment · Estimation: Average potential outcome at each level · Inference: Contrasts between doses · Reporting: Plot the dose–response curve

Doubly Robust · Machine Learning

@doubleml · Jun 4, 2026

0 reviews
⌥ 3 steps ⑂ 1 branch Index: 108 101 peers

9

Learn an interpretable treatment policy (DoubleML policy tree)

Turn debiased CATEs into a rule: fit a shallow, readable decision tree that maximises the doubly-robust policy value.

Estimation: Orthogonal scores from an IRM · Heterogeneity: Fit a depth-limited policy tree · Reporting: Read the tree & its value

Doubly Robust · Heterogeneous Effects · Machine Learning

@doubleml · Jun 4, 2026

0 reviews
⌥ 4 steps ⑂ 1 branch Index: 120 18 peers

10

Quantile treatment effects of 401(k) eligibility (DoubleML)

Beyond the average: how 401(k) eligibility shifts net financial assets across the whole wealth distribution, estimated orthogonally.

Data prep: Build DoubleMLData (net_tfa, e401, X) · Estimation: Estimate QTEs across the distribution · Inference: Simultaneous confidence bands · Reporting: Plot the QTE curve

Doubly Robust · Machine Learning

@doubleml · Jun 4, 2026

0 reviews
⌥ 5 steps ⑂ 1 branch Index: 132 78 peers

11

Group & conditional effects with DoubleML (GATE / CATE)

Slice the average effect: Group Average Treatment Effects and a CATE surface from a debiased IRM, with simultaneous confidence bands.

Data prep: Build DoubleMLData (y, d, X) · Estimation: Fit an Interactive Regression Model (IRM) · Heterogeneity: Group Average Treatment Effects · Heterogeneity: CATE via a basis expansion · Reporting: Plot with simultaneous bands

Doubly Robust · Heterogeneous Effects · Machine Learning

@doubleml · Jun 4, 2026

0 reviews
⌥ 5 steps ⑂ 1 branch Index: 60 32 peers

5

An observational ATE you can defend (balance → estimate → sensitivity)

My checklist for an observational effect: match, prove balance with cobalt, estimate on the matched sample, then quantify hidden-confounding risk with sensemakr.

Data prep: Treatment, covariates, outcome · Data prep: matchit() — nearest-neighbour matching · Diagnostic / pre-tests: bal.tab() / love.plot() — cobalt · Estimation: Estimate the ATT on matched data · Reporting: sensemakr() — robustness value + contours

Matching · Propensity Score · Sensitivity Analysis

@tianzhuqin · Jun 4, 2026

0 reviews
⌥ 5 steps ⑂ 1 branch Index: 84 94 peers

7

Staggered DiD done three ways (did · did2s · fixest)

My side-by-side for staggered adoption: Callaway–Sant'Anna vs Gardner's two-stage vs Sun–Abraham — do the event studies agree?

Data prep: Build the staggered panel · Estimation: att_gt() — Callaway & Sant'Anna · Estimation: did2s() — Gardner two-stage · Estimation: sunab() — Sun & Abraham (fixest) · Reporting: Overlay the three event studies

DiD · Event Study

@tianzhuqin · Jun 4, 2026

0 reviews
⌥ 4 steps ⑂ 1 branch Index: 183 35 peers

9

Sensitivity analysis for unobserved confounding (sensemakr)

Don't just assume no unobserved confounding — quantify it: robustness value + contour plots benchmarked against your real covariates.

Data prep: Fit the OLS outcome model

▼

Estimation: [sensemakr] Sensitivity to unobserved con…

▼

Reporting: Contour plot — point estimate · Reporting: Contour plot — t-value

Propensity Score · Sensitivity Analysis

@sensemakr · Jun 3, 2026

3 reviews
⌥ 4 steps Index: 183 62 peers

9

Two-stage difference-in-differences (did2s)

Gardner's 2-stage estimator for staggered DiD: residualize on the untreated, then estimate the event study — fast and timing-robust.

Data prep: Staggered panel + relative event time

▼

Estimation: [did2s] Two-stage difference-in-differenc…

▼

Robustness check: Compare to TWFE / CS

▼

Reporting: Event-study plot

DiD · Event Study

@did2s · Jun 3, 2026

3 reviews
⌥ 4 steps Index: 207 45 peers

11

Matching for causal inference (MatchIt)

Preprocess by matching so groups are comparable, check balance, then estimate the effect on the matched sample — design before analysis.

Data prep: Treatment W + covariates X

▼

Estimation: [MatchIt] Matching for causal inference —…

▼

Diagnostic / pre-tests: Assess balance (summary / plot)

▼

Reporting: Estimate the effect on matched data

Matching · Propensity Score

@matchit · Jun 3, 2026

3 reviews
⌥ 5 steps ⑂ 1 branch Index: 244 15 peers

12

Double machine learning for the 401(k) effect (DoubleML)

Effect of 401(k) eligibility on net assets via PLR / IRM / IIVM with cross-fit ML nuisances — four learners, one honest comparison.

Data prep: Build DoubleMLData (y, d, X, z)

▼

Data prep: Choose ML learners for the nuisances

▼

Estimation: [DoubleML] Double/debiased ML — PLR / IRM… · Robustness check: IRM / IIVM cross-checks

▼

Reporting: Coefficient comparison plot

Doubly Robust · Machine Learning

@doubleml · Jun 3, 2026

4 reviews
⌥ 4 steps ⑂ 1 branch Index: 195 79 peers

10

Event-study DiD with Sun & Abraham (fixest)

Fast fixed-effects event study that survives staggered timing — sunab() vs naive TWFE, plotted against the truth.

Data prep: Assemble panel with cohort timing

▼

Estimation: [fixest] Sun & Abraham event study — suna… · Robustness check: Naive TWFE comparison

▼

Reporting: iplot(): SA20 vs TWFE vs truth

DiD · Event Study · Fixed Effects

@fixest · Jun 3, 2026

3 reviews
⌥ 5 steps ⑂ 1 branch Index: 231 101 peers

13

Difference-in-differences with multiple periods (did)

Staggered-adoption DiD done right: group-time ATT(g,t) → event-study / group / calendar aggregations, with honest pre-trends.

Data prep: Build the staggered panel

▼

Estimation: [did] Group-time ATT — att_gt()

▼

Inference: aggte(type = 'dynamic') · Heterogeneity: aggte(type = 'group')

▼

Reporting: ggdid() event-study plot

DiD · Event Study

@did · Jun 3, 2026

3 reviews
⌥ 5 steps ⑂ 1 branch Index: 96 84 peers

8

Qini curves: automatic cost-benefit analysis

From CATEs to a budgeted treatment policy: causal forest → DR scores → cost matrix → maq Qini curve → pick the budget.

Estimation: [GRF] Causal forest

▼

Data prep: Doubly-robust score matrix · Data prep: Cost matrix

▼

Heterogeneity: [GRF] Multi-armed Qini curves (maq)

▼

Reporting: Pick the budget; report the gain

Causal Forest · Heterogeneous Effects

@grf · Jun 2, 2026

0 reviews
⌥ 5 steps ⑂ 1 branch Index: 108 87 peers

9

Smooth signals with a local linear forest

When the conditional mean is smooth: regression forest baseline → ll_regression_forest → tuning → diagnostics.

Estimation: [GRF] Regression forest · Estimation: [GRF] Local linear forest

▼

Diagnostic / pre-tests: Tune λ via cross-validation

▼

Diagnostic / pre-tests: Calibration & boundary plot

▼

Reporting: Side-by-side comparison

Machine Learning · Random Forest

@grf · Jun 2, 2026

0 reviews
⌥ 5 steps Index: 120 98 peers

10

Cross-fold validation of heterogeneity

K-fold cross-fitted CATEs → RATE on out-of-fold priorities → honest verdict on heterogeneity strength.

Data prep: K-fold split (e.g. K = 5)

▼

Estimation: [GRF] Regression forest

▼

Estimation: [GRF] Causal forest

▼

Heterogeneity: [GRF] Rank-weighted ATE — RATE / AUTOC / …

▼

Reporting: TOC curve + bootstrap CI

Causal Forest · Heterogeneous Effects

@grf · Jun 2, 2026

0 reviews
⌥ 6 steps ⑂ 1 branch Index: 132 42 peers

11

Evaluating a causal forest fit

Did the forest actually capture treatment-effect heterogeneity? Calibration → variable importance → BLP → omnibus tests.

Estimation: [GRF] Causal forest

▼

Diagnostic / pre-tests: test_calibration() · Diagnostic / pre-tests: variable_importance() · Heterogeneity: best_linear_projection() · Diagnostic / pre-tests: OOB residual checks

▼

Reporting: Fit-evaluation report

Causal Forest · Heterogeneous Effects

@grf · Jun 2, 2026

0 reviews
⌥ 5 steps ⑂ 1 branch Index: 194 103 peers

12

An introduction to GRF (getting started)

A minimal first-contact recipe: regression forest, quantile forest, and a causal forest on the same data.

Data prep: Assemble X, W, Y; check overlap

▼

Estimation: [GRF] Regression forest · Estimation: [GRF] Quantile forest · Estimation: [GRF] Causal forest

▼

Reporting: OOB predictions & variable importance

Causal Forest · Machine Learning · Random Forest

@grf · Jun 2, 2026

2 reviews
⌥ 3 steps Index: 96 20 peers

8

Estimating ATEs on a new target population

Train a causal forest on the source sample → reweight AIPW to a target population → report transported ATE.

Estimation: [GRF] Causal forest

▼

Inference: [GRF] AIPW average treatment effect

▼

Reporting: Transported ATE + overlap caveats

Causal Forest · Doubly Robust

@grf · Jun 2, 2026

0 reviews
⌥ 5 steps Index: 183 61 peers

9

Policy learning via optimal decision trees

Causal forest → doubly-robust scores → policytree → evaluate policy value → plot the tree.

Estimation: [GRF] Causal forest

▼

Data prep: double_robust_scores()

▼

Estimation: policytree: depth-2 optimal tree

▼

Inference: Evaluate policy value (held-out)

▼

Reporting: Plot the learned decision tree

Causal Forest · Heterogeneous Effects

@grf · Jun 2, 2026

3 reviews
⌥ 5 steps ⑂ 1 branch Index: 170 30 peers

10

Causal forest with time-to-event data (survival)

Censoring check → causal survival forest → RMST-scale AIPW ATE → calibration → report.

Diagnostic / pre-tests: [GRF] Survival forest

▼

Estimation: [GRF] Causal survival forest

▼

Inference: [GRF] AIPW average treatment effect · Diagnostic / pre-tests: test_calibration()

▼

Reporting: RMST difference by subgroup

Causal Forest · Heterogeneous Effects · Survival

@grf · Jun 2, 2026

2 reviews
⌥ 6 steps ⑂ 1 branch Index: 207 23 peers

11

Assessing heterogeneity with RATE (AUTOC & Qini)

Causal forest → train/eval split → RATE with both AUTOC and Qini → TOC plot.

Data prep: [GRF] Regression forest

▼

Estimation: [GRF] Causal forest

▼

Data prep: Train / evaluation split

▼

Heterogeneity: [GRF] Rank-weighted ATE — RATE / AUTOC / … · Heterogeneity: [GRF] Rank-weighted ATE — RATE / AUTOC / …

▼

Reporting: TOC plot + AUTOC/Qini table

Causal Forest · Heterogeneous Effects

@grf · Jun 2, 2026

3 reviews
⌥ 8 steps ⑂ 1 branch Index: 292 48 peers

16

Heterogeneous treatment effects with a causal forest (GRF recipe)

The full GRF HTE playbook: cross-fit nuisances → causal forest → calibration → AIPW ATE → BLP → RATE → policy.

Data prep: [GRF] Regression forest

▼

Estimation: [GRF] Causal forest

▼

Diagnostic / pre-tests: test_calibration() · Inference: [GRF] AIPW average treatment effect · Heterogeneity: best_linear_projection() · Heterogeneity: [GRF] Rank-weighted ATE — RATE / AUTOC / …

▼

Robustness check: Policy learning (policytree)

▼

Reporting: CATE histogram + targeting report

Causal Forest · Doubly Robust · Heterogeneous Effects

@grf · Jun 2, 2026

4 reviews
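
Several of the GRF workflows above report an AIPW average treatment effect. The estimator averages a doubly-robust score per unit: the difference in predicted outcomes plus inverse-propensity-weighted residual corrections. The following plain-Python sketch takes the nuisance predictions as given inputs and is not the grf API. In practice those predictions should come from out-of-fold or out-of-bag fits, and all numbers below are invented.

```python
import math


def aipw_ate(y, w, e_hat, mu0_hat, mu1_hat):
    """AIPW estimate of the ATE and its standard error.

    y: outcomes; w: binary treatment; e_hat: estimated propensities;
    mu0_hat, mu1_hat: predicted outcomes under control and treatment.
    """
    scores = []
    for yi, wi, ei, m0, m1 in zip(y, w, e_hat, mu0_hat, mu1_hat):
        if not 0.0 < ei < 1.0:
            raise ValueError("propensity must lie strictly between 0 and 1")
        score = (m1 - m0
                 + wi * (yi - m1) / ei
                 - (1 - wi) * (yi - m0) / (1.0 - ei))
        scores.append(score)
    n = len(scores)
    ate = sum(scores) / n
    variance = sum((s - ate) ** 2 for s in scores) / (n - 1)
    return ate, math.sqrt(variance / n)


# Hypothetical inputs: four units with a known 50/50 assignment probability
y = [3.0, 1.0, 4.0, 2.0]
w = [1, 0, 1, 0]
e_hat = [0.5, 0.5, 0.5, 0.5]
mu0_hat = [1.5, 1.5, 1.5, 1.5]
mu1_hat = [3.5, 3.5, 3.5, 3.5]

ate, ate_se = aipw_ate(y, w, e_hat, mu0_hat, mu1_hat)
print(ate, ate_se)
```

The estimate stays consistent if either the propensity model or the outcome model is correct, which is the sense in which the cards tag these workflows as doubly robust.
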