Workflow · 5 steps

Cross-fold validation of heterogeneity

Source grf — Athey, Tibshirani & Wager
Summary by StatsDoge

K-fold cross-fitted CATEs → RATE on out-of-fold priorities → honest verdict on heterogeneity strength.

1

Input · what goes in

A single dataset (X, W, Y) — no separate evaluation sample needed.


Format — one row per unit. A covariate matrix X (numeric), a binary treatment W ∈ {0,1}, and an outcome Y.

  X1     X2    X3    W    Y
 0.42  -1.1   0     1   3.10
-0.07   0.6   1     0   1.85
 1.20   0.3   0     1   4.02
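
A toy dataset in this format can be simulated as a quick sanity check. This is a minimal sketch: the data-generating process (an effect of max(X1, 0), a coin-flip treatment) is an illustrative assumption, not taken from the tutorial.

```r
set.seed(123)
n <- 200

# Covariate matrix X (numeric): two continuous columns and one binary column
X <- cbind(X1 = rnorm(n), X2 = rnorm(n), X3 = rbinom(n, 1, 0.5))

# Binary treatment W in {0, 1}, randomized with probability 0.5 (assumption)
W <- rbinom(n, 1, 0.5)

# Outcome Y: baseline X2 plus a heterogeneous effect max(X1, 0) for treated units
tau <- pmax(X[, "X1"], 0)
Y <- X[, "X2"] + tau * W + rnorm(n)

# One row per unit, laid out like the table above
head(cbind(X, W, Y), 3)
```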
2

Pipeline · the recipe


1
Data prep

K-fold split (e.g. K = 5)

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Partition units into K folds; each unit's CATE will be predicted from a forest trained without it.

Reads from the input data · Feeds into #2, #3
Key code
n <- nrow(X)                                # number of units
folds <- sample(rep(1:5, length.out = n))   # random fold label (1..5) per unit, near-equal fold sizes
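
To see what the split produces, the sketch below builds the fold labels for a hypothetical n = 103 and lists the held-out row indices per fold. The names `K` and `held.out` are introduced here for illustration only.

```r
set.seed(1)
n <- 103   # hypothetical sample size, deliberately not a multiple of K
K <- 5

# Recycle 1..K to length n, then shuffle so fold membership is random
folds <- sample(rep(1:K, length.out = n))

# Fold sizes differ by at most one unit
fold.sizes <- table(folds)
print(fold.sizes)

# Row indices held out in each fold; every unit appears in exactly one fold
held.out <- split(seq_len(n), folds)
```

Each element of `held.out` is the set of units whose CATE will be predicted by a forest trained on the remaining folds.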
2
Estimation

[GRF] Regression forest

The core estimate — where the causal quantity itself is computed.

What happens here

Cross-fit the nuisance estimates Y.hat (for E[Y | X]) and W.hat (for E[W | X]) across folds.

Formula
\hat\mu(x)=\mathbb{E}\!\left[\,Y\mid X=x\,\right]
The estimator

Regression forest — Honest non-parametric regression for E[Y|X], with out-of-bag predictions and pointwise CIs.

Reads from #1 · Feeds into #3
Key code
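rf <- regression_forest(X, Y)                          # outcome model E[Y | X]
Y.hat <- predict(rf)$predictions                       # out-of-bag predictions
W.hat <- predict(regression_forest(X, W))$predictions  # propensity E[W | X], also out-of-bag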
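
An explicit fold-by-fold version of the nuisance step might look like the sketch below. It assumes the grf package is installed, uses simulated data (hypothetical), and sets a small `num.trees` only to keep the example fast. It is an illustration of cross-fitting by fold, not necessarily the tutorial's own code.

```r
library(grf)

set.seed(1)
n <- 400
X <- matrix(rnorm(n * 3), n, 3)
W <- rbinom(n, 1, 0.5)
Y <- X[, 2] + pmax(X[, 1], 0) * W + rnorm(n)   # hypothetical outcome model

K <- 5
folds <- sample(rep(1:K, length.out = n))

Y.hat <- numeric(n)
W.hat <- numeric(n)
for (k in 1:K) {
  train <- folds != k
  test <- folds == k
  # Fit each nuisance forest without fold k, then predict on fold k only
  rf.Y <- regression_forest(X[train, ], Y[train], num.trees = 500)
  rf.W <- regression_forest(X[train, ], W[train], num.trees = 500)
  Y.hat[test] <- predict(rf.Y, X[test, ])$predictions
  W.hat[test] <- predict(rf.W, X[test, ])$predictions
}
```

Every entry of `Y.hat` and `W.hat` then comes from a forest that never saw that unit.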
3
Estimation

[GRF] Causal forest

The core estimate — where the causal quantity itself is computed.

What happens here

For each fold, fit a causal forest on the other K−1 folds and predict τ̂(x) for the held-out fold.

Formula
\tau(x)=\mathbb{E}\!\left[\,Y(1)-Y(0)\mid X=x\,\right]
The estimator

Causal forest — Honest random forest for heterogeneous treatment effects — CATE for a binary treatment via GRF moment conditions.

Reads from #1, #2 · Feeds into #4
Key code
cf <- causal_forest(X, Y, W)          # Y.hat and W.hat are fit internally when not supplied
tau.hat <- predict(cf)$predictions    # out-of-bag CATE predictions
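
The per-fold recipe described in this step can also be written as an explicit loop. The sketch below assumes grf is installed and uses simulated data. The name `tau.hat.cf` and the reduced `num.trees` are illustrative choices, not part of the original workflow.

```r
library(grf)

set.seed(2)
n <- 400
X <- matrix(rnorm(n * 3), n, 3)
W <- rbinom(n, 1, 0.5)
Y <- X[, 2] + pmax(X[, 1], 0) * W + rnorm(n)   # hypothetical: effect is max(X1, 0)

K <- 5
folds <- sample(rep(1:K, length.out = n))

tau.hat.cf <- numeric(n)
for (k in 1:K) {
  train <- folds != k
  test <- folds == k
  # Causal forest trained on the other K - 1 folds
  cf.k <- causal_forest(X[train, ], Y[train], W[train], num.trees = 500)
  # CATE predictions for the held-out fold only
  tau.hat.cf[test] <- predict(cf.k, X[test, ])$predictions
}
```

After the loop, `tau.hat.cf` holds one out-of-fold CATE estimate per unit, which is what the next step uses as priorities.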
4
Heterogeneity

[GRF] Rank-weighted ATE — RATE / AUTOC / Qini

Heterogeneity — who is affected, and by how much, not just on average.

What happens here

Compute RATE (AUTOC) using the cross-fitted τ̂ as priorities: every unit contributes, yet no unit is ranked by a model trained on it, so there is no double-dipping.

Formula
\mathrm{AUTOC}=\int_0^1\!\mathrm{TOC}(q)\,dq,\quad \mathrm{TOC}(q)=\mathbb{E}\!\left[\tau(X)\mid \hat S(X)\ge \hat F^{-1}(1-q)\right]-\mathbb{E}[\tau(X)]
The estimator

Rank-weighted ATE — RATE / AUTOC / Qini — Evaluate how well a CATE estimate prioritizes treatment: TOC curve, AUTOC and Qini with confidence intervals.

Reads from #3 · Feeds into #5
Key code
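# eval.forest supplies the doubly robust scores; assumed here to be cf from step 3
eval.forest <- cf
rate <- rank_average_treatment_effect(eval.forest, priorities = tau.hat)  # target defaults to "AUTOC"
plot(rate)                          # TOC curve
rate$estimate / rate$std.err        # AUTOC z-stat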
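
To make the formula concrete, here is a base-R sketch of the empirical TOC curve and AUTOC. It assumes per-unit effect scores are already available. The simulated `scores` and `priority` vectors are hypothetical, and the sketch only illustrates the point estimate; it does not reproduce grf's standard errors.

```r
set.seed(42)
n <- 1000
priority <- rnorm(n)                  # hypothetical priority scores S(X)
scores <- 0.5 * priority + rnorm(n)   # hypothetical noisy per-unit effect scores

# Rank units from highest to lowest priority
ord <- order(priority, decreasing = TRUE)

# TOC at q = i/n: mean score among the top i units minus the overall mean
toc <- cumsum(scores[ord]) / seq_len(n) - mean(scores)

# AUTOC: area under the TOC curve, approximated by averaging over the grid
autoc <- mean(toc)

plot(seq_len(n) / n, toc, type = "l", xlab = "treated fraction q", ylab = "TOC(q)")
abline(h = 0, lty = 2)
```

By construction the curve ends at zero when q = 1, since treating everyone yields exactly the average effect.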
5
Reporting

TOC curve + bootstrap CI

Reporting — turn the numbers into a figure or table a reader can act on.

What happens here

Plot the TOC curve with its confidence band; report whether the AUTOC confidence interval excludes zero.

Reads from #4 · Feeds into the final output
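
The reporting logic can be sketched in base R as follows. This uses a plain nonparametric bootstrap on simulated scores as a stand-in, which is an assumption and not necessarily the resampling scheme grf applies internally. The helper `autoc_hat` and the data are hypothetical.

```r
set.seed(7)
n <- 1000
priority <- rnorm(n)                  # hypothetical priorities
scores <- 0.5 * priority + rnorm(n)   # hypothetical per-unit effect scores

# Empirical AUTOC: average over q of (mean score in the top-q group - overall mean)
autoc_hat <- function(scores, priority) {
  ord <- order(priority, decreasing = TRUE)
  toc <- cumsum(scores[ord]) / seq_along(ord) - mean(scores)
  mean(toc)
}

estimate <- autoc_hat(scores, priority)

# Resample units with replacement and recompute AUTOC each time
B <- 200
boot <- replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)
  autoc_hat(scores[idx], priority[idx])
})

std.err <- sd(boot)
ci <- estimate + c(-1, 1) * qnorm(0.975) * std.err   # normal-approximation 95% CI
z <- estimate / std.err

excludes.zero <- ci[1] > 0 || ci[2] < 0
```

If `excludes.zero` is TRUE, the priorities carry a detectable signal about who benefits more; otherwise the data do not support strong heterogeneity along this ranking.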
3

Output · what you get · 2 figures

Fig 1. A K=5 fold split: each fold is held out in turn so nothing is evaluated on its own training data.
Fig 2. Cross-fold RATE estimates combined across the folds into one honest summary.

Figures reproduced from grf — Athey, Tibshirani & Wager — unofficial community showcase; all credit to the original authors.

An unofficial summary of the GRF 'Cross-fold validation of heterogeneity' tutorial. Estimating RATE by sample splitting spends half the data on training the priorities; K-fold cross-fitting uses every unit for evaluation while keeping each unit's priority out-of-sample.
