Workflow·5 steps

Policy learning via optimal decision trees

Source grf — Athey, Tibshirani & Wager
Summary by StatsDoge

Causal forest → doubly-robust scores → policytree → evaluate policy value → plot the tree.

1

Input · what goes in

Per-unit CATEs / doubly-robust scores from a causal forest, plus a cost.


Format — one row per unit. A covariate matrix X (numeric), a binary treatment W ∈ {0,1}, and an outcome Y.

  X1     X2    X3    W    Y
 0.42  -1.1   0     1   3.10
-0.07   0.6   1     0   1.85
 1.20   0.3   0     1   4.02
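A minimal sketch of data in this layout, using base R only. The sample size, the data-generating process, and the "true" effect `tau` are assumptions made up for illustration; they are not part of the original tutorial.

```r
# Hypothetical simulated data in the X / W / Y layout described above.
set.seed(1)
n <- 500
X <- cbind(X1 = rnorm(n), X2 = rnorm(n), X3 = rbinom(n, 1, 0.5))  # numeric covariate matrix
W <- rbinom(n, 1, 0.5)                 # binary treatment, randomized in this sketch
tau <- pmax(X[, "X1"], 0)              # assumed true CATE (illustration only)
Y <- X[, "X2"] + tau * W + rnorm(n)    # observed outcome

# Basic checks before handing the data to an estimator.
stopifnot(is.matrix(X), is.numeric(X), all(W %in% c(0, 1)), length(Y) == nrow(X))
head(data.frame(X, W, Y), 3)
```

Keeping `X` as a numeric matrix (rather than a data frame with factors) matches the format stated above; categorical covariates would need to be encoded numerically first.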
2

Pipeline · the recipe


1
Estimation

[GRF] Causal forest

The core estimate — where the causal quantity itself is computed.

What happens here

Estimate CATEs and the AIPW score for each unit.

Formula
\tau(x)=\mathbb{E}\!\left[\,Y(1)-Y(0)\mid X=x\,\right]
The estimator

Causal forest — Honest random forest for heterogeneous treatment effects — CATE for a binary treatment via GRF moment conditions.

Reads from the input data · Feeds into #2
Key code
library(grf)
cf <- causal_forest(X, Y, W)          # nuisance estimates Y.hat, W.hat are fit internally
tau.hat <- predict(cf)$predictions    # out-of-bag (OOB) CATE estimates
2
Data prep

double_robust_scores()

Data preparation — shapes the raw inputs into what the estimator expects.

What happens here

Build the doubly-robust reward matrix that policytree maximizes over.

Formula
\Gamma_i=\hat\mu_1(X_i)-\hat\mu_0(X_i)+\frac{W_i\,(Y_i-\hat\mu_1(X_i))}{\hat e(X_i)}-\frac{(1-W_i)\,(Y_i-\hat\mu_0(X_i))}{1-\hat e(X_i)}
Reads from #1 · Feeds into #3
Key code
library(policytree)
dr.scores <- double_robust_scores(cf)   # n × K reward matrix, one column per treatment arm
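To make the link between the formula and the reward matrix concrete, here is a base-R sketch that builds per-arm doubly-robust scores by hand. The difference of the two columns reproduces the Γ_i expression above. The simulated data, the linear outcome models, and the known propensity of 0.5 are simplifying assumptions; they stand in for the forest-based nuisance estimates and are not how `double_robust_scores()` computes them internally.

```r
set.seed(2)
n <- 400
x <- rnorm(n)
W <- rbinom(n, 1, 0.5)
Y <- 1 + x + 2 * W + rnorm(n)          # assumed data: true effect is 2 for everyone

# Nuisance estimates (simple stand-ins, not cross-fit forests).
e.hat <- rep(0.5, n)                   # known randomization probability
fit1 <- lm(Y ~ x, subset = W == 1)     # outcome model for treated units
fit0 <- lm(Y ~ x, subset = W == 0)     # outcome model for control units
mu1.hat <- predict(fit1, newdata = data.frame(x = x))
mu0.hat <- predict(fit0, newdata = data.frame(x = x))

# Per-arm doubly-robust scores: one column per action.
Gamma <- cbind(
  control = mu0.hat + (1 - W) * (Y - mu0.hat) / (1 - e.hat),
  treated = mu1.hat + W * (Y - mu1.hat) / e.hat
)

# Treated minus control gives the score in the formula above.
Gamma.diff <- Gamma[, "treated"] - Gamma[, "control"]
mean(Gamma.diff)                       # AIPW estimate of the average treatment effect
```

Storing one column per arm, rather than only the difference, is what lets a policy learner look up the reward of whichever action a rule assigns to each unit.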

3
Estimation

policytree: depth-2 optimal tree

The core estimate — where the causal quantity itself is computed.

What happens here

Learn an interpretable assignment rule that maximizes expected welfare.

Formula
\hat\pi=\arg\max_{\pi\in\Pi}\ \frac1n\textstyle\sum_i\Gamma_i\big(\pi(X_i)\big)
Reads from #2 · Feeds into #4, #5
Key code
library(policytree)
tree <- policy_tree(X, dr.scores, depth = 2)
plot(tree)
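The argmax in the formula can be illustrated with a brute-force search over depth-1 rules on a single covariate. This is only a sketch of the objective under assumed, simulated rewards; the reward matrix `Gamma` and the covariate `x` below are hypothetical, and policytree's own search handles multiple covariates and deeper trees.

```r
set.seed(3)
n <- 300
x <- runif(n, -1, 1)
# Hypothetical reward matrix: treating helps only when x > 0.
Gamma <- cbind(control = rnorm(n, 0, 0.1),
               treated = ifelse(x > 0, 1, -1) + rnorm(n, 0, 0.1))

best <- list(value = -Inf)
for (t in sort(unique(x))) {
  left <- x <= t
  for (left.action in 1:2) {              # 1 = control, 2 = treated
    right.action <- 3 - left.action       # the other action on the right side
    action <- ifelse(left, left.action, right.action)
    value <- mean(Gamma[cbind(seq_len(n), action)])   # (1/n) * sum_i Gamma_i(pi(X_i))
    if (value > best$value) {
      best <- list(value = value, threshold = t,
                   left.action = left.action, right.action = right.action)
    }
  }
}
best
```

Each candidate rule is scored by averaging the reward of the action it assigns to every unit, which is exactly the quantity inside the argmax.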

4
Inference

Evaluate policy value (held-out)

Uncertainty quantification — standard errors, intervals, and aggregation.

What happens here

Estimate the value of the learned rule vs treat-all / treat-none on held-out data.

Formula
V(\pi)=\mathbb{E}\!\left[\,\Gamma(\pi(X))\,\right]
Reads from #3 · Feeds into #5
Key code
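# In-sample value of the learned rule. For a held-out estimate, fit the tree on a training split and index the test-split scores instead.
n <- nrow(X)
pi.hat <- predict(tree, X)                    # action index in 1..K for each unit
mean(dr.scores[cbind(seq_len(n), pi.hat)])    # average reward under the learned rule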
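A held-out comparison against treat-all and treat-none can be sketched in base R as follows. The simulated scores, the 50/50 split, the threshold rule standing in for a fitted tree, and the helper `policy_value()` are all assumptions for illustration. The standard error treats the scores as given and ignores uncertainty from estimating the nuisance functions.

```r
set.seed(4)
n <- 1000
x <- runif(n, -1, 1)
# Hypothetical doubly-robust scores (column 1 = control, column 2 = treated).
Gamma <- cbind(control = rnorm(n), treated = 2 * x + rnorm(n))

train <- sample(n, n / 2)
test <- setdiff(seq_len(n), train)

# Stand-in for a learned rule: pick the threshold using the training half only.
candidates <- seq(-0.9, 0.9, by = 0.1)
train.value <- sapply(candidates, function(t) {
  action <- ifelse(x[train] > t, 2, 1)
  mean(Gamma[train, , drop = FALSE][cbind(seq_along(train), action)])
})
t.hat <- candidates[which.max(train.value)]

# Hypothetical helper: mean reward and standard error for a vector of actions.
policy_value <- function(G, action) {
  r <- G[cbind(seq_len(nrow(G)), action)]
  c(estimate = mean(r), std.err = sd(r) / sqrt(length(r)))
}

# Evaluate on the held-out half only.
G.test <- Gamma[test, , drop = FALSE]
values <- rbind(
  learned    = policy_value(G.test, ifelse(x[test] > t.hat, 2, 1)),
  treat.all  = policy_value(G.test, rep(2, length(test))),
  treat.none = policy_value(G.test, rep(1, length(test)))
)
values
```

Learning the rule on one half and scoring it on the other avoids the optimism of evaluating a rule on the same units that were used to choose it.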
5
Reporting

Plot the learned decision tree

Reporting — turn the numbers into a figure or table a reader can act on.

What happens here

Show the tree and the value comparison.

Reads from #3, #4 · Feeds into the final output
3

Output · what you get

Fig 1. The learned treatment rule: a shallow decision boundary splitting treat vs do-not-treat in covariate space.

Figures reproduced from grf — Athey, Tibshirani & Wager — unofficial community showcase; all credit to the original authors.

The GRF + policytree tutorial: turn CATEs into an interpretable, near-optimal treatment rule and honestly evaluate its value. Unofficial summary; policytree is a separate grf-labs package.

Discussion (2)

  • 4

    Depth-2 trees hit the sweet spot: near-optimal but you can actually explain the rule to ops. The held-out value evaluation keeps you honest.

    5

    The 'evaluate vs treat-all/treat-none' baseline is what convinces stakeholders. Always include it.

  • 5

    Doubly-robust scores as the policytree reward is the key link. Garbage scores → garbage policy.