Rank-weighted ATE — RATE / AUTOC / Qini

OTHER · Causal Forest · Heterogeneous Effects
Source grf — Athey, Tibshirani & Wager
Summary by StatsDoge

Evaluate how well a CATE estimate prioritizes treatment: TOC curve, AUTOC and Qini with confidence intervals.

You're looking at a building block — one of the estimators a workflow uses inside its pipeline. You reached it from a workflow step; it's used in 3 workflows (listed below).

Figure: TOC curve with the AUTOC area shaded. Source — grf-labs docs.

⚠️ Unofficial community write-up of a method from grf-labs/grf (pinned at v2.6.1). Not affiliated with the grf-labs authors — this summarizes the public documentation for demonstration. All credit & copyright belong to the original authors (Athey, Tibshirani, Wager, et al.).

What it does

Answers 'is my CATE model actually useful for targeting?' Builds the TOC (Targeting Operator Characteristic) curve and summarizes it as AUTOC or Qini, with confidence intervals from a held-out evaluation forest.
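
For reference, the three quantities named above can be written out as follows. This is a sketch assuming the definitions used in the grf documentation; the symbols S(X) (the priority score, e.g. a CATE estimate), F_S (its distribution function) and q (the fraction of units treated) are notation introduced here, not taken from the summary.

```latex
\begin{align*}
\mathrm{TOC}(q) &= \mathbb{E}\left[ Y(1) - Y(0) \mid S(X) \ge F_S^{-1}(1 - q) \right] - \mathbb{E}\left[ Y(1) - Y(0) \right], \\
\mathrm{AUTOC}  &= \int_0^1 \mathrm{TOC}(q)\, dq, \\
\mathrm{Qini}   &= \int_0^1 q\, \mathrm{TOC}(q)\, dq.
\end{align*}
```

Read this way, TOC(q) is the gain in average treatment effect from treating only the top q fraction by priority instead of a random q fraction. AUTOC weights every q equally, while Qini's extra factor q downweights small treated fractions.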

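library(grf)
# Assumes eval.forest is a causal_forest fit on held-out data and tau.hat holds
# CATE predictions for those same rows from a model trained on other data.
rate <- rank_average_treatment_effect(eval.forest, priorities = tau.hat)
plot(rate)                     # TOC curve
rate$estimate / rate$std.err   # t-statistic for H0: RATE = 0 (ranking no better than random)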
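
A fuller sketch of how the held-out setup might look, following the train/evaluate pattern shown in the grf documentation. The simulated data, the 50/50 split, the seed, and the names train.forest, autoc and qini are illustrative assumptions, not part of the summary above.

```r
library(grf)

set.seed(42)
n <- 2000; p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.5)
# Simulated outcome (assumption): only units with X1 > 0 benefit from treatment.
Y <- pmax(X[, 1], 0) * W + X[, 2] + rnorm(n)

# Split so that the priorities and the evaluation use different data.
train <- sample(1:n, n / 2)
train.forest <- causal_forest(X[train, ], Y[train], W[train])
eval.forest  <- causal_forest(X[-train, ], Y[-train], W[-train])

# Priorities: CATE predictions from the training forest on the evaluation rows.
tau.hat <- predict(train.forest, X[-train, ])$predictions

# Same priorities, two weightings of the TOC curve.
autoc <- rank_average_treatment_effect(eval.forest, tau.hat, target = "AUTOC")
qini  <- rank_average_treatment_effect(eval.forest, tau.hat, target = "QINI")

# Normal-approximation 95% confidence intervals.
autoc$estimate + c(-1.96, 1.96) * autoc$std.err
qini$estimate + c(-1.96, 1.96) * qini$std.err
```

An interval that excludes zero suggests the ranking carries real targeting signal on the held-out data. An interval that covers zero means the priorities cannot be distinguished from a random ordering.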

Why it matters

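A model can score well on a predictive metric such as AUC and still be useless for prioritization, because predicting outcomes well is not the same as ranking units by how much they benefit from treatment. RATE tests that ranking directly. (Yadlowsky et al., JASA 2025.)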

Used in these workflows (3)

Discussion (2)

  • 2

    THE method everyone skips and shouldn't. A high-AUC CATE model can have an AUTOC indistinguishable from zero, i.e. useless for prioritization. Test the thing you actually deploy.

    5

    We caught a 'great' model that was worthless for targeting exactly this way. Saved a campaign.

  • 3

    AUTOC vs Qini choice matters more than people think: AUTOC for concentrated benefit, Qini when you treat a big fraction. Pick before you peek.