
Calculating Welfare Using DR Scores with a Binary Outcome #1492

@j-kawamu

Description


Hello GRF Lab team,

Thank you for developing such a fantastic package. I want to ask about an issue we encountered when using DR scores to calculate welfare with a binary outcome.

Setting:
I'm using a regression discontinuity design (RDD) to study the effect of a health signal, triggered when a diabetes biomarker exceeds a threshold c, on next year's biomarker levels and on mortality. (The average mortality rate in the data is approximately 1%.)

I estimate CATE using lm_forest and compute individual DR scores. I then monetize the DR scores for mortality by multiplying them by the Value of Statistical Life (VSL) to calculate welfare.

Problem:
While the ATE and GATE estimates are reasonable, we encountered an issue with the welfare estimates: the welfare under the status quo policy (assigning treatment to all individuals with biomarker > c, regardless of CATE) is extremely large. It is much higher than the welfare when treatment is assigned only to individuals with both CATE > 0 and biomarker > c, which seems strange and misleading.
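For concreteness, here is a minimal sketch of the welfare comparison described above (the names welfare, Gamma, vsl, biomarker, cate, and the toy numbers are all placeholders, not the actual data; Gamma stands in for the DR scores for mortality, monetized by multiplying by the VSL):

```r
# Hypothetical sketch of the welfare comparison (placeholder names, toy data).
# Gamma: DR scores for mortality; vsl: value of a statistical life.
welfare <- function(assign, Gamma, vsl) {
  mean(assign * vsl * Gamma)  # average monetized DR score under a policy
}

set.seed(1)
n <- 1000
biomarker <- rnorm(n)
cate <- rnorm(n, sd = 0.01)              # stand-in for estimated CATEs
Gamma <- rnorm(n, mean = cate, sd = 0.1) # stand-in for DR scores

c0 <- 0  # stand-in for the biomarker threshold c
status.quo <- as.numeric(biomarker > c0)            # treat everyone above c
targeted   <- as.numeric(biomarker > c0 & cate > 0) # also require CATE > 0

welfare(status.quo, Gamma, vsl = 1e7)
welfare(targeted, Gamma, vsl = 1e7)
```

The puzzle is that the first quantity comes out implausibly large relative to the second.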

Upon examining the AIPW scores, we found that the DR scores are mostly positive in the treatment group (W = 1) and mostly negative in the control group (W = 0). As a result, leaving the policy assignment unchanged yields a much higher welfare estimate, which is misleading. Plots of the scores are shown below.

[Images: distributions of the DR scores in the W = 1 and W = 0 groups]

This phenomenon occurs only with the binary mortality outcome; it does not occur with a continuous outcome such as next-year biomarker levels. The results also look more reasonable if we use IPW instead of AIPW to compute welfare.

Question: Is it appropriate to use DR scores to compute welfare when the outcome is binary? If not, are there alternative approaches or modifications that avoid the issue described above?

While investigating the cause, I found the following:

  • The propensity scores are not extreme.
  • The adjustment term in the AIPW scores dominates the CATE term.
  • This happens because the nuisance prediction Y.hat is large relative to the binary outcome (99% of which is 0), which makes the residual, and hence the adjustment term, large.
  • Furthermore, the adjustment term changes sign between the treatment (W = 1) and control (W = 0) groups, creating a large gap in AIPW scores between the two groups.
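The last two points can be illustrated with toy numbers (not the actual data, and assuming standard binary-treatment AIPW debiasing weights (W - e) / (e(1 - e))): for the roughly 99% of observations with Y = 0, the residual is approximately -Y.hat, so the adjustment term flips sign between the two arms and can dwarf a small CATE term. Whether it comes out positive in the treated arm depends on the fitted residuals; the point is the sign flip and the relative magnitude.

```r
# Toy numbers: adjustment term of an AIPW score with a rare binary outcome.
e       <- 0.5    # propensity score, not extreme
Y.hat   <- 0.01   # outcome-model prediction near the 1% event rate
tau.hat <- 0.002  # a small CATE estimate
residual <- 0 - Y.hat  # a typical observation: Y = 0

w1 <- (1 - e) / (e * (1 - e))  # debiasing weight when W = 1:  1/e
w0 <- (0 - e) / (e * (1 - e))  # debiasing weight when W = 0: -1/(1 - e)

adj1 <- w1 * residual  # adjustment term in the W = 1 arm: -0.02
adj0 <- w0 * residual  # adjustment term in the W = 0 arm:  0.02

abs(adj1) / abs(tau.hat)  # 10: the adjustment dominates the CATE term
```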

For reference, I compute the DR scores for the lm_forest as follows.

lm_get_scores <- function(forest,
                          subset = NULL,
                          debiasing.weights = NULL,
                          num.trees.for.weights = 500,  # unused in this version
                          ...) {
  subset <- grf:::validate_subset(forest, subset)
  W.orig.1 <- forest$W.orig[subset, 1]
  W.hat.1 <- forest$W.hat[subset, 1]
  W.orig.2 <- forest$W.orig[subset, 2]
  W.hat.2 <- forest$W.hat[subset, 2]
  Y.orig <- forest$Y.orig[subset]
  Y.hat <- forest$Y.hat[subset]
  tau.hat.pointwise.1 <- predict(forest)$predictions[subset, 1, ]
  tau.hat.pointwise.2 <- predict(forest)$predictions[subset, 2, ]

  if (is.null(debiasing.weights)) {
    # Default for a binary first arm: standard AIPW weights (W - e) / (e (1 - e)).
    debiasing.weights <- (W.orig.1 - W.hat.1) / (W.hat.1 * (1 - W.hat.1))
  }

  # Residual after removing the estimated conditional mean of Y given X and W.
  Y.residual <- Y.orig - (Y.hat +
                            tau.hat.pointwise.1 * (W.orig.1 - W.hat.1) +
                            tau.hat.pointwise.2 * (W.orig.2 - W.hat.2))

  # DR score for arm 1: CATE estimate plus debiasing correction.
  tau.hat.pointwise.1 + debiasing.weights * Y.residual
}
