When data augmentation is recommended, should the metrics do the augmentation internally? Or should the user do it beforehand?

### Environment details
* SDMetrics version: 0.21.0

### Background
In certain metrics like [BinaryClassifierPrecisionEfficacy](https://docs.sdv.dev/sdmetrics/metrics/ml-augmentation-metrics/binaryclassifierprecisionefficacy) and [EqualizedOddsImprovement](https://docs.sdv.dev/sdmetrics/metrics/privacy-and-fairness-metrics/equalizedoddsimprovement), the user is generally interested in _augmenting_ the real data with synthetic data. So ultimately these metrics are meant to compare the **real data** with the **augmented data** (aka real + synthetic data).

For these cases, we should decide on a consistent way for the user to input the datasets. Namely, we should decide whether the metric itself should do the augmentation (internally), or whether the user is expected to do it before calling the metric.

### Details

**Alternative A**: The metric should do the augmentation internally. This means that the user would provide the real data and synthetic data individually. Example:
```python
Metric.compute(
  real_data=my_real_dataset,
  synthetic_data=my_synthetic_dataset
)
```

- Pros: The metric will guarantee that the augmentation is done
- Cons: There isn't much flexibility to try out other usages
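
To make Alternative A concrete, here is a minimal sketch of what "augmenting internally" could look like. The function name `compute_with_internal_augmentation` and the `score_fn` callable are hypothetical and only for illustration; they are not part of the SDMetrics API.

```python
from typing import Callable

import pandas as pd


def compute_with_internal_augmentation(
    real_data: pd.DataFrame,
    synthetic_data: pd.DataFrame,
    score_fn: Callable[[pd.DataFrame, pd.DataFrame], float],
) -> float:
    """Hypothetical helper: build the augmented data, then score it against the real data."""
    # The metric owns the augmentation step, so it is guaranteed to happen.
    augmented_data = pd.concat([real_data, synthetic_data], ignore_index=True)
    # score_fn stands in for whatever comparison the metric performs.
    return score_fn(real_data, augmented_data)
```

With this shape the user never builds the augmented dataset, which is also why other usages (for example, scoring the synthetic data on its own) are harder to express.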

**Alternative B**: The user should do the augmentation themselves. The metric then directly compares the two datasets it receives. Example:
```python
import pandas as pd

# Stack the real and synthetic rows; ignore_index avoids duplicate index labels
my_augmented_dataset = pd.concat([my_real_dataset, my_synthetic_dataset], ignore_index=True)

Metric.compute(
  real_data=my_real_dataset,
  augmented_dataset=my_augmented_dataset
)
```

- Pros: The metric is more straightforward to explain
- Cons: We cannot guarantee that the augmentation is done (unless the metric itself checks whether the real dataset is a subset of the augmented dataset, which adds some complex logic)
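
For a sense of what that subset check might involve, here is a rough sketch using pandas. The helper name `real_rows_contained_in` is hypothetical, and the approach (comparing per-row occurrence counts) is only one possible way to do it. It assumes both datasets share the same columns and dtypes, and it does not address cost on large datasets.

```python
import pandas as pd


def real_rows_contained_in(real_data: pd.DataFrame, augmented_data: pd.DataFrame) -> bool:
    """Hypothetical check: does every real row appear in the augmented data at least as often?"""
    columns = list(real_data.columns)
    if columns != list(augmented_data.columns):
        return False

    # Count how many times each distinct row occurs in each dataset.
    real_counts = real_data.groupby(columns, dropna=False).size().reset_index(name='n_real')
    augmented_counts = (
        augmented_data.groupby(columns, dropna=False).size().reset_index(name='n_augmented')
    )

    # Left join keeps every distinct real row; rows missing from the augmented data get NaN.
    merged = real_counts.merge(augmented_counts, on=columns, how='left')
    return bool((merged['n_augmented'].fillna(0) >= merged['n_real']).all())
```

Even this simplified version has to handle duplicate rows and mismatched columns, which illustrates the extra logic the metric would take on under Alternative B.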
