Allow me to turn off or control any subsampling done within the quality report

### Problem Description
To improve performance, the [SDMetrics Quality Report](https://docs.sdv.dev/sdmetrics/reports/quality-report/whats-included) may subsample the data before running some metrics. For example, the report currently subsamples larger datasets to 50K rows before running the [ContingencySimilarity metric](https://docs.sdv.dev/sdmetrics/metrics/quality-metrics/contingencysimilarity). Since the subsampling is random, the resulting score is non-deterministic. (Note that with 50K rows, we've verified that the overall score is only affected by a small percentage.)

Nevertheless, it would be good to expose a control that allows the user to toggle the subsampling on/off, especially when they are willing to wait for the full computation and want the full, deterministic score. Alternatively, if subsampling is on, it would be good to control the number of rows to subsample.

### Expected behavior
For single- and multi-table quality reports, each instance should have an attribute that can be modified by the user for subsampling. The attribute should be called: `num_rows_subsample`.
- By default, the attribute should be set to `50000` (50K).
- If the user changes the default, the new value should be used when subsampling the data for any metric's computation.
- If the user sets `num_rows_subsample=None`, then no subsampling should be done.

The attribute should only affect that particular instance of the quality report.

```python
from sdmetrics.reports.single_table import QualityReport

# Set the subsample to 100K rows instead of 50K
report = QualityReport()
report.num_rows_subsample = 100000
report.generate(...)

# Alternatively, turn the subsampling off
report2 = QualityReport()
report2.num_rows_subsample = None
report2.generate(...)
```
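
To make the requested semantics concrete, here is a minimal sketch of how a per-instance `num_rows_subsample` attribute might gate the subsampling. This is not the SDMetrics implementation: `ReportSketch` and `_maybe_subsample` are hypothetical names, and plain Python lists stand in for the tables the real report works with.

```python
import random


class ReportSketch:
    """Hypothetical stand-in for a quality report (not the SDMetrics class)."""

    def __init__(self):
        # Default proposed in this issue: subsample to 50K rows.
        self.num_rows_subsample = 50000

    def _maybe_subsample(self, rows):
        """Return at most ``num_rows_subsample`` rows, or all rows if it is None."""
        limit = self.num_rows_subsample
        if limit is None or len(rows) <= limit:
            # Subsampling is off, or the data is already small enough.
            return rows
        # Random subsample without replacement (the non-deterministic step).
        return random.sample(rows, limit)


# Toy data: 120K rows, larger than the 50K default.
rows = [{'id': i} for i in range(120000)]

default_report = ReportSketch()
full_report = ReportSketch()
full_report.num_rows_subsample = None  # affects only this instance

print(len(default_report._maybe_subsample(rows)))  # 50000
print(len(full_report._maybe_subsample(rows)))     # 120000
```

Because the attribute lives on the instance, changing it on `full_report` leaves `default_report` at the 50K default, which matches the requirement that the setting only affect that particular report.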

### Additional context
Currently, the only metric that is subsampled is `ContingencySimilarity`. However, should we decide to subsample any other metrics in the future, they would use the same value.

Allow me to turn off or control any subsampling done within the quality report #790
