fix: apply version filtering and fix supplementary dataset selection in regression analysis #554

Open
lewisjared wants to merge 5 commits into main from fix/regression-versions

@lewisjared
Contributor

Description

Fixes two issues in the dataset selection logic that caused spurious entries in regression analysis results:

  1. Version filtering in dataset queries (datasets/base.py): Added proper version filtering to query_datasets and query_facets methods. Previously, version constraints were silently ignored, causing older dataset versions to leak into results. The version filter supports both exact matching and ordering comparisons (e.g. v20190731 vs v20200101).

  2. Supplementary dataset selection (constraints.py): Fixed AddSupplementaryDataset to pick a single best-matching dataset per group instead of accumulating multiple candidates across score ties. Previously, when multiple supplementary datasets tied on matching score, all were selected, leading to spurious experiment entries (e.g. 1pctCO2, esm-1pct-brch-1000PgC) appearing alongside expected historical and SSP data.
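To illustrate the version filter from item 1, here is a minimal sketch. It is not the code in this PR: `parse_version`, `filter_by_version`, and the `(operator, version)` constraint shape are hypothetical names chosen for the example, and it assumes versions are date-stamped strings such as `v20190731`.

```python
import operator

# Hypothetical helpers for illustration; not the actual climate_ref API.
_OPS = {
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
}


def parse_version(version: str) -> int:
    """Turn a date-stamped version such as 'v20190731' into the integer 20190731."""
    return int(version.lstrip("v"))


def filter_by_version(records: list[dict], constraint: tuple[str, str]) -> list[dict]:
    """Keep only records whose 'version' satisfies constraint = (operator, version)."""
    op, target = constraint
    compare = _OPS[op]
    target_value = parse_version(target)
    return [r for r in records if compare(parse_version(r["version"]), target_value)]


records = [
    {"instance_id": "tas-old", "version": "v20190731"},
    {"instance_id": "tas-new", "version": "v20200101"},
]

# Exact match keeps only the requested version.
exact = filter_by_version(records, ("==", "v20200101"))
# An ordering comparison drops the older version instead of letting it leak through.
newer = filter_by_version(records, (">=", "v20200101"))
```

Comparing the parsed integers rather than the raw strings keeps the ordering correct even if the stamps were ever to differ in length.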

These fixes together eliminate unexpected experiment/version combinations from regression outputs.
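The single-best-match behaviour described for AddSupplementaryDataset can be sketched as follows. This is an illustration under stated assumptions, not the implementation in constraints.py: `match_score` and `pick_supplementary` are hypothetical names, the score is assumed to be a count of matching facets, and the tie-break (newest version, then instance ID) is an assumption made only so the example is deterministic.

```python
from typing import Optional


def match_score(candidate: dict, group_facets: dict, facets: tuple[str, ...]) -> int:
    """Count how many of the given facets agree between candidate and group."""
    return sum(candidate.get(f) == group_facets.get(f) for f in facets)


def pick_supplementary(
    candidates: list[dict], group_facets: dict, facets: tuple[str, ...]
) -> Optional[dict]:
    """Return exactly one candidate per group, even when several tie on score."""
    if not candidates:
        return None
    # Assumed tie-break: highest score, then newest version, then instance_id.
    return max(
        candidates,
        key=lambda c: (match_score(c, group_facets, facets), c["version"], c["instance_id"]),
    )


group = {"source_id": "MODEL-A", "experiment_id": "ssp126", "member_id": "r1i1p1f1"}
candidates = [
    {"instance_id": "areacella.1pctCO2", "source_id": "MODEL-A",
     "experiment_id": "1pctCO2", "member_id": "r1i1p1f1", "version": "v20191115"},
    {"instance_id": "areacella.historical", "source_id": "MODEL-A",
     "experiment_id": "historical", "member_id": "r1i1p1f1", "version": "v20191115"},
]

# Both candidates match on source_id and member_id only, so they tie on score.
# A single dataset is still returned rather than both.
best = pick_supplementary(candidates, group, ("source_id", "experiment_id", "member_id"))
```

Returning one dataset instead of every tied candidate is what keeps unrelated experiments out of the group in this sketch.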

Checklist

Please confirm that this pull request has done the following:

  • Tests added
  • Documentation added (where applicable)
  • Changelog item added to changelog/

@lewisjared
Contributor Author

@bouweandela, what do you think about 1cb40c0?

@lewisjared
Contributor Author

Closes #545 and #543

@codecov

codecov bot commented Feb 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

| Flag | Coverage Δ |
|---|---|
| core | 92.51% <100.00%> (+0.04%) ⬆️ |
| providers | 89.65% <ø> (ø) |


| Files with missing lines | Coverage Δ |
|---|---|
| ...imate-ref-core/src/climate_ref_core/constraints.py | 96.75% <100.00%> (+0.04%) ⬆️ |
| ...kages/climate-ref/src/climate_ref/datasets/base.py | 98.47% <100.00%> (+0.04%) ⬆️ |
| ...kages/climate-ref/src/climate_ref/solve_helpers.py | 98.92% <100.00%> (+0.02%) ⬆️ |

... and 1 file with indirect coverage changes

