Conversation

@gabrieloks
Contributor

Description

Working fix for loss filtering with target variables

What problem does this change solve?

What issue or task does this change relate to?

Additional notes

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines on these, please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.

@HCookie changed the title from "[FIX] fix: filteringlosswrapper" to "fix: filteringlosswrapper with target variables" on Jan 28, 2026
@HCookie moved this from To be triaged to Now In Progress in Anemoi-dev on Jan 28, 2026
Comment on lines +870 to +872
if y_postprocessed.shape[3] != y_pred_postprocessed.shape[3]:  # maybe 3 shouldn't be hardcoded
    y_postprocessed = y_postprocessed[..., self.mask_target]
Member

Definitely not. Is it to index the variable dimension? If so, we have an enum for this.

Contributor

The var dim could also just be -1; that one rarely changes.
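For illustration only, here is a minimal sketch of the two options discussed above: a named dimension constant instead of the hardcoded 3, versus plain -1 indexing. The OutputDim enum and the standalone tensors are hypothetical, not the actual Anemoi enum or code:

```python
from enum import IntEnum

import torch


class OutputDim(IntEnum):
    """Hypothetical names for the dimensions of a (batch, time, grid, variable) output tensor."""
    BATCH = 0
    TIME = 1
    GRID = 2
    VARIABLE = 3  # last dimension; equivalently -1


y_postprocessed = torch.randn(2, 1, 10, 7)       # target with 7 variables
y_pred_postprocessed = torch.randn(2, 1, 10, 5)  # prediction with 5 variables
mask_target = torch.tensor([0, 1, 2, 3, 4])      # predicted variables' positions in the target

# Option 1: a named dimension instead of a hardcoded 3
if y_postprocessed.shape[OutputDim.VARIABLE] != y_pred_postprocessed.shape[OutputDim.VARIABLE]:
    y_postprocessed = y_postprocessed[..., mask_target]

# Option 2: plain -1, since the variable dimension is last and rarely moves
if y_postprocessed.shape[-1] != y_pred_postprocessed.shape[-1]:
    y_postprocessed = y_postprocessed[..., mask_target]
```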

@OpheliaMiralles
Contributor

I am not sure I'm a fan of inserting an additional indexing layer when all losses can, by default, be wrapped with the indices predicted by the model. If this is extra flexibility to compare predicted variables against any variable in the input data, I think it should be embedded in the way we compute the loss rather than added as a quick fix at the task level. Are you sure there is no way to make it work closer to the original PR?

@gabrieloks
Contributor Author


Thanks for the feedback. I understand the concern about quick fixes at the task level, but I think having a working solution with good test coverage is a reasonable first step. It makes any deeper refactoring, if needed, safer to attempt in the future.

The issue isn't really limited to the losses; it's more about whether the target gets reduced from the start or not. The architecture seems to have been designed with early reduction in mind, and several things break without it. I already had to fix the scalers and normalizers to handle the potential shape discrepancy between targets and predictions when target-only variables are used, so I preferred to stay on the safe side rather than change the core behaviour. For now I'd rather not change how the targets are reduced, but if we can show that the scalers, normalizers and validation work without big changes even when the targets are not reduced, I'm happy to go that way. I'd prefer a more surgical approach here: the idea was to fix the filtering while minimising changes everywhere else.

The reindexing isn't really an additional layer; it just accounts for how the target variables get reduced via data_indices.data.output.full. Original indices like [0, 1, 2, 3, 8, ...] need to be remapped after reduction to sequential positions [0, 1, 2, 3, 4, ...] so they match the actual tensor shape. That's all it does.
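A tiny sketch of the remapping described above; the names full_output_indices and remap_to_reduced are illustrative, not the actual helpers in the codebase:

```python
# After reduction via data_indices.data.output.full, the kept variables sit at
# sequential positions in the tensor, so original indices must be remapped.
full_output_indices = [0, 1, 2, 3, 8]  # e.g. the variables kept in the output
position_of = {orig: pos for pos, orig in enumerate(full_output_indices)}


def remap_to_reduced(original_indices):
    """Map original variable indices to their positions in the reduced tensor."""
    return [position_of[i] for i in original_indices]


# Variables the loss filter refers to by original index ...
print(remap_to_reduced([0, 3, 8]))  # ... become positions [0, 3, 4] in the reduced tensor
```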

The tests I added document the expected behaviour, but I'm open to discussion. And if you think there's an easier way of doing this that I might have missed, feel free to go with it.

@OpheliaMiralles
Contributor


Of course I didn't mean to be dismissive of your changes, I'm actually glad someone cares about this :). I'd just be grateful if you could send me the exact config you're using, in a flattened format, by DM on Slack, and I'll try to reproduce the issue and see why wrapping at the loss level is trickier than it looks (I can use different data, no problem). If your changes really are more minimal and they pass, we'll go your way. Is that OK for you?

@gabrieloks
Contributor Author

gabrieloks commented Jan 29, 2026


Yeah, sure :) I committed the batch of tests I have working on this second branch to the original fix/filtering... branch. You could argue that most of the tests are secondary, except test_forward_multiple_vars_and_target_only, which is really similar to what I'm trying to do with era5+imerg. In that test most of the asserts assume that the target is reduced, but those can be commented out. The most important part is getting the final assert on the value of the loss right.
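For readers without the branch at hand, a rough, self-contained sketch of the kind of equivalence that final assert is checking; names, shapes and the MSE loss are invented for illustration and are not the real Anemoi test:

```python
import torch


def test_loss_with_target_only_variables():
    """Illustrative only: when the raw target carries extra target-only variables,
    filtering inside the loss should give the same value as reducing the target up front."""
    torch.manual_seed(0)
    y_pred = torch.randn(2, 1, 10, 5)            # model predicts 5 variables
    y_full = torch.randn(2, 1, 10, 7)            # raw target has 2 extra target-only variables
    mask_target = torch.tensor([0, 1, 2, 3, 4])  # predicted variables' positions in the target

    # Path 1: target reduced before it reaches the loss (what most existing asserts assume)
    loss_reduced = torch.nn.functional.mse_loss(y_pred, y_full[..., mask_target])

    # Path 2: full target handed to the loss, filtering applied inside it
    y_masked = y_full[..., mask_target] if y_full.shape[-1] != y_pred.shape[-1] else y_full
    loss_filtered = torch.nn.functional.mse_loss(y_pred, y_masked)

    assert torch.allclose(loss_filtered, loss_reduced)
```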
