Sophiex/dev/efficient bilin #1684

sophie-xhonneux · 2026-01-23T17:20:20Z

Description

Make the bilinear layer more memory efficient.

Issue Number

Closes #1683

Draft stage: still needs an option for bias and a way to reset parameters.
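The body of EfficientBilinear is not shown on this page, so the following is only a sketch of one way to bound the intermediate memory of a bilinear layer, including the bias option and parameter reset that the draft note lists as missing. The names ChunkedBilinear and chunk_size and the batch-chunking strategy are assumptions for illustration, not the PR's implementation. The init bound mirrors torch.nn.Bilinear.

```python
import torch


class ChunkedBilinear(torch.nn.Module):
    """Hypothetical stand-in, NOT the PR's EfficientBilinear.

    Computes y[b, o] = sum_ij x1[b, i] * W[o, i, j] * x2[b, j] (+ bias[o])
    for 2D inputs, processing the batch in chunks so that the
    (chunk, out, in2) intermediate stays bounded by chunk_size.
    """

    def __init__(self, in1, in2, out, bias=True, chunk_size=1024):
        super().__init__()
        self.chunk_size = chunk_size  # assumed knob, not from the PR
        self.weight = torch.nn.Parameter(torch.empty(out, in1, in2))
        self.bias = torch.nn.Parameter(torch.empty(out)) if bias else None
        self.reset_parameters()

    def reset_parameters(self):
        # Same uniform bound as torch.nn.Bilinear: 1 / sqrt(in1_features).
        bound = 1.0 / self.weight.shape[1] ** 0.5
        torch.nn.init.uniform_(self.weight, -bound, bound)
        if self.bias is not None:
            torch.nn.init.uniform_(self.bias, -bound, bound)

    def forward(self, x1, x2):
        outputs = []
        for a, b in zip(x1.split(self.chunk_size), x2.split(self.chunk_size)):
            # Contract over i first, then over j, one batch chunk at a time.
            tmp = torch.einsum("bi,oij->boj", a, self.weight)
            y = torch.einsum("boj,bj->bo", tmp, b)
            outputs.append(y if self.bias is None else y + self.bias)
        return torch.cat(outputs, dim=0)


# Check numerical agreement with the reference layer on shared weights.
torch.manual_seed(0)
ref = torch.nn.Bilinear(8, 6, 4)
alt = ChunkedBilinear(8, 6, 4, chunk_size=3)
x1, x2 = torch.randn(10, 8), torch.randn(10, 6)
with torch.no_grad():
    alt.weight.copy_(ref.weight)
    alt.bias.copy_(ref.bias)
    out = alt(x1, x2)
    max_err = (ref(x1, x2) - out).abs().max().item()
```

The memory bound in this sketch applies to the forward pass without autograd. During training, autograd keeps each chunk's intermediate alive for the backward pass, so a real implementation would likely need activation checkpointing or a custom autograd function as well.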

Checklist before asking for review

I have performed a self-review of my code
My changes comply with basic sanity checks:
- I have fixed formatting issues with ./scripts/actions.sh lint
- I have run unit tests with ./scripts/actions.sh unit-test
- I have documented my code and I have updated the docstrings.
- I have added unit tests, if relevant
I have tried my changes with data and code:
- I have run the integration tests with ./scripts/actions.sh integration-test
- (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
- (bigger changes and experiments) I have shared a HedgeDoc in the GitHub issue with all the configurations and runs for these experiments
I have informed and aligned with people impacted by my change:
- for config changes: the MatterMost channels and/or a design doc
- for changes of dependencies: the MatterMost software development channel

clessig · 2026-01-23T17:39:20Z

src/weathergen/model/engines.py

        return torch.cat(outputs, dim=1)


+class EfficientBilinear(torch.nn.Module):


Is there a reason to retain the old implementation?

sophie-xhonneux · 2026-01-23

Not really!

kctezcan and others added 30 commits January 13, 2026 08:31

WIP added a predictor class

bc19177

using the transformer predictor for jepa

f3d81b8

lint

d3fc692

Merge branch 'develop' into ktezcan/dev/iss1587_predictor_jepa

0d23560

added pred_ params in the test config

37145b3

renamed params to sslpred_

3e9372b

merged develop

1dcd781

lint

c845509

Move register & class tokens to be added earlier

fbba830

Enable multiple student views for one target

0ee6213

Merge branch 'develop' into clessig/develop/fix_jepa_1616

d315800

Try to fix batchsize>1

bfeb2ff

Add configs

f33c1c9

Fixed code so that it's running and should be correct up to call to …

ddf3dbe

…ae_aggregation_engine. More checking needed.

Linting

7e27674

Merge branch 'develop' into ktezcan/dev/iss1587_predictor_jepa

3a00e40

added the only jepa config

2048d16

Enable loss term

7426cf1

Fix global positional embeddings

56e8cd9

Merge branch 'develop' of github.com:ecmwf/WeatherGenerator into soph…

68717ee

…iex/dev/include-reg-tokens-in-query-agg-engine

Merge branch 'clessig/develop/fix_jepa_1616' of github.com:ecmwf/Weat…

eae0a81

…herGenerator into sophiex/dev/include-reg-tokens-in-query-agg-engine

Move register tokens to after the chunking loop

c7df8cc

Merge branch 'develop' into sophiex/dev/include-reg-tokens-in-query-a…

ef90e1b

…gg-engine

Lint

a338638

Significantly more memory efficient bilinear layer

b987080

TODO bias flag reset params clean up

Merge branch 'develop' into sophiex/kerem/pr/transformer-head

bca9097

Clean-up config and model create

399a97e

Current status

aa2e34a

Now it works

f0dac8c

Refactored different parts in assimilate_local

a5adb3b

kctezcan and others added 23 commits January 21, 2026 15:07

WIP configs

c09d226

jepa and phys+jepa configs running

99ae2cf

WIP config changes

784cd0c

Merge branch 'ktezcan/sophiex/kerem/pr/transformer-head' into sophiex…

21284c9

…/dev/include-reg-tokens-in-query-agg-engine

Fix config make Teacher not use DDP

66331bf

Fix validation

48f9da6

Fix EMAteacher update for EMATeacher with no DDP

d90a665

Re lengthen the mini epoch

dd58768

Fix EMATeacher in single-GPU mode

b4432a0

Reverting default parameter for num_sample as test case

a4d4573

Linting and removing some old code.

9790ea1

Improved robustness of code and cleaned up interface of init_model_an…

68c74b7

…d_shard() to better support all use cases.

Disabled FSDP

c4ba049

Address comments

82fe54c

Add ddp_find_unused_parameters to physical/jepa cfg

42d59ca

More interface cleanup

0e43227

Merge branch 'sophiex/dev/include-reg-tokens-in-query-agg-engine' of …

0bab6cf

…github.com:ecmwf/WeatherGenerator into sophiex/dev/include-reg-tokens-in-query-agg-engine

Fix small bugs for DINOv2

3caab0e

I think it still hangs in multi-GPU mode even with just DDP :/

Merge branch 'sophiex/dev/include-reg-tokens-in-query-agg-engine' int…

850c75f

…o sophiex/dev/efficient-bilin

testing fine-tuning

b87f8e8

Renamed variable to the standard name in the code

a58fdb3

Fixed support for forcing datasets, i.e. when target channels are emp…

b5829da

…ty or when is_forcing: True is set

Linting

8f7f240

github-project-automation bot added this to WeatherGen-dev Jan 23, 2026

clessig reviewed Jan 23, 2026

sophie-xhonneux and others added 5 commits January 26, 2026 15:26

Clean up Bilinear layer

98f8602

merge develop in -- sophie please note and check this

0e01238

new configs for nppatms and synop finetuning

511d990

merge of trainer.py -- sophie to check

13af060

update nppatms synop ft cfg

e03a438

5 participants
