implement evaluation routine for SSL #1747

iluise · 2026-01-29T13:54:34Z

Description

Add code for standard inference + evaluation for JEPA/DINOv3 etc.
Usage:

agpu
uv run ssl_analysis --run-id <run id> (optional: --verbose)

Issue Number

Closes #1746

Is this PR a draft? Mark it as draft.

Checklist before asking for review

I have performed a self-review of my code
My changes comply with basic sanity checks:
- I have fixed formatting issues with ./scripts/actions.sh lint
- I have run unit tests with ./scripts/actions.sh unit-test
- I have documented my code and I have updated the docstrings.
- I have added unit tests, if relevant
I have tried my changes with data and code:
- I have run the integration tests with ./scripts/actions.sh integration-test
- (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
- (bigger changes and experiments) I have shared a hedgedoc in the github issue with all the configurations and runs for these experiments
I have informed and aligned with people impacted by my change:
- for config changes: the MatterMost channels and/or a design doc
- for changes of dependencies: the MatterMost software development channel

* add pin mem to IOReaderData
* add pin mem to sample & modelbatch class
* add pin mem to stream data
* add pin mem to training loop
* run /scripts/actions.sh lint
* run ./scripts/actions.sh unit-test
* ignore check torch import in package
* move pinning to MultiStreamDataSampler
* add _pin_tensor & _pin_tensor_list helper func
* ruff the code
* move back pin mem. to train loop
* Remove the ignore-import-error rule and revert to the state before the change
* create protocol for pinnable obj
* remove pin_mem from IOReaderData class
* add pin_memory to Trainer.validate
* remove pin_memory from loader_params
* Rever export/export_inference.py to state before c3fc9a7
* change name
* revise Pinnable class description
* add memory_pinning in config, train & va loop
* use getattr to avoid CICD warning
* use setattr to avoid CICD warning
* disable pylint for self.source_tokens_lens
* Fixed issues with memory pinning due to rebasing and also adjusted config position of flag
* Reverting unadvert changes
---------
Co-authored-by: Javad Kasravi <[email protected]>
Co-authored-by: Javad Kasravi <[email protected]>
Co-authored-by: Javad kasravi <[email protected]>

* split WeatherGenReader functionality to allow reading only JSON adding weathergen JSON reader to develop
* informative error when metrics are not there
* restore JSONreader after rebase
* JSONreader mostly restored
* MLFlow logging independent of JSON/zarr
* linting, properly cheking fsteps, ens, samples in JSONreader
* tiny change to restore the MergeReader
* lint
* enabling JSONreader to skip plots and missing scores gracefully
* required reformatting
* move skipping of metrics to the reader class
* slighly more explicit formulations
---------
Co-authored-by: Sebastian Buschow <[email protected]>
Co-authored-by: Sebastian Buschow <[email protected]>
Co-authored-by: iluise <[email protected]>
Co-authored-by: Ilaria Luise <[email protected]>

* Add target type value error * Remove type * Remove unused code * Commit what shall have been committed * Remove target readout type from config * Add computing stream names to embedding engine --------- Co-authored-by: Christian Lessig <[email protected]>

* add default streams + fix lead time error * update config * update ratio plots and bar plots for single run * fix title * Update config Added support information for forecast_step configuration. --------- Co-authored-by: Savvas Melidonis <[email protected]>

* caching get_shared_wg_path()
* renaming get_path_output to get_path_results
* model and results paths from get_shared_wg_path() and removed _get_config_attribute()
* marking get_shared_wg_path() as private
* removing set_path()
* fixed call to _get_shared_wg_path
* fixed import, code clean-up, change caching decorator
* changed way of caching _get_shared_wg_base_path
* fixed typing error
* changes in Refactor shared WG path handling and model config I/O
  - Simplify get_path_model/get_path_run to always resolve via _get_shared_wg_path()
  - Change _get_shared_wg_path() to cached, argument-free helper returning the shared working dir from private config
  - Adjust model config save/load to build filenames relative to the run’s model directory instead of passing parent paths around
  - Update load_run_config and load_merge_configs to use new path helpers and improve assertion/log messages
  - Replace internal _get_shared_wg_path("results") usages with get_path_run() in wegen_reader and train_logger
* fixed base_path in metrics_path
* fixed forgotten config.general
* fixed lint raised issues
* Improve path handling and add missing docstrings
  - Add docstrings to 10+ utility functions for better documentation
  - Refactor load_run_config to improve path construction logic
  - Move mini_epoch string formatting from _get_model_config_file_read_name to caller for better separation of concerns
  - Add validation for mini_epoch_str format with descriptive error messages
  - Fix multi-line docstring format in _load_private_conf
* fixed line too long
* reverting to previous _get_model_config_file_read_name()
* pretty fix for _get_model_config_file_read_name
* pretty fix for _get_model_config_file_read_name
* removed unused/undefined path

iluise and others added 30 commits January 16, 2026 16:56

latent_space evaluation scripts + propagate verbose

c5c1a05

lint

a4b0f12

add usage

40eccbd

fix log

3ed01fc

Jk/develop/1639 fix shard val forward (#1642)

2f9f125

* rm model_forward assignment in val * rm clutter from diffusion branch * reverse if order

Clessig/develop/fix finetuning 1640 (#1641)

7c4bb82

* Fix bug with diagnostic streams * Avoid that empty decoders are allocated

Sophiex/dev/synop nppatms finetuning configs (#1644)

9144c64

* Doing something wrong * Make fine-tuning work * Rename sensibly

Enable multiple student views for one target for JEPA (#1617)

699a8aa

* Enable multiple student views for one target * Improved readability

Fix test for empty targets in decoder creation (#1646)

88e809d

add regions to integration tests (#1648)

9bdd7d0

Allows for writing normalized samples; fixed config to keep it well-structured (#1653)

15a8c29

add default streams + fix lead time error (#1670)

c054552

* add default streams + fix lead time error * update config * Correct a bug creating aggr issues on scores (#1685) --------- Co-authored-by: Savvas Melidonis <[email protected]>

Update normlise output flag (#1681)

3aeb324

slurm script inference (#1675)

25948f3

* add argument * check stage argument * removed unnecessary code * arbitrary position arguments * Fix error text * get stage info from environment variable. * Update run_train.py --------- Co-authored-by: Simon Grasse <[email protected]>

[infra] consistent cli options (#1668)

b37706c

* replace '_' with '-' * cli options underscore to dash * change underscores to hyphens * rename options in cli unit test

Fix bug for missing run_id path in model path (#1704)

9710f81

fix bar plot (#1698)

fa952ff

Co-authored-by: Savvas Melidonis <[email protected]>

Fix output generation during inference (#1707)

34fa89a

* rename write_num_samples to num_samples * Fixing linting --------- Co-authored-by: Christian Lessig <[email protected]>

backwards compatilble run_id look up (#1715)

d368400

remove misleading logging of mini_epoch (#1679)

e84d8d8

* remove misleading logging of mini_epoch * add forecast_steps logging

Fix duplicate run_id in results and runplots paths (#1716)

cfcad2e

* Fix duplicate run_id in results and runplots paths. Linting. * remove duplicate run_id also from metrics directory * Linting

latent_space evaluation scripts + propagate verbose

16f790d

lint

d0701eb

update latent space eval

49601ff

rebase to develop

9824a54

iluise added 2 commits January 27, 2026 20:39

final version

9518cf5

add readme

4e06d00

iluise self-assigned this Jan 29, 2026

iluise added the eval (anything related to the model evaluation pipeline) label Jan 29, 2026

iluise added this to WeatherGen-dev Jan 29, 2026

iluise added 8 commits January 29, 2026 15:01

Merge branch 'develop' into iluise/develop/eval-latent-space

a24bbde

Update default_config.yml

9a713a8

Update default_forecast_config.yml

a097b3c

Update default_forecast_config.yml

2744792

Update csv_reader.py

74760c8

Update wegen_reader.py

668bd2c

Update plot_utils.py

594efb1

Update utils.py

725e145

iluise changed the title ~~Iluise/develop/eval latent space~~ implement evaluation routine for SSL Jan 29, 2026

iluise and others added 6 commits January 29, 2026 15:14

Update plotter.py

04a4872

lint

a984045

fix verbose

139a2e2

rename to ssl analysis

85bc1b2

lint

1c436bc

Merge branch 'develop' into iluise/develop/eval-latent-space

71ce8e4
