Description
Is your feature request related to a problem? Please describe.
When training with multiple datasets, it is currently not possible to configure different graphs per dataset through the standard configuration workflow. The encoder/decoder subgraph configuration is global, which limits flexibility when datasets require different graph structures.
As a workaround, graphs can be pre-generated externally and loaded during training, but this introduces additional complexity.
Describe the solution you'd like
Add native support for per-dataset encoder and decoder subgraph configuration in multi-dataset training.
In addition, refactor the current workflow so that the training pipeline constructs and operates on a single graph instance (HeteroData) rather than a dictionary of graphs per dataset. The per-dataset configuration should be resolved during dataset initialisation, producing a unified graph representation compatible with the existing model and training interfaces.
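To make the request concrete, here is a minimal sketch of how per-dataset settings could be resolved against the global defaults and folded into one graph object. It is only an illustration under assumptions: `SubgraphConfig`, `UnifiedGraph`, `resolve`, `build_unified_graph`, the dataset names and all sizes are hypothetical, and `UnifiedGraph` is a plain-Python stand-in for a single HeteroData instance, not the project's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class SubgraphConfig:
    """Hypothetical subgraph settings; a real config would hold more fields."""
    edge_builder: str
    num_neighbours: int


@dataclass
class UnifiedGraph:
    """Plain-Python stand-in for one heterogeneous graph (e.g. HeteroData)."""
    nodes: dict = field(default_factory=dict)  # node type -> number of nodes
    edges: dict = field(default_factory=dict)  # (src, relation, dst) -> config


def resolve(defaults, overrides, dataset):
    """Pick the dataset's own encoder/decoder config, else the global one."""
    per_dataset = overrides.get(dataset, {})
    return {
        role: per_dataset.get(role, defaults[role])
        for role in ("encoder", "decoder")
    }


def build_unified_graph(dataset_sizes, hidden_size, defaults, overrides):
    """Build a single graph whose node and edge types are keyed by dataset."""
    graph = UnifiedGraph(nodes={"hidden": hidden_size})
    for name, size in dataset_sizes.items():
        cfg = resolve(defaults, overrides, name)
        graph.nodes[name] = size
        graph.edges[(name, "to", "hidden")] = cfg["encoder"]
        graph.edges[("hidden", "to", name)] = cfg["decoder"]
    return graph


# Hypothetical global defaults and one per-dataset override.
defaults = {
    "encoder": SubgraphConfig("knn", 12),
    "decoder": SubgraphConfig("knn", 3),
}
overrides = {"regional_obs": {"encoder": SubgraphConfig("cutoff", 24)}}

graph = build_unified_graph(
    {"global_grid": 40320, "regional_obs": 5000},
    hidden_size=1000,
    defaults=defaults,
    overrides=overrides,
)
```

In this sketch, `regional_obs` gets its own encoder subgraph while its decoder falls back to the global setting, and the training code only ever sees one graph object instead of a dictionary of graphs keyed by dataset.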
Additional context
This functionality is important when combining datasets with different spatial structures or connectivity requirements (e.g. a stretched-grid or limited-area model (LAM) grid combined with regional observations).
Organisation
ECMWF