[API] M layer design #1870


Open
fkiraly opened this issue Jun 2, 2025 · 4 comments
Labels: API design, enhancement

Comments


fkiraly commented Jun 2, 2025

Issue to discuss the M layer design. For context, see design document here: sktime/enhancement-proposals#39

My current proposed design is based on the v1 and v2 metadata layer designs. Long-term state:

  • starts from the current "metadata" class, but is named like the model, e.g., TFT. The current lightning network is renamed TFT_NN
  • has tags, get_test_params, etc., similar to the current "metadata" class
  • __init__ has all args of the two objects, the loader (D2, e.g., DecoderEncoderModule) and the network (e.g., TFT_NN), minus data
  • the method get_loader_class returns the loader class (e.g., the class DecoderEncoderModule); get_loader(data: TimeSeries) produces a loader object, an instance of the class returned by get_loader_class
  • the method get_nn_class returns the nn class (e.g., TFT_NN); get_nn(loader) produces an instance of the nn class
  • finally, there is a method init(data), which calls the above in sequence and produces a pair of loader and nn, as if the two get methods were called one after the other
  • __call__ dispatches to init (see the sketch below)
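
To make the shape of this concrete, here is a minimal structural sketch, using hypothetical stand-in classes for the D2 loader (DecoderEncoderModule) and the lightning network (TFT_NN); the argument routing is illustrative only, not a finalized API:

class _StubLoader:
    """Stand-in for the D2 loader class, e.g., DecoderEncoderModule."""

    def __init__(self, data, **loader_params):
        self.data = data
        self.loader_params = loader_params


class _StubNN:
    """Stand-in for the lightning network class, e.g., TFT_NN."""

    def __init__(self, **nn_params):
        self.nn_params = nn_params


class TFT:
    """Config class: carries all args of loader and network, minus the data."""

    def __init__(self, loader_params=None, nn_params=None):
        # in the full design, these would be the flattened __init__ args
        # of both the loader and the network
        self.loader_params = loader_params or {}
        self.nn_params = nn_params or {}

    def get_loader_class(self):
        return _StubLoader  # the loader class (D2)

    def get_loader(self, data):
        # loader object: instance of the loader class, from data plus stored args
        return self.get_loader_class()(data, **self.loader_params)

    def get_nn_class(self):
        return _StubNN  # the nn class

    def get_nn(self, loader):
        # nn object; data-derived metadata would be read off the loader here
        return self.get_nn_class()(**self.nn_params)

    def init(self, data):
        # data -> loader -> nn, returned as (net, loader)
        loader = self.get_loader(data)
        return self.get_nn(loader), loader

    __call__ = init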

So, a usage vignette could look like:

from lightning.pytorch import Trainer
from torch import nn

from pytorch_forecasting import TimeSeries
from pytorch_forecasting.metrics import MAE, SMAPE
from pytorch_forecasting.models import TFT

dataset = TimeSeries(...)
model_cfg = TFT(
    max_encoder_length=30,
    max_prediction_length=1,
    batch_size=32,
    loss=nn.MSELoss(),
    logging_metrics=[MAE(), SMAPE()],
    optimizer="adam",
    hidden_size=64,
    num_layers=2,
    attention_head_size=4,
)

net, loader = model_cfg(dataset)

trainer = Trainer(
    max_epochs=5,
    accelerator="auto",
    devices=1,
    enable_progress_bar=True,
    log_every_n_steps=10,
)

trainer.fit(net, loader)

etc

The only things that change for other models are the model class and its args/values for model_cfg.

In sktime, we would add the trainer as an arg to __init__, and sktime's fit(data) would do self.trainer.fit(*self.model_cfg(data)) (with some potential conversion for data - or we could allow TimeSeries as an mtype).
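
A hypothetical sketch of such an sktime wrapper (the class name, its signature, and the data handling are placeholders, not the actual sktime API):

from lightning.pytorch import Trainer


class PytorchForecastingAdapter:
    """Placeholder for an sktime-style forecaster wrapping a model config."""

    def __init__(self, model_cfg, trainer=None):
        self.model_cfg = model_cfg
        self.trainer = trainer if trainer is not None else Trainer(max_epochs=5)

    def fit(self, data):
        # conversion of data to TimeSeries would happen here,
        # unless TimeSeries is allowed as an mtype
        net, loader = self.model_cfg(data)
        self.trainer.fit(net, loader)
        return self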

fkiraly added the enhancement and API design labels on Jun 2, 2025
fkiraly commented Jun 2, 2025

@agobbifbk commented

Nice!!!
Some comments:

  • logging_metrics=[MAE(), SMAPE()] should, in my opinion, be managed by the Trainer via the callbacks=... argument
  • with this design, are we losing the flow 'D1 -> D2 -> M'? It seems that 'M' and 'D2' are created in the same function; could this cause problems for the user? We are mixing model parameters, loss, and optimizer in the same place
  • Maybe we can think of an intermediate solution where, instead of having one metadata class per model, we have a class for a family of models. For example, encoder-decoder models all need some information: past and future steps, number of channels, embedding dimension (categorical variables), output size, and so on, which is collected from the metadata of the D2 layer (a sketch follows at the end of this comment)
  • Your approach can be adapted, but what I'm saying is that TFT can be something more general and reused by different models, where we clearly see which are the model parameters and which are the parameters derived from the D2 layer

Maybe I'm not being clear; we can discuss it further in a dedicated meeting. My feeling is that we can optimize the structure even more while keeping the flow D1 -> D2 -> M clear.
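
For concreteness, a hypothetical sketch of the split between D2-derived metadata and model-specific hyperparameters for the encoder-decoder family (all names are placeholders):

from dataclasses import dataclass, field


@dataclass
class EncoderDecoderMetadata:
    """Information every encoder-decoder model needs, read off the D2 layer."""

    past_steps: int
    future_steps: int
    n_channels: int
    embedding_sizes: dict = field(default_factory=dict)  # categorical variable -> embedding dim
    output_size: int = 1


@dataclass
class TFTParams:
    """Model-specific hyperparameters, set by the user, not derived from D2."""

    hidden_size: int = 64
    num_layers: int = 2
    attention_head_size: int = 4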

fkiraly commented Jun 3, 2025

> with this design, are we losing the flow 'D1 -> D2 -> M'?

Well, my idea has always been that the user-facing flow should optimally just be "data -> model", with that interface unified, the same as in sklearn. That is, the flow is D1 -> D2 -> NN, and M = D2 + NN.

The challenge was that we had to cut the data layer in half to achieve that unification.

fkiraly commented Jun 3, 2025

> Maybe we can think of an intermediate solution where, instead of having one metadata class per model, we have a class for a family of models.

Agreed, though I would make this an additional intermediate base class. Users typically want the "model" (including loader) as a single object.
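
For example, a hypothetical hierarchy (all names are placeholders, not a decided API) could look like:

class BaseModelConfig:
    """Shared config behaviour: get_loader, get_nn, init, __call__."""


class EncoderDecoderConfig(BaseModelConfig):
    """Intermediate base class for the encoder-decoder family, collecting
    the D2-derived metadata those models share."""


class TFT(EncoderDecoderConfig):
    """Concrete config the user instantiates; still yields (net, loader) via __call__."""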

> Your approach can be adapted, but what I'm saying is that TFT can be something more general and reused by different models, where we clearly see which are the model parameters and which are the parameters derived from the D2 layer

Can you give examples of "reuse"?
