Issue description
Hi, thank you for your great work and for sharing the implementation of TimeMixer++!
While reading the code, I noticed a possible discrepancy between the paper and the current implementation of the `forecast` function in timemixerpp/backbone.py.
In the paper, the output is defined as:
$$\text{output} = \text{Ensemble}([ \text{Head}_m(x^L_m) ]_{m=0}^M)$$
That is, the final output should be an ensemble (e.g., averaging or weighted sum) of all scale-specific predictions.
However, in the current code:
```python
dec_out_list = []
for i, enc_out in zip(range(len(x_list)), enc_out_list):
    dec_out = self.predict_layers[i](enc_out.permute(0, 2, 1)).permute(0, 2, 1)
    dec_out = self.projection_layer(dec_out)
    dec_out = dec_out.reshape(B, self.n_pred_features, -1).permute(0, 2, 1).contiguous()
    dec_out_list.append(dec_out)
dec_out = self.revin_layers[0](dec_out, mode="denorm") if self.use_norm else dec_out
return dec_out
```
It seems that only the last scale's output (`dec_out`) is used, and the ensemble operation across all scales (`dec_out_list`) is missing.
Could you please clarify:
- Was this simplification intentional (e.g., due to negligible performance gain from ensemble)?
- Or should the implementation actually ensemble all `dec_out_list` items (e.g., mean or weighted sum) to match the paper description? (See the sketch below.)
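
For reference, here is a minimal sketch of what I had in mind, assuming a simple mean over the scale-specific predictions. It reuses the names from the snippet above (`dec_out_list`, `self.revin_layers`, `self.use_norm`); the averaging itself is just one possible choice of ensemble, not something taken from the paper's code.

```python
import torch

# Hypothetical ensemble step: average the scale-specific predictions
# instead of returning only the last scale's output. All entries of
# dec_out_list are assumed to share the shape [B, pred_len, n_pred_features].
dec_out = torch.stack(dec_out_list, dim=0).mean(dim=0)
dec_out = self.revin_layers[0](dec_out, mode="denorm") if self.use_norm else dec_out
return dec_out
```

A learnable weighted sum over scales would also fit the $\text{Ensemble}(\cdot)$ notation; the mean is just the simplest variant.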
Thanks again for the great work and detailed codebase!
Your contribution
Already starred