@vvvdwbvvv
Contributor

Summary

This PR adds support for GLM-4.5 (GLM-4 MoE) models to Liger Kernel (#951).
Model card: https://huggingface.co/zai-org/GLM-4.5. GLM-4.5 shares the same structure as GLM-4.6.

Testing Done

In the fp32 convergence test, the model size can easily lead to OOM. I initially ran the tests on a 4090, but the fp32 run (and only that run) hit OOM, so I moved to an L40S to finish all the tests.

  • Hardware Type:
  • run `make test` to ensure correctness
  • run `make checkstyle` to ensure code style
  • run `make test-convergence` to ensure convergence

skip_logits = self.training and (labels is not None or shift_labels is not None)

if skip_logits:
    loss = LigerForCausalLMLoss(
Contributor

Kindly have a look at the other model examples and adapt this to the new API that returns the metric.
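For readers following the thread, here is a minimal, self-contained sketch of the pattern under discussion: decide whether to skip materializing logits, then unpack both the loss and a metric from the loss call. It is not the Liger Kernel API. `LossResult`, `toy_fused_loss`, and `forward` are hypothetical stand-ins, and treating "the metric" as token accuracy is an assumption; only the `skip_logits` condition is taken from the diff.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class LossResult:
    """Hypothetical container: a loss plus an optional metric."""
    loss: float
    token_accuracy: Optional[float] = None


def should_skip_logits(training: bool, labels, shift_labels) -> bool:
    # Same condition as in the diff: skip logits only when training
    # and some form of labels is available.
    return training and (labels is not None or shift_labels is not None)


def toy_fused_loss(predictions: Sequence[int], targets: Sequence[int]) -> LossResult:
    # Stand-in for a fused loss call (NOT the real library function).
    # Uses the error rate as a toy "loss" and reports token accuracy as the metric.
    correct = sum(p == t for p, t in zip(predictions, targets))
    accuracy = correct / len(targets)
    return LossResult(loss=1.0 - accuracy, token_accuracy=accuracy)


def forward(training, predictions, labels=None, shift_labels=None):
    loss, token_accuracy = None, None
    if should_skip_logits(training, labels, shift_labels):
        targets = shift_labels if shift_labels is not None else labels
        result = toy_fused_loss(predictions, targets)
        # Unpack both fields so the metric reaches the caller
        # instead of being dropped along the way.
        loss, token_accuracy = result.loss, result.token_accuracy
    return loss, token_accuracy
```

The design point is only that the caller receives the metric alongside the loss. The actual signature and return type should be taken from the other model examples in the repository.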

@vvvdwbvvv
Contributor Author

vvvdwbvvv commented Nov 25, 2025

Fixed in 5af9d16
