Tensor size mismatch during GRPO training #3645
Comments
Does the main branch have this issue?
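Yes. Do you know how to solve this problem? It is fairly urgent. I checked my dataset and it should be fine. I am using a custom reward function and do not know how to troubleshoot it.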
Is there an issue with the provided reward function?
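One way to check the reward function in isolation, before starting a training run, is to call it on a few sample completions and confirm that it returns exactly one finite number per completion. The sketch below assumes, without confirmation from this thread, that the size-0 tensor comes from a reward function returning too few values. The names `check_reward_output`, `buggy_reward`, and `fixed_reward` are hypothetical, and the `(completions, **kwargs)` call signature is an assumption that should be adjusted to match the actual reward function.

```python
import math
from typing import Callable, List, Sequence


def check_reward_output(reward_fn: Callable[..., Sequence[float]],
                        completions: List[str], **kwargs) -> List[float]:
    """Call reward_fn once and verify it returns one finite number per completion."""
    rewards = reward_fn(completions, **kwargs)
    if rewards is None:
        raise ValueError("reward function returned None")
    rewards = list(rewards)
    if len(rewards) != len(completions):
        raise ValueError(
            f"expected {len(completions)} rewards, got {len(rewards)}")
    for i, r in enumerate(rewards):
        if not isinstance(r, (int, float)):
            raise TypeError(f"reward {i} is {type(r).__name__}, not a number")
        if not math.isfinite(r):
            raise ValueError(f"reward {i} is not finite: {r}")
    return [float(r) for r in rewards]


# Hypothetical reward function with the suspected bug: it filters out
# completions without the marker, so it can return fewer rewards than inputs.
def buggy_reward(completions, **kwargs):
    return [1.0 for c in completions if "answer" in c]


# Hypothetical corrected version: every completion gets a score.
def fixed_reward(completions, **kwargs):
    return [1.0 if "answer" in c else 0.0 for c in completions]


if __name__ == "__main__":
    samples = ["the answer is 4", "no idea"]
    print(check_reward_output(fixed_reward, samples))
    try:
        check_reward_output(buggy_reward, ["no idea"])
    except ValueError as err:
        print("buggy_reward failed the check:", err)
```

If the check passes on representative samples, including empty or malformed completions, the reward function is less likely to be the source of the mismatch and the dataset or trainer configuration is worth another look.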
Could we discuss this by email?
You can join the WeChat discussion group: #3076
RuntimeError: The expanded size of the tensor (1) must match the existing size (0) at non-singleton dimension 0. Target sizes: [1]. Tensor sizes: [0]
I get this error during GRPO training. How should I resolve it?