Tensor size mismatch during GRPO training #3645

xdlzr · 2025-03-25T03:08:20Z

RuntimeError: The expanded size of the tensor (1) must match the existing size (0) at non-singleton dimension 0. Target sizes: [1]. Tensor sizes: [0]

I get this error during GRPO training. How can I resolve it?

hjh0119 · 2025-03-25T03:38:17Z

Does the main branch have this issue?

xdlzr · 2025-03-25T03:46:39Z

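Yes. Do you know how to solve this? It is fairly urgent. I have checked my dataset and it should be fine. I am using a custom reward function, but I don't know how to debug it.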

hjh0119 · 2025-03-25T05:46:10Z

Is there an issue with the provided reward function?

xdlzr · 2025-03-25T06:01:21Z

Could we discuss this over email?

hjh0119 · 2025-03-25T06:16:02Z

You can join the WeChat discussion group: #3076

hjh0119 closed this as completed Mar 25, 2025
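
The thread does not record a root cause. The error message itself says that an empty tensor (size [0]) was written into a one-element slot (size [1]). One possible cause, not confirmed here, is a custom reward function that returns fewer rewards than it receives completions. The sketch below is a minimal way to check a reward function outside of training. It assumes the function takes a list of completion strings plus keyword arguments and returns one number per completion; `check_reward_output`, `buggy_reward`, and `fixed_reward` are hypothetical names for illustration and are not part of any library.

```python
import math
from typing import Callable, List, Sequence


def check_reward_output(
    reward_fn: Callable[..., Sequence[float]],
    completions: List[str],
    **kwargs,
) -> List[float]:
    """Call reward_fn and verify it returns one finite number per completion."""
    rewards = reward_fn(completions, **kwargs)
    if rewards is None:
        raise ValueError("reward function returned None")
    rewards = list(rewards)
    if len(rewards) != len(completions):
        raise ValueError(
            f"expected {len(completions)} rewards, got {len(rewards)}"
        )
    for i, r in enumerate(rewards):
        if not isinstance(r, (int, float)) or not math.isfinite(r):
            raise ValueError(f"reward at index {i} is not a finite number: {r!r}")
    return [float(r) for r in rewards]


# Hypothetical buggy reward: it filters the completions, so a batch with
# no matching completion yields an empty list instead of one reward each.
def buggy_reward(completions, **kwargs):
    return [1.0 for c in completions if "answer" in c]


# Hypothetical fix: always emit exactly one reward per completion.
def fixed_reward(completions, **kwargs):
    return [1.0 if "answer" in c else 0.0 for c in completions]


if __name__ == "__main__":
    batch = ["no match here"]
    print(check_reward_output(fixed_reward, batch))
    try:
        check_reward_output(buggy_reward, batch)
    except ValueError as exc:
        print("buggy_reward failed the check:", exc)
```

Running the check on a few hand-written completions, including edge cases such as empty strings or outputs that fail to parse, can show whether the reward function is a plausible source of the size mismatch before looking elsewhere.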
