Hi, thank you for releasing the code.
I have a question about how the reward is computed. In the `compute_reward` function in `videogpt_reward_model.py`, for each transition `image_batch`, `encodings`, and `embeddings` correspond to [...] (when `reward_model_compute_joint` is set to `False`) and the sum from [...] (when `reward_model_compute_joint` is set to `True`), instead of the [...]