Commit a6d2c96

Update
[ghstack-poisoned]
1 parent 315f9f4 commit a6d2c96

1 file changed: +1 -1 lines changed

torchrl/envs/transforms/llm.py

@@ -765,7 +765,7 @@ def _step(
         kl = curr_log_prob - log_prob
         if reward is None:
             reward = 0
-        next_tensordict.set(self.out_keys[0], reward + self.coef * kl)
+        next_tensordict.set(self.out_keys[0], reward - self.coef * kl)
         return next_tensordict
 
     def forward(self, tensordict: TensorDictBase) -> TensorDictBase:
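
The sign flip can be checked with a small standalone sketch. It uses plain floats instead of tensors, and `kl_penalized_reward` is a hypothetical helper, not part of torchrl. It also assumes that `log_prob` in the diff is the reference model's log-probability and `curr_log_prob` is the current policy's, which the diff itself does not state.

```python
def kl_penalized_reward(reward, curr_log_prob, ref_log_prob, coef):
    """Hypothetical helper mirroring the changed line: reward - coef * kl."""
    if reward is None:
        reward = 0.0
    # Log-ratio between the current policy and the (assumed) reference model,
    # computed the same way as `kl` in the diff.
    kl = curr_log_prob - ref_log_prob
    return reward - coef * kl


# The policy gives the token more probability than the reference does,
# so the log-ratio is positive and the shaped reward is lowered.
drifted = kl_penalized_reward(1.0, curr_log_prob=-0.5, ref_log_prob=-1.5, coef=0.1)

# The policy agrees with the reference, so the reward is unchanged.
matched = kl_penalized_reward(1.0, curr_log_prob=-1.5, ref_log_prob=-1.5, coef=0.1)
```

With the previous `+` sign, `drifted` would have come out larger than `matched`, so drifting away from the reference would have been rewarded rather than penalized.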
