@Gamenot
Contributor

@Gamenot Gamenot commented May 27, 2025

The previous implementation selected target objects from only the last sub-scene (index -1), which caused cross-contamination between scenes whenever information from `target_object` was used. After the rewards were implemented, RL policies generally learned to reach toward a constant location across the parallel environments, because every environment referenced objects from a single scene.

Improvements:

  • Use the batched RNG, which already scales with `num_envs`.
  • Generate the RNG for target selection only once for all sub-scenes.
  • Select the target object for sub-scene i from the objects of that same sub-scene i.
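
For illustration, the per-scene selection described above can be sketched roughly as follows. This is a minimal NumPy sketch, not the code from this PR: `select_targets`, `objects_per_scene`, and `seeds` are hypothetical names, and the per-environment seeded generators only stand in for the batched RNG.

```python
import numpy as np


def select_targets(objects_per_scene, seeds):
    """Pick one target object per sub-scene (hypothetical helper).

    objects_per_scene[i] holds the objects that belong to sub-scene i.
    seeds holds one integer seed per sub-scene, standing in for a batched RNG.
    """
    num_envs = len(objects_per_scene)
    # Draw the random numbers once: one uniform sample in [0, 1) per sub-scene.
    samples = np.array([np.random.default_rng(seed).random() for seed in seeds])
    targets = []
    for i in range(num_envs):
        scene_objects = objects_per_scene[i]
        # Map the sample to an index into sub-scene i's own object list,
        # instead of always indexing into the last sub-scene (-1).
        idx = int(samples[i] * len(scene_objects))
        targets.append(scene_objects[idx])
    return targets


# Four sub-scenes with different numbers of objects.
scenes = [[f"env{i}_obj{j}" for j in range(3 + i)] for i in range(4)]
targets = select_targets(scenes, seeds=[0, 1, 2, 3])
```

With this shape, `targets[i]` always comes from `scenes[i]`, so any reward or observation computed from the target cannot pick up state from another sub-scene.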

Previous implementation selected from only the last sub-scene causing scene cross contamination if `target_object` was used.
@StoneT2000
Member

Thank you! I will try and review this early next week.
