Versioning issues can cause error messages of the type ```undefined symbol```
1000
-
and such. For these, refer to the [versioning issues document](https://github.com/pytorch/rl/blob/main/knowledge_base/VERSIONING_ISSUES.md)
1001
-
for a complete explanation and proposed workarounds.
1002
-
1003
-
## Asking a question
1004
-
1005
-
If you spot a bug in the library, please raise an issue in this repo.
1006
861
1007
-
If you have a more generic question regarding RL in PyTorch, post it on
1008
-
the [PyTorch forum](https://discuss.pytorch.org/c/reinforcement-learning/6).
862
+
### Key Parameters
1009
863
1010
-
## Contributing
864
+
- `--dataset`: Currently supports 'gsm8k' and 'ifeval'
865
+
- `--model_name`: Any HuggingFace model name
866
+
- `--num_envs`: Number of parallel environments
867
+
- `--steps_per_batch`: Steps to collect per batch
868
+
- `--optim_batch_size`: Batch size for optimization
869
+
- `--epochs`: Number of epochs per batch collection
870
+
- `--repeats`: Number of action repeats for GRPO
871
+
- `--lr`: Learning rate
872
+
- `--kl_coef`: KL penalty coefficient
873
+
- `--compile`: Enable `torch.compile()` for the loss function
874
+
- `--clip_grad_norm`: Gradient norm clipping value
875
+
- `--gpu_memory_utilization`: GPU memory utilization for vLLM
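
As a rough illustration of how the flags above might be declared, here is a minimal `argparse` sketch. The types and default values are assumptions made for this example (the list does not state them), `build_parser` is a hypothetical helper rather than part of the script, and `some-org/some-model` is a placeholder model name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed above.

    All types and defaults below are illustrative assumptions.
    """
    parser = argparse.ArgumentParser(description="GRPO training flags (sketch)")
    parser.add_argument("--dataset", choices=["gsm8k", "ifeval"], default="gsm8k")
    parser.add_argument("--model_name", type=str, required=True)
    parser.add_argument("--num_envs", type=int, default=4)
    parser.add_argument("--steps_per_batch", type=int, default=64)
    parser.add_argument("--optim_batch_size", type=int, default=4)
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--repeats", type=int, default=16)
    parser.add_argument("--lr", type=float, default=1e-5)
    parser.add_argument("--kl_coef", type=float, default=1e-2)
    # Assumed to be a boolean switch; the real script may parse it differently.
    parser.add_argument("--compile", action="store_true")
    parser.add_argument("--clip_grad_norm", type=float, default=0.5)
    parser.add_argument("--gpu_memory_utilization", type=float, default=0.5)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(
    ["--model_name", "some-org/some-model", "--dataset", "ifeval"]
)
print(args.dataset, args.compile)
```

Restricting `--dataset` with `choices` makes an unsupported dataset name fail at parse time rather than partway through training.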
1011
876
1012
-
Internal collaborations to torchrl are welcome! Feel free to fork, submit issues and PRs.
1013
-
You can check out the detailed contribution guide [here](https://github.com/pytorch/rl/blob/main/CONTRIBUTING.md).
1014
-
As mentioned above, a list of open contributions can be found [here](https://github.com/pytorch/rl/issues/509).
877
+
## Hardware Requirements
1015
878
1016
-
Contributors are encouraged to install [pre-commit hooks](https://pre-commit.com/) (using `pre-commit install`). pre-commit checks for linting-related issues when code is committed locally. You can skip the check by appending `-n` to your commit command: `git commit -m <commit message> -n`
879
+
- CUDA-capable GPU with at least 8GB VRAM
880
+
- For multi-GPU setups, the script automatically manages device allocation
1017
881
882
+
## Monitoring
1018
883
1019
-
## Disclaimer
884
+
The training progress is logged to Weights & Biases. Key metrics include:
885
+
- Reward
886
+
- Advantage
887
+
- KL penalty
888
+
- Sequence length
889
+
- Loss metrics (ESS, objective, clip fraction, etc.)
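
For readers unfamiliar with some of these metrics, the sketch below shows textbook definitions of a group-relative advantage, the normalized effective sample size (ESS), and the clip fraction, in plain Python. How the training script actually computes and logs them is not stated here, so the function names, the `clip_epsilon` default, and the normalization choices are all assumptions.

```python
import math


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of responses to the same prompt
    (the usual GRPO-style advantage; assumed, not taken from the script)."""
    mean = sum(rewards) / len(rewards)
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / len(rewards))
    return [(r - mean) / (std + eps) for r in rewards]


def effective_sample_size(log_weights):
    """Normalized ESS in (0, 1]: (sum w)^2 / (n * sum w^2)."""
    shift = max(log_weights)  # subtract the max for numerical stability
    w = [math.exp(lw - shift) for lw in log_weights]
    return sum(w) ** 2 / (len(w) * sum(x * x for x in w))


def clip_fraction(log_weights, clip_epsilon=0.2):
    """Fraction of importance ratios outside [1 - eps, 1 + eps]."""
    ratios = [math.exp(lw) for lw in log_weights]
    return sum(abs(r - 1.0) > clip_epsilon for r in ratios) / len(ratios)


# Log importance ratios log(pi_new / pi_old) for four sampled tokens.
log_w = [0.0, math.log(1.5), math.log(0.5), math.log(1.1)]
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
print(effective_sample_size(log_w), clip_fraction(log_w))
```

An ESS close to 1 and a low clip fraction both suggest the updated policy is still close to the one that generated the data.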
1020
890
1021
-
This library is released as a PyTorch beta feature.
1022
-
BC-breaking changes are likely to happen but they will be introduced with a deprecation
1023
-
warranty after a few release cycles.
891
+
## License
1024
892
1025
-
# License
1026
-
TorchRL is licensed under the MIT License. See [LICENSE](https://github.com/pytorch/rl/blob/main/LICENSE) for details.
893
+
This project is licensed under the MIT License - see the LICENSE file for details.