See https://github.com/pytorch-labs/LeanRL. I think we could make our implementations faster by adopting the PyTorch optimizations it demonstrates, such as `torch.compile` and CUDA graphs.
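
To make the suggestion concrete, here is a rough sketch of two such tricks as I understand them from LeanRL: wrapping the update step in `torch.compile`, and returning the loss as a tensor instead of calling `.item()` on every step. The names `make_update` and `use_compile` are hypothetical and not taken from either codebase, the MSE loss is only a stand-in for a real RL objective, and CUDA graphs are not shown.

```python
import torch
import torch.nn as nn


def make_update(model, optimizer, use_compile=True):
    """Return a function that performs one gradient step on a batch.

    Hypothetical helper for illustration; not part of LeanRL or our code.
    """

    def update(obs, target):
        # Stand-in objective; a real agent would compute its RL loss here.
        loss = nn.functional.mse_loss(model(obs), target)
        optimizer.zero_grad(set_to_none=True)
        loss.backward()
        optimizer.step()
        # Return a detached tensor rather than loss.item(): .item() forces a
        # host-device synchronization on every call.
        return loss.detach()

    # torch.compile is lazy: compilation happens on the first call,
    # not when the function is wrapped.
    return torch.compile(update) if use_compile else update
```

Usage would look like `update = make_update(model, optimizer)` once at startup, then `loss = update(obs, target)` inside the training loop, converting `loss` to a Python float only when logging. Whether this helps in our case would need to be benchmarked per algorithm.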