Memory leak in offroad_gator.py? #14
Comments
Yes, we have noticed the same issue with the gator demo. The memory leak originates in Project Chrono and is not specific to gym-chrono. It only seems to happen when chrono::sensor is enabled, which points to a memory leak in chrono::sensor. We are investigating a potential fix. For now, automatically relaunching the run and reloading the neural network might be the only solution.
Thanks @zzhou292
@pgupta2050 So in short, you will still face leaks even if you don't import Chrono::Sensor, albeit smaller ones. For next steps, as @zzhou292 suggested, our current solution is babysitting the RL runs: save checkpoints and reload whenever we get close to the memory limit. @StefanCaldararu made a script for that. Mind adding it here? As for a fix for the memory leak, it will require a lot of work and I don't think we will be able to pull it off any time soon. Additionally, I don't think we can guarantee a complete fix, since we have noticed that a lot of the Chrono::Sensor leaks come from the third-party ray tracing library we use, OptiX.
OK, got it. Thanks for checking it out at your end!
@pgupta2050 Sorry, I was locked out of my account for a bit. The script I wrote to avoid some of these issues can be found here. You also need to modify the training script to take the loaded checkpoint as an argument, as done here. All it does is fully reset the environment / process being run every few checkpoints.
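For readers who cannot reach the linked script, here is a minimal sketch of the general restart-from-checkpoint idea described above. It is not the script referenced in this thread. The `--checkpoint` flag, the `*.zip` checkpoint pattern, and the function names are all assumptions made for illustration, so adapt them to whatever your training script actually accepts.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional


def latest_checkpoint(ckpt_dir: Path, pattern: str = "*.zip") -> Optional[Path]:
    """Return the most recently modified checkpoint file, or None if there is none."""
    files = sorted(ckpt_dir.glob(pattern), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None


def run_in_chunks(train_cmd: List[str], ckpt_dir: Path, n_chunks: int) -> int:
    """Run the training command n_chunks times, each time in a fresh process.

    When a process exits, the OS reclaims all of its memory, leaked or not.
    If a checkpoint exists, it is passed through a hypothetical --checkpoint
    flag so that the next chunk resumes from it.
    """
    completed = 0
    for _ in range(n_chunks):
        cmd = list(train_cmd)
        ckpt = latest_checkpoint(ckpt_dir)
        if ckpt is not None:
            cmd += ["--checkpoint", str(ckpt)]  # assumed flag, not a gym-chrono option
        subprocess.run(cmd, check=True)  # raises CalledProcessError if training fails
        completed += 1
    return completed
```

A call such as `run_in_chunks([sys.executable, "train.py"], Path("checkpoints"), 10)` would then run ten short training sessions back to back, where `train.py` stands in for your own script that saves a checkpoint before exiting.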
Hello,
I noticed that if I use vehicle models with the ChronoBaseEnv(), there is a memory leak in the vectorized environments. I have noticed this for offroad_gator.py and for an env I created using the hmmwv vehicle. The individual environments keep growing their memory usage. It seems as if some environment resources are not being released at `env.reset()`. The other env models, such as `cobra_wpts.py`, don't seem to leak memory, however. Is this a known issue? Does anyone have thoughts about debugging this? Maybe it has something to do with the `gym.Env.close()` implementation?
I am using the `top` utility to monitor the memory usage, and the `/usr/bin/python3 -c from multiprocessing.forkserver ...` processes are the ones that grow in memory usage until the system runs out of memory and crashes. An example of where I monitor this (in this case, I just re-ran cobra_wpts_train.py to reproduce results):
My training and system info:
Thanks.
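One way to check whether memory is retained across `env.reset()` is to sample the resident set size (RSS) of the process after each reset, rather than watching `top` by hand. The sketch below is a hedged, Linux-only illustration: it reads `/proc/<pid>/statm`, and the helper names `rss_mb` and `rss_after_resets` are made up for this example rather than taken from gym-chrono.

```python
import os
from typing import Callable, List, Optional


def rss_mb(pid: Optional[int] = None) -> float:
    """Resident set size of a process in MiB, read from /proc (Linux only)."""
    pid = os.getpid() if pid is None else pid
    with open(f"/proc/{pid}/statm") as f:
        # The second field of statm is the number of resident pages.
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / (1024 * 1024)


def rss_after_resets(reset: Callable[[], object], n_resets: int) -> List[float]:
    """Call reset() n_resets times and record this process's RSS after each call."""
    samples = []
    for _ in range(n_resets):
        reset()
        samples.append(rss_mb())
    return samples
```

Calling `rss_after_resets(env.reset, 200)` on a single, non-vectorized environment and looking for a steady upward trend would suggest that resources are being retained across resets. A flat curve would point elsewhere, for example at the stepping loop.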