Description
Hi, when I train this code with `python main.py --train` on my local machine, CPU memory usage keeps increasing until it is exhausted, at which point an error is reported. I believe the problem is caused by running out of memory.
```
Traceback (most recent call last):
  File "/home/user/.conda/envs/spinningup_nzh/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/user/.conda/envs/spinningup_nzh/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/user/nzh_projects/kaggle-environments-master/kaggle_agent/HandyRL/handyrl/connection.py", line 190, in _receiver
    data, cnt = conn.recv()
  File "/home/user/.conda/envs/spinningup_nzh/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/user/.conda/envs/spinningup_nzh/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/user/.conda/envs/spinningup_nzh/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

Exception in thread Thread-4:
Traceback (most recent call last):
  File "/home/user/.conda/envs/spinningup_nzh/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/user/.conda/envs/spinningup_nzh/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/user/nzh_projects/kaggle-environments-master/kaggle_agent/HandyRL/handyrl/connection.py", line 175, in _sender
    conn.send(next(self.send_generator))
  File "/home/user/.conda/envs/spinningup_nzh/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/user/.conda/envs/spinningup_nzh/lib/python3.6/multiprocessing/connection.py", line 397, in _send_bytes
    self._send(header)
  File "/home/user/.conda/envs/spinningup_nzh/lib/python3.6/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
```
Because my machine only has 32 GB of memory, I tried lowering `max_episodes` to 200,000. When the number of sampled episodes reached 200,000, memory utilization was already around 95%, and the code kept consuming more CPU memory as sampling continued. My understanding is that once the number of samples exceeds `max_episodes`, the replay buffer should delete the oldest samples, but I don't understand why memory still keeps increasing.
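For reference, my mental model is that a bounded replay buffer behaves like a `deque` with `maxlen`: appending past capacity silently drops the oldest episode. This is a minimal sketch of that assumption, not HandyRL's actual implementation; if memory still grows despite such eviction, it usually means something else (e.g. pending batches or other references) is keeping old episodes alive.

```python
from collections import deque

# Hypothetical bounded buffer: maxlen caps the number of stored episodes.
max_episodes = 3
buffer = deque(maxlen=max_episodes)

for episode_id in range(5):
    # Once len(buffer) == maxlen, each append evicts the oldest entry.
    buffer.append({"episode": episode_id})

# Only the newest max_episodes entries remain.
print([e["episode"] for e in buffer])  # [2, 3, 4]
```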
Could you give me some ideas about what might be causing this, and how I can solve the problem (without restarting)? Thanks very much!
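To help narrow down whether the growth comes from the replay buffer or from somewhere else (for example, batches queued up in the sender/receiver threads), I have been thinking of logging resident memory during training. A minimal stdlib sketch of what I mean (the logging point inside the training loop is an assumption on my part):

```python
import resource

def peak_rss_mb():
    # ru_maxrss is reported in kilobytes on Linux (bytes on macOS),
    # so this conversion assumes a Linux host.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

# Call this periodically, e.g. once per N sampled episodes, and compare
# the growth rate against the replay buffer's episode count.
print(f"peak RSS so far: {peak_rss_mb():.1f} MB")
```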