Open
Description
I am using Koila to work around an OOM error during training, but the following error occurs:
```
Traceback (most recent call last):
  File "/mnt/sdb2/Adama/configure_docker_for_transvw/pytorch/train.py", line 92, in <module>
    loss.backward()
  File "/home/nanaa/.local/lib/python3.10/site-packages/koila/lazy.py", line 435, in backward
    for mini_batch_size in gpus.split_batch(
  File "/home/nanaa/.local/lib/python3.10/site-packages/koila/gpus.py", line 100, in split_batch
    batch_size = 2 ** (math.floor(math.log2(max_batch)))
ValueError: math domain error
```
This is probably due to the value of `max_batch`: `math.log2` raises this error when its argument is zero or negative.
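To illustrate the suspicion, here is a minimal sketch that reproduces the same kind of failure outside Koila and shows one possible guard. It assumes `max_batch` ends up as 0 (or negative) when `split_batch` runs, which I have not confirmed. The helper `largest_power_of_two_batch` is a hypothetical name for illustration and is not part of Koila's API.

```python
import math


def largest_power_of_two_batch(max_batch: int) -> int:
    """Hypothetical helper (not Koila code): largest power of two <= max_batch."""
    if max_batch < 1:
        # math.log2 is undefined for values <= 0, so fail with a clearer message.
        raise ValueError(f"max_batch must be at least 1, got {max_batch}")
    return 2 ** math.floor(math.log2(max_batch))


# Same expression as in the traceback, evaluated with an assumed max_batch of 0.
try:
    2 ** math.floor(math.log2(0))
    reproduced = False
except ValueError:
    reproduced = True

print("error reproduced with max_batch = 0:", reproduced)
print("guarded result for max_batch = 10:", largest_power_of_two_batch(10))
```

If this is what happens, printing `max_batch` just before line 100 of `koila/gpus.py` should confirm whether it is zero or negative.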