OOM GPU with en_core_web_trf #12305
Unanswered
jogonba2 asked this question in Help: Other Questions
Hi!

I'm trying to train the en_core_web_trf model for NER with custom labels on a GPU with 48 GB of memory. In tests with a small sample of 50 examples, 16 labels, and batches of 8 examples, GPU memory is exhausted. Sequence lengths are around 100 tokens, with some long docs of more than 3000 tokens. The OS is Ubuntu 22.04 with CUDA 11.6. This is my training code:
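The code block itself did not survive this capture. As a stand-in, here is a minimal sketch of the kind of update loop the question describes (en_core_web_trf, custom NER labels, batches of 8); `train_data` and its layout are assumptions for illustration, not the author's actual code:

```python
import random

import spacy
from spacy.training import Example

spacy.require_gpu()
nlp = spacy.load("en_core_web_trf")
ner = nlp.get_pipe("ner")

# Placeholder corpus; the real data (50 examples, 16 custom labels) is not shown.
train_data = [
    ("Alice moved to Berlin.", {"entities": [(0, 5, "PER_X"), (15, 21, "LOC_X")]}),
]

# Register the custom labels with the pretrained NER component.
for _, annotations in train_data:
    for start, end, label in annotations["entities"]:
        ner.add_label(label)

optimizer = nlp.resume_training()
batch_size = 8

for epoch in range(10):
    random.shuffle(train_data)
    for i in range(0, len(train_data), batch_size):
        batch = train_data[i : i + batch_size]
        examples = [
            Example.from_dict(nlp.make_doc(text), annotations)
            for text, annotations in batch
        ]
        losses = {}
        nlp.update(examples, sgd=optimizer, losses=losses)
```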
I can't find the cause of such high memory consumption. After truncating all the examples to 200 characters and calling `torch.cuda.empty_cache()` after each batch, memory usage is around 10 GB. Is this expected? That seems like rather heavy resource usage. Thanks!
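For concreteness, the truncation-plus-`empty_cache()` workaround described above might look like this, continuing from the sketch earlier. The 200-character cutoff is from the question; everything else is assumed. Note that entities ending past the cutoff have to be dropped so the character offsets still fit the shortened doc:

```python
import torch
from spacy.training import Example

# Continues from the sketch above (nlp, optimizer, train_data, batch_size).
MAX_CHARS = 200  # cutoff mentioned in the question

for i in range(0, len(train_data), batch_size):
    batch = train_data[i : i + batch_size]
    examples = []
    for text, annotations in batch:
        # Entities ending past the cutoff no longer fit the shortened doc,
        # so drop them to keep the character offsets valid.
        kept = [
            (start, end, label)
            for start, end, label in annotations["entities"]
            if end <= MAX_CHARS
        ]
        doc = nlp.make_doc(text[:MAX_CHARS])
        examples.append(Example.from_dict(doc, {"entities": kept}))
    losses = {}
    nlp.update(examples, sgd=optimizer, losses=losses)
    torch.cuda.empty_cache()  # return PyTorch's cached GPU blocks to the driver
```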
Replies (2 comments, 3 replies):

- Hi @jogonba2! Can you try adding a …

- Note also that it looks like you're loading all of your data in memory in this loop. …
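The second reply is cut off, but the point it raises (the whole corpus held in a list) suggests streaming examples from disk instead. A sketch of that, assuming a JSONL file with `text` and `entities` fields; the file name and record layout are made up for illustration:

```python
import json
from typing import Dict, Iterator, Tuple

import spacy
from spacy.training import Example
from spacy.util import minibatch

spacy.require_gpu()
nlp = spacy.load("en_core_web_trf")
optimizer = nlp.resume_training()


def stream_records(path: str) -> Iterator[Tuple[str, Dict]]:
    """Yield one (text, annotations) pair at a time instead of building a list."""
    with open(path, encoding="utf8") as f:
        for line in f:
            record = json.loads(line)
            yield record["text"], {"entities": record["entities"]}


# minibatch() pulls from the generator lazily, so only one batch of raw
# records sits in Python memory at a time.
for batch in minibatch(stream_records("train.jsonl"), size=8):
    examples = [
        Example.from_dict(nlp.make_doc(text), annotations)
        for text, annotations in batch
    ]
    losses = {}
    nlp.update(examples, sgd=optimizer, losses=losses)
```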