Kernel dies with GTX 1050 4GB #95

Open
loretoparisi opened this issue Jan 4, 2022 · 0 comments

My Jupyter notebook kernel dies ("The kernel appears to have died. It will restart automatically.") when trying to load the main model after downloading it:

device = 'cuda'
dalle = get_rudalle_model('Malevich', pretrained=True, fp16=True, device=device, cache_dir='./')

I load the VAE, tokenizer, and CLIP in separate cells, and they all load fine. My environment summary and nvidia-smi output are the following:

Total GPU RAM: 3.94 Gb
CPU: 4
RAM GB: 7.8
PyTorch version: 1.10.1+cu102
CUDA version: 10.2
cuDNN version: 7605
Allowed GPU RAM: 3.5 Gb
GPU part 0.8886
Tue Jan  4 18:22:08 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:01:00.0  On |                  N/A |
| 45%   25C    P0    N/A /  75W |    849MiB /  4033MiB |      6%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1104      G   /usr/lib/xorg/Xorg                 84MiB |
|    0   N/A  N/A      1682      G   /usr/bin/gnome-shell               31MiB |
|    0   N/A  N/A     12024      G   ...AAAAAAAA== --shared-files       38MiB |
|    0   N/A  N/A     13204      C   /usr/bin/python                   689MiB |
+-----------------------------------------------------------------------------+

System memory is:

loreto@ombromanto:~/Projects/notebooks/rudalle$ free -h
              total        used        free      shared  buff/cache   available
Mem:           7,8G        3,8G        2,8G         97M        1,1G        3,6G
Swap:          2,0G        993M        1,0G

The CPU is:

loreto@ombromanto:~/Projects/notebooks/rudalle$ cat /proc/cpuinfo  | grep 'name'| uniq
model name	: Intel(R) Core(TM)2 Quad  CPU   Q9550  @ 2.83GHz

With this configuration I'm able to load models like CLIP, GLIDE, LAMA, etc. with minor limitations.

I have also tried to follow this approach:

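import torch
from rudalle import get_rudalle_model

# Load on the CPU in FP32 first, then move to the GPU if one is available.
device = 'cpu'
dalle = get_rudalle_model('Malevich', pretrained=True, fp16=False, device=device, cache_dir='./')
if torch.cuda.is_available():
    device = 'cuda'
    dalle = dalle.to(device)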

That is, loading the model on the CPU and then moving it to CUDA, but the kernel still dies. The notebook server log shows:

[D 18:22:25.471 NotebookApp] activity on 464591fd-7e62-4cd7-80e8-0ac4f3f9ac05: status (busy)
[D 18:22:25.476 NotebookApp] activity on 464591fd-7e62-4cd7-80e8-0ac4f3f9ac05: execute_input
[D 18:22:25.477 NotebookApp] activity on 464591fd-7e62-4cd7-80e8-0ac4f3f9ac05: status (idle)
[D 18:22:30.023 NotebookApp] activity on 464591fd-7e62-4cd7-80e8-0ac4f3f9ac05: status (busy)
[D 18:22:30.024 NotebookApp] activity on 464591fd-7e62-4cd7-80e8-0ac4f3f9ac05: execute_input
[I 18:23:41.356 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
kernel 464591fd-7e62-4cd7-80e8-0ac4f3f9ac05 restarted
[D 18:23:41.831 NotebookApp] Starting kernel: ['/usr/bin/python', '-m', 'ipykernel_launcher', '-f', '/home/loreto/.local/share/jupyter/runtime/kernel-464591fd-7e62-4cd7-80e8-0ac4f3f9ac05.json']
[D 18:23:42.303 NotebookApp] Connecting to: tcp://127.0.0.1:36147
[D 18:23:44.736 NotebookApp] activity on 464591fd-7e62-4cd7-80e8-0ac4f3f9ac05: status (starting)
[D 18:23:44.759 NotebookApp] activity on 464591fd-7e62-4cd7-80e8-0ac4f3f9ac05: status (busy)
[D 18:23:44.761 NotebookApp] activity on 464591fd-7e62-4cd7-80e8-0ac4f3f9ac05: status (idle)
[D 18:23:45.040 NotebookApp] 200 GET /static/base/images/favicon-notebook.ico (127.0.0.1) 122.080000ms
[D 18:23:46.533 NotebookApp] 200 GET /api/contents/rudalle/Malevich_3_5GB_vRAM_usage.ipynb?content=0&_=1641316902647 (127.0.0.1) 19.390000ms
[D 18:23:54.294 NotebookApp] KernelRestarter: restart apparently succeeded

Of course, in this case it would be necessary to convert the model to FP16 after loading, with something like dalle.convert_to_fp16(), but I'm not sure how to do that.
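
A rough back-of-the-envelope estimate may help relate the numbers reported above to the two loading strategies. The sketch below is only arithmetic on the raw weight size. The parameter count of about 1.3 billion for Malevich is an assumption not stated in this issue, and activations, the CUDA context, and framework overhead are ignored, so real usage would be higher.

```python
# Rough memory estimate; the parameter count is an assumption, not a measurement.
GIB = 1024 ** 3


def weights_gib(n_params: float, bytes_per_param: int) -> float:
    """Size of the raw weights in GiB, ignoring activations and overhead."""
    return n_params * bytes_per_param / GIB


n_params = 1.3e9          # assumed: approximate parameter count for Malevich
host_available_gib = 3.6  # "available" column from the free -h output above
gpu_total_mib = 4033      # total GPU memory from the nvidia-smi output above
gpu_used_mib = 849        # GPU memory already in use per nvidia-smi above

fp32 = weights_gib(n_params, 4)  # 4 bytes per FP32 parameter
fp16 = weights_gib(n_params, 2)  # 2 bytes per FP16 parameter
gpu_free_gib = (gpu_total_mib - gpu_used_mib) / 1024

print(f"FP32 weights: {fp32:.2f} GiB (host available: {host_available_gib} GiB)")
print(f"FP16 weights: {fp16:.2f} GiB (GPU free: {gpu_free_gib:.2f} GiB)")
print("FP32 weights fit in available host RAM:", fp32 < host_available_gib)
print("FP16 weights fit in free GPU memory:", fp16 < gpu_free_gib)
```

Under these assumptions, the FP32 weights alone would not fit in the available host memory, which would be consistent with the kernel being killed during the CPU load, while the FP16 weights would fit in the free GPU memory with little headroom. If the object returned by get_rudalle_model is a torch.nn.Module, the standard PyTorch call for casting its parameters to FP16 is .half(); whether ruDALL-E needs additional steps beyond that is not confirmed here.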
