[BUG] Unable to run docker compose following the instructions: /opt/conda/envs/homl3/bin/jupyter directory missing #142

Open
@vasigorc

Description

Describe the bug
I use a GPU-powered Linux laptop and couldn't successfully run the docker compose setup.

Here are my prerequisites:

# docker is installed
~ docker --version
Docker version 26.1.4, build 5650f9b

# so is the docker compose plugin
~ docker compose version
Docker Compose version v2.27.1

# nvidia container toolkit is installed
~ dpkg -l | grep nvidia-container-toolkit
ii  nvidia-container-toolkit                          1.12.1-0pop1~1679409890~22.04~5f4b1f2                             amd64        NVIDIA Container toolkit
ii  nvidia-container-toolkit-base                     1.12.1-0pop1~1679409890~22.04~5f4b1f2                             amd64        NVIDIA Container Toolkit Base

# and configured
~ cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

# nvidia container toolkit sample workload working
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
9c704ecd0c69: Pull complete
Digest: sha256:2e863c44b718727c860746568e1d54afd13b2fa71b160f5cd9058fc436217b30
Status: Downloaded newer image for ubuntu:latest
Thu Jun 20 02:37:30 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67                 Driver Version: 550.67         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4080 ...    Off |   00000000:02:00.0 Off |                  N/A |
| N/A   46C    P8              4W /  150W |     122MiB /  12282MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

# ML-compatible GPU is available
nvidia-smi
Wed Jun 19 21:13:15 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67                 Driver Version: 550.67         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4080 ...    Off |   00000000:02:00.0 Off |                  N/A |
| N/A   51C    P8              6W /  150W |     122MiB /  12282MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      3440      G   /usr/lib/xorg/Xorg                             18MiB |
|    0   N/A  N/A      9091    C+G   warp-terminal                                  91MiB |
+-----------------------------------------------------------------------------------------+

# made the required GPU-related changes in `docker-compose.yml`
diff --git a/docker/docker-compose.yml b/docker/docker-compose.yml
index d8893d9..8ca7305 100644
--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
@@ -1,14 +1,16 @@
+# Copied from https://github.com/ageron/handson-ml3/blob/main/docker/docker-compose.yml
+# Modification instructions copied from https://github.com/ageron/handson-ml3/tree/main/docker#prerequisites-1
 version: "3"
 services:
   handson-ml3:
     build:
       context: ../
-      dockerfile: ./docker/Dockerfile #Dockerfile.gpu
+      dockerfile: ./docker/Dockerfile.gpu
       args:
         - username=devel
         - userid=1000
     container_name: handson-ml3
-    image: ageron/handson-ml3:latest #latest-gpu
+    image: ageron/handson-ml3:latest-gpu
     restart: unless-stopped
     logging:
       driver: json-file
@@ -20,8 +22,8 @@ services:
     volumes:
       - ../:/home/devel/handson-ml3
     command: /opt/conda/envs/homl3/bin/jupyter lab --ip='0.0.0.0' --port=8888 --no-browser
-    #deploy:
-    #  resources:
-    #    reservations:
-    #      devices:
-    #      - capabilities: [gpu]
+    deploy:
+     resources:
+       reservations:
+         devices:
+         - capabilities: [gpu]
\ No newline at end of file
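
To sanity-check the edited file before bringing the stack up, the resolved configuration can be rendered with docker compose config (a sketch, run from the docker directory; it only validates that the deploy block parses, not the GPU runtime itself):

# Render and validate the resolved compose configuration
docker compose config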

To Reproduce

  1. Use a POP!_OS or Ubuntu 22.04 LTS machine
  2. Install the prerequisites listed above
  3. Clone the handson-ml3 repository
  4. Make the changes shown in the diff above
  5. Run docker compose up from the docker directory (see the command sketch below)
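
The numbered steps map roughly onto the following commands (a sketch; the clone URL is the repository linked in the diff comments, and the edit in step 4 is the diff shown above):

# Approximate reproduction commands
git clone https://github.com/ageron/handson-ml3.git
cd handson-ml3/docker
# apply the docker-compose.yml changes from the diff above, then:
docker compose up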

Here is the output:

docker compose up
WARN[0000] /home/vasilegorcinschi/repos/handson-ml3/docker/docker-compose.yml: `version` is obsolete
Attaching to handson-ml3
Gracefully stopping... (press Ctrl+C again to force)
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/opt/conda/envs/homl3/bin/jupyter": stat /opt/conda/envs/homl3/bin/jupyter: no such file or directory: unknown
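
A possible next step to narrow this down (not yet verified): override the failing command and inspect where the conda environments actually live inside the built image. The service name handson-ml3 and the path come from the compose file above:

# Hypothetical diagnostic: list the conda environments baked into the image
docker compose run --rm handson-ml3 ls /opt/conda/envs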

Expected behavior
The docker container should start and run Jupyter Lab.

Versions (please complete the following information):

  • OS: POP!_OS 22.04 LTS
  • Python: 3.10.12
