Describe the bug
I use a GPU-powered Linux laptop and could not run the Docker Compose scenario successfully.
Here are my prerequisites:
# docker is installed
~ docker --version
Docker version 26.1.4, build 5650f9b
# so is the docker compose plugin
~ docker compose version
Docker Compose version v2.27.1
# nvidia container toolkit is installed
~ dpkg -l | grep nvidia-container-toolkit
ii nvidia-container-toolkit 1.12.1-0pop1~1679409890~22.04~5f4b1f2 amd64 NVIDIA Container toolkit
ii nvidia-container-toolkit-base 1.12.1-0pop1~1679409890~22.04~5f4b1f2 amd64 NVIDIA Container Toolkit Base
# and configured
~ cat /etc/docker/daemon.json
{
"runtimes": {
"nvidia": {
"args": [],
"path": "nvidia-container-runtime"
}
}
}
# the nvidia container toolkit sample workload works
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
9c704ecd0c69: Pull complete
Digest: sha256:2e863c44b718727c860746568e1d54afd13b2fa71b160f5cd9058fc436217b30
Status: Downloaded newer image for ubuntu:latest
Thu Jun 20 02:37:30 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67 Driver Version: 550.67 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4080 ... Off | 00000000:02:00.0 Off | N/A |
| N/A 46C P8 4W / 150W | 122MiB / 12282MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
# an ML-compatible GPU is available
nvidia-smi
Wed Jun 19 21:13:15 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67 Driver Version: 550.67 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4080 ... Off | 00000000:02:00.0 Off | N/A |
| N/A 51C P8 6W / 150W | 122MiB / 12282MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 3440 G /usr/lib/xorg/Xorg 18MiB |
| 0 N/A N/A 9091 C+G warp-terminal 91MiB |
+-----------------------------------------------------------------------------------------+
# made the required GPU related changes in `docker-compose.yml`
diff --git a/docker/docker-compose.yml b/docker/docker-compose.yml
index d8893d9..8ca7305 100644
--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
@@ -1,14 +1,16 @@
+# Copied from https://github.com/ageron/handson-ml3/blob/main/docker/docker-compose.yml
+# Modification instructions copied from https://github.com/ageron/handson-ml3/tree/main/docker#prerequisites-1
version: "3"
services:
handson-ml3:
build:
context: ../
- dockerfile: ./docker/Dockerfile #Dockerfile.gpu
+ dockerfile: ./docker/Dockerfile.gpu
args:
- username=devel
- userid=1000
container_name: handson-ml3
- image: ageron/handson-ml3:latest #latest-gpu
+ image: ageron/handson-ml3:latest-gpu
restart: unless-stopped
logging:
driver: json-file
@@ -20,8 +22,8 @@ services:
volumes:
- ../:/home/devel/handson-ml3
command: /opt/conda/envs/homl3/bin/jupyter lab --ip='0.0.0.0' --port=8888 --no-browser
- #deploy:
- # resources:
- # reservations:
- # devices:
- # - capabilities: [gpu]
+ deploy:
+ resources:
+ reservations:
+ devices:
+ - capabilities: [gpu]
\ No newline at end of file
To Reproduce
- Use POP!_OS or Ubuntu 22.04 LTS
- Install the prerequisites
- Download the handson-ml3 code repository
- Make the required changes
- Run `docker compose up` from the `docker` directory
Here is the output:
docker compose up
WARN[0000] /home/vasilegorcinschi/repos/handson-ml3/docker/docker-compose.yml: `version` is obsolete
Attaching to handson-ml3
Gracefully stopping... (press Ctrl+C again to force)
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/opt/conda/envs/homl3/bin/jupyter": stat /opt/conda/envs/homl3/bin/jupyter: no such file or directory: unknown
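A possible way to narrow this down, as a sketch only: the error says the `jupyter` binary is missing at `/opt/conda/envs/homl3/bin`, so listing that directory (or `/opt/conda/envs`) inside the image would show whether the `homl3` environment exists there at all. The helper name `image_ls` and the `DOCKER` override variable are hypothetical; the image tag and path are taken from the compose file and the error message above. It also assumes the image ships an `ls` binary.

```shell
# Hypothetical helper: list a directory inside an image without running
# the image's default command. DOCKER may be overridden; it defaults to
# the real docker CLI.
image_ls() {
  # $1 = image tag, $2 = directory inside the image
  # --entrypoint swaps the image's entrypoint for `ls`; the arguments
  # after the image name are passed to `ls`.
  "${DOCKER:-docker}" run --rm --entrypoint ls "$1" -1 "$2"
}

# Example calls (commented out so that sourcing this file has no side effects):
# image_ls ageron/handson-ml3:latest-gpu /opt/conda/envs
# image_ls ageron/handson-ml3:latest-gpu /opt/conda/envs/homl3/bin
```

If the first call does not list `homl3`, or the second one fails, the `command:` path in `docker-compose.yml` does not match what the GPU image actually contains.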
Expected behavior
The Docker container should start.
Versions (please complete the following information):
- OS: POP!_OS 22.04 LTS
- Python: 3.10.12