This project is used to configure a Reinforcement Learning Docker environment based on isaac_gym.
Using Docker allows for the rapid deployment of isolated, virtual, and identical development environments, eliminating the situation of "it runs on my computer, but not on yours."
git clone https://github.com/fan-ziqi/rl_docker.git
cd rl_docker
Copy requirement_template.txt
and rename it to requirement.txt
. In this file, add the necessary Python dependencies. (Dependencies added in this file will be downloaded during Docker build and will not be re-downloaded after the container is generated.)
cp -p requirement_template.txt requirement.txt
Copy setup_template.sh
and rename it to setup.sh
. In this file, configure all Python packages. (Dependencies added in this file will be re-downloaded every time the Docker container is run, only to address specific dependency conflicts. Unless there are special circumstances, include all dependencies in requirement.txt
.)
cp -p setup_template.sh setup.sh
For setup_template.sh
, the corresponding working directory file hierarchy is as follows:
rl_ws/
│
├── rl_docker/
│ └── ...
│
├── isaacgym/
│ ├── python/
│ │ ├── setup.py
│ │ └── ...
│ │
│ └── ...
│
├── rsl_rl/
│ ├── setup.py
│ └── ...
│
├── legged_gym/
│ ├── setup.py
│ └── ...
│
└── ...
bash build.sh
bash run.sh -g <gpus, should be num 1~9 or all> -d <true/false>
# example: bash run.sh -g all -d true
These two newly created files will not be tracked by Git. If needed, please modify.gitignore
.
Use Ctrl+P+Q
to exit the current terminal and use exit
to stop the container.
The image comes with nvitop
installed. Open a new window, run bash exec.sh
to enter the container, and use nvitop
to view the system resource usage. Use exit
or Ctrl+P+Q
to exit the current terminal without stopping the container.
-
If you are using an RTX 4090 GPU, modify the first line of the
docker/Dockerfile
file to:nvcr.io/nvidia/pytorch:22.12-py3
-
If you are using an RTX 3070 GPU, no modifications are needed.
Please find the supported versions of pytorch for other GPUs in the link below
If you encounter the following error when running the run.sh
script:
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
This error is mostly due to not running the container with root privileges. Here are a few solutions:
- Prefix the bash command with
root
. - Switch to the root user.
- Add the current user to the root group.
If you cannot find the pre-built isaacgym image, you need to rebuild the image with root permissions.
If you encounter the following error when running the run.sh
script:
docker: Error response from daemon: could not select device driver "" with capabilities:[[gpu]].
You need to install the nvidia-container-runtime
and nvidia-container-toolkit
packages, and modify the Docker daemon startup parameter to change the default runtime to nvidia-container-runtime
:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install nvidia-container-toolkit nvidia-container-runtime
sudo vi /etc/docker/daemon.json
Update the content to:
{
"default-runtime": "nvidia",
"runtimes": {
"nvidia": {
"path": "/usr/bin/nvidia-container-runtime",
"runtimeArgs": []
}
}
}
Reload docker service:
systemctl restart docker