Commit 34d5f8b

Updates for PyTorch 2.0 (#94)
* Remove conda copying from the deployment stage.
* Add extra utilities and help on sorting yaml requirements files.
* Remove deprecated Caffe2 build flags.
* Fix comments concerning the build process.
* Update PyTorch default versions to v2.0.0 and TorchVision to 0.15.1.
* Update ruff version.
* Update build requirements for PyTorch 2.x even though this is a breaking change. The README was updated to mention this.
* Remove FFMPEG flags from the TorchVision build process.
* Add SymPy as a PyTorch runtime dependency.
* Make deployment stage build target configurable.
* Add documentation on how to specify build target stages and how to get the wheel files.
* Reformat code.
* Updated deployment MKL version to 2023. Installing from pip still does not work.
* Change train build settings to compile build tests to match the train and deployment build configurations by default.
* Fix formatting.
1 parent 7568722 commit 34d5f8b

9 files changed: +67 -57 lines changed
.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion

@@ -32,7 +32,7 @@ repos:

  # Ruff should be executed before other formatters.
  - repo: https://github.com/charliermarsh/ruff-pre-commit
- rev: "v0.0.254"
+ rev: "v0.0.256"
  hooks:
  - id: ruff
  args: [--exit-non-zero-on-fix]

Dockerfile

Lines changed: 2 additions & 8 deletions

@@ -207,14 +207,12 @@ COPY --link --from=clone-torch /opt/pytorch /opt/pytorch
  # Read `setup.py` and `CMakeLists.txt` to find build flags.
  # Different flags are available for different versions of PyTorch.
  # Variables without default values here recieve defaults from the top of the Dockerfile.
- # Disabling Caffe2, NNPack, and QNNPack as they are legacy and most users do not need them.
+ # Disabling NNPack and QNNPack by default as they are legacy and most users do not need them.
  ARG USE_CUDA
  ARG USE_CUDNN=${USE_CUDA}
  ARG USE_NNPACK=0
  ARG USE_QNNPACK=0
  ARG BUILD_TEST=0
- ARG BUILD_CAFFE2=0
- ARG BUILD_CAFFE2_OPS=0
  ARG USE_PRECOMPILED_HEADERS
  ARG TORCH_CUDA_ARCH_LIST
  ARG CMAKE_PREFIX_PATH=/opt/conda
@@ -291,9 +289,6 @@ RUN --mount=type=bind,from=build-pillow,source=/tmp/dist,target=/tmp/dist \
  python -m pip install --force-reinstall --no-deps /tmp/dist/*

  ARG USE_CUDA
- # Disable FFMPEG and remove it as a build dependency if TorchVision
- # fails to compile with unhelpful error messages.
- ARG USE_FFMPEG=1
  ARG USE_PRECOMPILED_HEADERS
  ARG FORCE_CUDA=${USE_CUDA}
  ARG TORCH_CUDA_ARCH_LIST
@@ -532,7 +527,6 @@ COPY --link --from=fetch-vision /tmp/dist /tmp/dist
  ########################################################################
  FROM ${BUILD_IMAGE} AS deploy-builds-include

- COPY --link --from=install-conda /opt/conda /opt/conda
  COPY --link --from=build-pillow /tmp/dist /tmp/dist
  COPY --link --from=build-vision /tmp/dist /tmp/dist

@@ -548,7 +542,7 @@ FROM deploy-builds-${BUILD_MODE} AS deploy-builds

  # The Anaconda defaults channel and Intel MKL are not fully open-source.
  # Enterprise users may therefore wish to remove them from their final product.
- # The deployment therefore uses system Python. Conda is copied here just in case.
+ # The deployment therefore uses system Python.
  # Intel packages such as MKL can be removed by using MKL_MODE=exclude during the build.
  # This may also be useful for non-Intel CPUs.

README.md

Lines changed: 15 additions & 12 deletions

@@ -119,24 +119,27 @@ IMAGE_NAME=cresset:train-USERNAME
  # [[Optional]]: Fill in these configurations manually if the defaults do not suffice.

  # NVIDIA GPU Compute Capability (CCA) values may be found at https://developer.nvidia.com/cuda-gpus
- CCA=8.6 # Compute capability. CCA=8.6 for RTX3090 and A100.
- # CCA='8.6+PTX' # The '+PTX' enables forward compatibility. Multi-architecture builds can also be specified.
- # CCA='7.5 8.6+PTX' # Visit the documentation for details. https://pytorch.org/docs/stable/cpp_extension.html
+ CCA=8.6 # Compute capability. CCA=8.6 for RTX3090 and A100.
+ # CCA='8.6+PTX' # The '+PTX' enables forward compatibility. Multi-architecture builds can also be specified.
+ # CCA='7.5 8.6+PTX' # Visit the documentation for details. https://pytorch.org/docs/stable/cpp_extension.html

  # Used only if building PyTorch from source (`BUILD_MODE=include`).
  # The `*_TAG` variables are used only if `BUILD_MODE=include`. No effect otherwise.
- BUILD_MODE=exclude # Whether to build PyTorch from source.
- PYTORCH_VERSION_TAG=v1.13.1 # Any `git` branch or tag name can be used.
+ BUILD_MODE=exclude # Whether to build PyTorch from source.
+ PYTORCH_VERSION_TAG=v1.13.1 # Any `git` branch or tag name can be used.
  TORCHVISION_VERSION_TAG=v0.14.1

  # General environment configurations.
- LINUX_DISTRO=ubuntu # Visit the NVIDIA Docker Hub repo for available base images.
- DISTRO_VERSION=22.04 # https://hub.docker.com/r/nvidia/cuda/tags
- CUDA_VERSION=11.7.1 # Must be compatible with hardware and CUDA driver.
- CUDNN_VERSION=8 # Only major version specifications are available.
- PYTHON_VERSION=3.10 # Specify the Python version.
- MKL_MODE=include # Enable MKL for Intel CPUs.
- TZ=Asia/Seoul # Set the container timezone.
+ LINUX_DISTRO=ubuntu # Visit the NVIDIA Docker Hub repo for available base images.
+ DISTRO_VERSION=22.04 # https://hub.docker.com/r/nvidia/cuda/tags
+ CUDA_VERSION=11.7.1 # Must be compatible with hardware and CUDA driver.
+ CUDNN_VERSION=8 # Only major version specifications are available.
+ PYTHON_VERSION=3.10 # Specify the Python version.
+ MKL_MODE=include # Enable MKL for Intel CPUs.
+ TZ=Asia/Seoul # Set the container timezone.
+
+ # Advanced Usage.
+ TARGET_STAGE=train # Target Dockerfile stage. The `*.whl` files are available in `train-builds`.
  ```

  ## General Usage After Initial Installation and Configuration
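
Review note: the `TARGET_STAGE` entry added to the `.env` example is consumed in `docker-compose.yaml` through `${TARGET_STAGE:-deploy}`. In Compose interpolation, the `:-` form falls back to the default when the variable is unset or empty. The following is a minimal Python sketch of that rule only; the `resolve` helper is hypothetical and not part of this repository or of Compose.

```python
def resolve(name, default, env):
    """Mimic `${NAME:-default}`: use the default when NAME is unset or empty."""
    value = env.get(name)
    return value if value else default


# Without TARGET_STAGE in `.env`, the build target stays at its previous value.
print(resolve("TARGET_STAGE", "deploy", {}))

# Setting TARGET_STAGE selects another Dockerfile stage, such as the stage
# that the README comment says holds the `*.whl` files.
print(resolve("TARGET_STAGE", "deploy", {"TARGET_STAGE": "train-builds"}))
```

An empty value (`TARGET_STAGE=`) behaves the same as leaving the variable out, so the default target is used in both cases.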

docker-compose.yaml

Lines changed: 16 additions & 20 deletions

@@ -60,14 +60,12 @@ services:
  dockerfile: Dockerfile
  args: # Equivalent to `--build-arg`.
  BUILD_MODE: ${BUILD_MODE:-exclude}
- BUILD_CAFFE2: 0 # Caffe2 disabled for faster build.
- BUILD_CAFFE2_OPS: 0
- BUILD_TEST: 0
+ BUILD_TEST: 1 # Enable tests to have identical configurations with deployment.
  USE_NNPACK: 0
  USE_QNNPACK: 0
  LINUX_DISTRO: ${LINUX_DISTRO:-ubuntu}
  DISTRO_VERSION: ${DISTRO_VERSION:-22.04}
- CUDA_VERSION: ${CUDA_VERSION:-11.7.1}
+ CUDA_VERSION: ${CUDA_VERSION:-11.8.0}
  CUDNN_VERSION: ${CUDNN_VERSION:-8}
  PYTHON_VERSION: ${PYTHON_VERSION:-3.10}
  MKL_MODE: ${MKL_MODE:-include} # MKL_MODE can be `include` or `exclude`.
@@ -85,12 +83,12 @@ services:
  # Fails if `BUILD_MODE=include` but `CCA` is not set explicitly.
  TORCH_CUDA_ARCH_LIST: ${CCA}
  # Variables for building PyTorch. Must be valid git tags.
- PYTORCH_VERSION_TAG: ${PYTORCH_VERSION_TAG:-v1.13.1}
- TORCHVISION_VERSION_TAG: ${TORCHVISION_VERSION_TAG:-v0.14.1}
+ PYTORCH_VERSION_TAG: ${PYTORCH_VERSION_TAG:-v2.0.0}
+ TORCHVISION_VERSION_TAG: ${TORCHVISION_VERSION_TAG:-v0.15.1}
  # Variables for downloading PyTorch instead of building.
- PYTORCH_INDEX_URL: ${PYTORCH_INDEX_URL:-https://download.pytorch.org/whl/cu117}
- PYTORCH_VERSION: ${PYTORCH_VERSION:-1.13.1}
- TORCHVISION_VERSION: ${TORCHVISION_VERSION:-0.14.1}
+ PYTORCH_INDEX_URL: ${PYTORCH_INDEX_URL:-https://download.pytorch.org/whl/cu118}
+ PYTORCH_VERSION: ${PYTORCH_VERSION:-2.0.0}
+ TORCHVISION_VERSION: ${TORCHVISION_VERSION:-0.15.1}
  PROJECT_ROOT: ${PROJECT_ROOT:-/opt/project}
  GID: ${GID:-1000}
  UID: ${UID:-1000}
@@ -132,20 +130,18 @@ services:
  volumes: # Place user-specific directories in `docker-compose.override.yaml`.
  - .:${PROJECT_ROOT:-/opt/project}
  build:
- target: deploy
+ target: ${TARGET_STAGE:-deploy}
  context: .
  dockerfile: Dockerfile
  args:
  BUILD_MODE: ${BUILD_MODE:-exclude}
  # The Anaconda `defaults` channel is not free for commercial use.
  BUILD_TEST: 1 # Enable build tests for deployment.
- BUILD_CAFFE2: 1 # Caffe2 should be enabled in production settings.
- BUILD_CAFFE2_OPS: 1
- USE_NNPACK: 1 # Enable NNPack for deployment.
- USE_QNNPACK: 1 # Enable QNNPack for deployment.
+ USE_NNPACK: 0 # Enable NNPack for deployment if required.
+ USE_QNNPACK: 0 # Enable QNNPack for deployment if required.
  LINUX_DISTRO: ${LINUX_DISTRO:-ubuntu}
  DISTRO_VERSION: ${DISTRO_VERSION:-22.04}
- CUDA_VERSION: ${CUDA_VERSION:-11.7.1}
+ CUDA_VERSION: ${CUDA_VERSION:-11.8.0}
  CUDNN_VERSION: ${CUDNN_VERSION:-8}
  PYTHON_VERSION: ${PYTHON_VERSION:-3.10}
  # Requirements must include `mkl` if `MKL_MODE` is set to `include` for deployment.
@@ -155,12 +151,12 @@ services:
  CONDA_MANAGER: ${CONDA_MANAGER:-mamba}
  TORCH_CUDA_ARCH_LIST: ${CCA} # This will fail if BUILD_MODE=include but CCA is not set explicitly.
  # Variables for building PyTorch. Must be valid git tags.
- PYTORCH_VERSION_TAG: ${PYTORCH_VERSION_TAG:-v1.13.1}
- TORCHVISION_VERSION_TAG: ${TORCHVISION_VERSION_TAG:-v0.14.1}
+ PYTORCH_VERSION_TAG: ${PYTORCH_VERSION_TAG:-v2.0.0}
+ TORCHVISION_VERSION_TAG: ${TORCHVISION_VERSION_TAG:-v0.15.1}
  # Variables for downloading PyTorch instead of building.
- PYTORCH_INDEX_URL: ${PYTORCH_INDEX_URL:-https://download.pytorch.org/whl/cu117}
- PYTORCH_VERSION: ${PYTORCH_VERSION:-1.13.1}
- TORCHVISION_VERSION: ${TORCHVISION_VERSION:-0.14.1}
+ PYTORCH_INDEX_URL: ${PYTORCH_INDEX_URL:-https://download.pytorch.org/whl/cu118}
+ PYTORCH_VERSION: ${PYTORCH_VERSION:-2.0.0}
+ TORCHVISION_VERSION: ${TORCHVISION_VERSION:-0.15.1}
  PROJECT_ROOT: ${PROJECT_ROOT:-/opt/project}
  # DEB_OLD: ${DEB_OLD:-http://archive.ubuntu.com}
  # DEB_NEW: ${DEB_NEW:-http://mirror.kakao.com}
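
Review note: `CUDA_VERSION` and `PYTORCH_INDEX_URL` have independent defaults in this file, and this commit moves them together (11.7.1 with `cu117`, then 11.8.0 with `cu118`). Overriding only one of them could therefore leave the two out of step. The sketch below shows the naming pattern the two defaults appear to follow. The `pytorch_index_url` helper is hypothetical, and the pattern is an assumption beyond those two pairs: an index exists only for CUDA versions that PyTorch actually publishes wheels for.

```python
def pytorch_index_url(cuda_version):
    """Build the wheel index URL from a CUDA version such as "11.8.0".

    Assumption: the suffix is "cu" + major + minor, as in the two defaults
    that appear in this diff. The patch version is ignored.
    """
    major, minor = cuda_version.split(".")[:2]
    return f"https://download.pytorch.org/whl/cu{major}{minor}"


print(pytorch_index_url("11.8.0"))
```

A check like this could be run before a build to flag a `.env` file that sets `CUDA_VERSION` without a matching `PYTORCH_INDEX_URL`.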

pyproject.toml

Lines changed: 1 addition & 1 deletion

@@ -66,7 +66,7 @@ ignore-init-module-imports = true
  lines-after-imports = 2

  [tool.ruff.pycodestyle]
- # PEP8 states sets maximum documentation length to 72 but this is
+ # PEP8 states sets maximum documentation length to 72 but this is
  # too short for many people. Using 80 as in the Google Style Guide.
  max-doc-length = 80

reqs/README.md

Lines changed: 9 additions & 0 deletions

@@ -13,6 +13,15 @@ project root directory because of the `.dockerignore` file.
  To use files in other directories,
  please modify the `.dockerignore` file.

+ # Notes on Building PyTorch 1.x
+
+ PyTorch v2.x has very different build dependencies from PyTorch v1.x.
+ While it may have been best to keep all dependencies, the build dependencies
+ have been cleaned up for the PyTorch v2.x builds to save time and space.
+
+ To build legacy PyTorch 1.x versions, copy the requirements from the following
+ [link](https://github.com/cresset-template/cresset/blob/7568722631a458980b6586ab0799a2e0d6f0a3da/reqs/conda-build.requirements.txt).
+
  ## Build Dependency Versions

  Edit the package versions in `*-build.requirements.txt` if the latest versions

reqs/conda-build.requirements.txt

Lines changed: 13 additions & 10 deletions

@@ -1,26 +1,29 @@
+ # PyTorch 2.x build-time dependencies. Do not use for PyTorch 1.x compilation.
+
  # Do not edit this file (including comments) unless absolutely necessary.
  # Editing this file will invalidate the Docker build cache for all build layers.
  # Specify package versions if necessary for the build.
  # Also, do not add MKL or related packages in this file.
  astunparse
- autoconf
  ccache
- cffi
  cmake
- ffmpeg # Remove this if TorchVision fails to compile.
- future
+ expecttest
+ filelock
+ fsspec
  git # Needed to get the `git` commit hash, etc.
+ hypothesis
  jemalloc
+ jinja2
  libjpeg-turbo
  libpng
  lld
+ networkx
  ninja
  numpy
- # pillow # Not necessary as Pillow-SIMD is used.
- pkgconfig
+ psutil
  pyyaml
  requests
- rsync
- setuptools # ==59.5.0 # For older PyTorch versions that use `distutils.version`.
- six
- typing_extensions
+ setuptools
+ sympy
+ types-dataclasses
+ typing-extensions

reqs/pip-deploy.requirements.txt

Lines changed: 4 additions & 4 deletions

@@ -1,6 +1,6 @@
- # Lower the MKL version to mkl==2021.4.0 if PyTorch cannot find `libmkl_intel_lp64.so.1`.
- # Raise the MKL version to mkl==2022.x.x if PyTorch cannot find `libmkl_intel_lp64.so.2`.
  # The MKL major version (year) used to build PyTorch must match the version to run it.
  # Include the appropriate version of the `mkl` package manually if `MKL_MODE=include`.
- mkl==2022.1.0
- tqdm==4.64.0
+ # Lower the MKL version to mkl==2021.4.0 if PyTorch cannot find `libmkl_intel_lp64.so.1`.
+ # Raise the MKL version to mkl==2022.x.x if PyTorch cannot find `libmkl_intel_lp64.so.2`.
+ mkl==2023.0.0
+ tqdm==4.65.0
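
Review note: the two comment lines in this file amount to a small lookup from the missing shared-library name to the suggested `mkl` pin. The sketch below encodes only what those comments state. The `suggest_mkl_pin` helper and the sample error text are hypothetical, and `mkl==2022.x.x` is the placeholder used in the comment, not an installable version.

```python
import re

# Mapping taken from the comments in reqs/pip-deploy.requirements.txt.
# "mkl==2022.x.x" is the comment's own placeholder, not a real version pin.
PIN_BY_SONAME_SUFFIX = {
    "1": "mkl==2021.4.0",
    "2": "mkl==2022.x.x",
}


def suggest_mkl_pin(error_message):
    """Return the pin suggested by the comments, or None if no MKL library is named."""
    match = re.search(r"libmkl_intel_lp64\.so\.(\d+)", error_message)
    if match is None:
        return None
    return PIN_BY_SONAME_SUFFIX.get(match.group(1))


# Hypothetical loader error, for illustration only.
message = "libmkl_intel_lp64.so.1: cannot open shared object file"
print(suggest_mkl_pin(message))
```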

reqs/train-environment.yaml

Lines changed: 6 additions & 1 deletion

@@ -4,6 +4,8 @@
  # to reduce dependency issues with conda and for greater flexibility.
  # Manually add dependencies of compiled libraries for reduced
  # installation with pip.
+ # Tip: Use `awk 'START_LINE<=NR && FINISH_LINE<=20 reqs/train-environment.yaml'`
+ # to sort dependencies in the command line while preserving comments, etc.
  name: base # Always use the `base` environment.
  channels:
  - nodefaults # Do not use the default environment.
@@ -16,6 +18,7 @@ dependencies: # Use conda packages if possible.
  - libpng # TorchVision dependency.
  - numpy # Intel optimized NumPy is not available on PyPI.
  - mkl # Essential if BUILD_MODE=include and MKL_MODE=include.
+ - sympy # A PyTorch dependency.
  - tqdm
  - typing_extensions # A PyTorch dependency.

@@ -25,11 +28,13 @@ dependencies: # Use conda packages if possible.
  - tzdata

  # Utility packages.
+ - attrs
  - conda-lock
  - git
  - htop
+ - invoke
  - lazygit
- - monkeytype
+ - loguru
  - nano
  - pandera
  - parallel
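
Review note: the tip added near the top of this file describes sorting a range of dependency lines while leaving comments and everything outside the range untouched. A minimal Python sketch of that idea follows, on the assumption that the intent is to select lines `START_LINE` through `FINISH_LINE` and sort only those. The `sort_line_range` helper and the sample lines are hypothetical.

```python
def sort_line_range(lines, start, finish):
    """Sort lines start..finish (1-indexed, inclusive) and keep all other lines in place."""
    head = lines[: start - 1]
    body = sorted(lines[start - 1 : finish])
    tail = lines[finish:]
    return head + body + tail


sample = [
    "# Utility packages.",
    "- htop",
    "- attrs",
    "- git",
    "# End of utilities.",
]

# Sort only lines 2 to 4; the comment lines around them are not moved.
for line in sort_line_range(sample, 2, 4):
    print(line)
```

Sorting by range rather than by whole file is what keeps section comments such as `# Utility packages.` attached to the entries they describe.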
