Description
From @Lnaden (moved from #215):
If we want TeX Live for building the docs, we need at least one image layer that installs it.
The cudatoolkit is a bit more complicated due to wall clock time, container size, and CUDA versions.
We could pull directly from condaforge/linux-anvil-cuda:{CUDA_VERSION_X.Y} and get the cudatoolkit that way. However, those images range from >1.5 GB to 3.4 GB each, so the time to docker pull one of them for every one of our builds is significant.
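As a rough sketch of what pulling per-version images would involve, the snippet below expands the tag template above into one `docker pull` command per CUDA version. The version list is taken from the "On CF/Image" column of the table in this issue; the helper names (`image_for`, `pull_commands`) are hypothetical, and whether each tag exists in exactly this form is an assumption based on the template, not something checked against the registry.

```python
# Versions marked "On CF/Image" in the table in this issue.
CF_CUDA_VERSIONS = ["9.2", "10.0", "10.1", "10.2", "11.0"]


def image_for(cuda_version: str) -> str:
    """Fill the tag template quoted above (tag format is an assumption)."""
    return f"condaforge/linux-anvil-cuda:{cuda_version}"


def pull_commands(versions):
    """Build one `docker pull` argument list per CUDA version."""
    return [["docker", "pull", image_for(v)] for v in versions]


if __name__ == "__main__":
    # Only prints the commands; nothing is actually pulled.
    for cmd in pull_commands(CF_CUDA_VERSIONS):
        print(" ".join(cmd))
```

Each of those commands would download one of the multi-GB images, which is where the per-build wall clock cost comes from.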
The images I built are bare-bones: each one installs only the parts of its CUDA version that we need, stripped down to the essential RPMs OpenMM needs for headers. That kept every image under 1.5 GB, which makes the CI boot time much faster.
We could just roll one image and install the CUDA toolkit on demand, but that requires pulling it from the NVIDIA servers (which can be slow, anecdotally) or some other mirror, and the toolkits themselves are bloated messes. E.g., version 7.5 was 1.1 GB on its own and 11.6 is 3.5 GB, so this runs into the same problem as the existing images, plus install time (albeit with less maintenance overhead, since we won't be creating the images).
Lastly, whether we use the cudatoolkit conda package or the image, only a subset of the CUDA versions that OpenMM supports and builds against is available.
| CUDA Version | OpenMM Supported | On CF/Image |
|---|---|---|
| 8.0 | X | |
| 9.0 | X | |
| 9.1 | X | |
| 9.2 | X | X |
| 10.0 | X | X |
| 10.1 | X | X |
| 10.2 | X | X |
| 11.0 | X | X |
| 11.1 | X | |
| 11.2 | X | |
| 11.3 | X | |
| 11.4 | X | |
| 11.5 | X | |
| 11.6 | X | |
The later 11.X versions can be fixed, but the pre-9.2 ones cannot.
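To make the gap explicit, here is a small sketch that derives the missing versions from the table. The two version lists are transcribed from the table; the variable names and the `as_tuple` helper are only illustrative.

```python
# Transcribed from the table: every version OpenMM supports...
SUPPORTED = ["8.0", "9.0", "9.1", "9.2", "10.0", "10.1", "10.2",
             "11.0", "11.1", "11.2", "11.3", "11.4", "11.5", "11.6"]
# ...and the ones marked as available on conda-forge / the image.
ON_CF = {"9.2", "10.0", "10.1", "10.2", "11.0"}


def as_tuple(version: str):
    """Turn '10.2' into (10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


missing = [v for v in SUPPORTED if v not in ON_CF]
# Older than 9.2: the ones that cannot be fixed.
unfixable = [v for v in missing if as_tuple(v) < (9, 2)]
# Newer than 11.0: the later 11.X ones that can be fixed.
fixable = [v for v in missing if as_tuple(v) > (11, 0)]

if __name__ == "__main__":
    print("cannot be fixed:", unfixable)
    print("can be fixed:", fixable)
```

Comparing integer tuples rather than strings avoids "10.0" sorting before "9.2".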