Add basic dockerfile #3740
Conversation
Thanks for attempting this. Your timing is a bit awkward because I was writing some Dockerfiles last week but have not gotten around to merging them. You can take a look at this draft pull request if you are curious: #3749 - feel free to comment on it. I have a few comments about what you're doing here:
I think there are some fair points, but you might be missing some benefits of the way I've structured things. In any case, getting some form of dockerization out is important, and I'd be happy to work with you to land on something. I'll defer to maintainer preferences, but I'll make some arguments for what I'm doing as well.

Small Points
The only source of non-determinism should be the state of the HELM repo. That is by design, to allow the latest version of HELM to be the default build. However, setting
I think it is important for the Dockerfile to be part of the repository. I've seen a pattern where people have a repo just for the docker build, and that can make sense in multi-component systems, but for the case where the entire application is a single repo, having a Dockerfile that guarantees the ability to bring that repo into a working executable state seems desirable to me.

On My Approach's Complexity
Yes, but the complexity is warranted. Let me explain:
Comparison to 3749's Alternative Dockerization

There are also some issues with your docker files:
Issues with My Approach

I got some external feedback on my approach, and I think there are some improvements that could be made to make it easier and clearer how to run custom deployments. This involves documentation for mounting custom experiment configurations and schemas from a local machine, and perhaps not making the root of the HELM repo the default working directory. Namely, I should add docs with an example of how to mount a custom prod_env and benchmark_outputs. (Still getting familiar with the exact artifact structure of HELM)
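As a rough sketch of the kind of mount documentation I have in mind (illustrative only; the image tag, mount points, and local directory names are assumptions, not what this PR ships):

```
# Mount a local prod_env (deployments, credentials) and an output directory
# into the container, and make the shared mount point the working directory.
docker run --rm \
  -v "$PWD/prod_env:/mnt/shared_directory/prod_env" \
  -v "$PWD/benchmark_outputs:/mnt/shared_directory/benchmark_outputs" \
  --workdir /mnt/shared_directory \
  -it helm:latest \
  helm-run --run-entries mmlu:subject=philosophy,model=openai/gpt2 \
           --suite my-suite --max-eval-instances 10
```

The point of mounting rather than baking these in is that users can swap experiment configurations without rebuilding the image.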
I made an effort to clean up the dockerfile by moving all of the docs into an associated README, and I also modified the entrypoint creation code to try to explain more clearly why it is needed. My current tests are failing, but it seems to be an issue with the current main rather than with this branch. In fact, this demonstrates the robustness of this docker image: even with main in its current broken state, we can pin HELM to a known working version and run through the demo, e.g.

```
# Determine the versions of HELM, uv, and Python to use
export HELM_GIT_HASH=3f20a6cbb359d36dce028534aa0f2a3809f829dd
export UV_VERSION=0.8.4
export PYTHON_VERSION=3.10

# Build the image with version-specific tags
DOCKER_BUILDKIT=1 docker build --progress=plain \
    -t helm:${HELM_GIT_HASH}-uv${UV_VERSION}-python${PYTHON_VERSION} \
    --build-arg PYTHON_VERSION=$PYTHON_VERSION \
    --build-arg UV_VERSION=$UV_VERSION \
    --build-arg HELM_GIT_HASH=$HELM_GIT_HASH \
    -f ./dockerfiles/helm.dockerfile .
docker tag helm:${HELM_GIT_HASH}-uv${UV_VERSION}-python${PYTHON_VERSION} helm:latest

mkdir -p ./shared_directory

# Run a benchmark
docker run --rm --gpus=all \
    -v $PWD/shared_directory:/mnt/shared_directory \
    --workdir /mnt/shared_directory \
    -it helm:latest \
    helm-run --run-entries mmlu:subject=philosophy,model=openai/gpt2 --suite my-suite --max-eval-instances 10

# Summarize the results
docker run --rm --gpus=all \
    -v $PWD/shared_directory:/mnt/shared_directory \
    --workdir /mnt/shared_directory \
    -it helm:latest \
    helm-summarize --suite my-suite

# Start a web server to view the results
docker run --rm --gpus=all \
    -v $PWD/shared_directory:/mnt/shared_directory \
    --workdir /mnt/shared_directory \
    -p 8000:8000 \
    -it helm:latest \
    helm-server --suite my-suite
```

EDIT: I also optimized the file by using BuildKit caches so it doesn't need to re-download apt or uv packages when building on the same machine. I also moved the docker
@yifanmai I could also simplify the main docker image by moving all of the optimized uv setup into a separate image and having the helm docker image inherit from it. That would reduce the size of the main helm.dockerfile considerably, at the cost of a two-stage build process and an additional file. The separated uv.dockerfile would look similar to this one that I use in my scripts to build CI images: https://gitlab.kitware.com/computer-vision/ci-docker/-/blob/main/uv.dockerfile?ref_type=heads
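The split could look roughly like this (a sketch only; the registry name, base-image tag, and install steps are assumptions and not the files from this PR):

```
# --- uv.dockerfile: shared base image with a pinned uv ---
FROM ubuntu:22.04
COPY --from=ghcr.io/astral-sh/uv:0.8.4 /uv /uvx /usr/local/bin/

# --- helm.dockerfile: inherits from the prebuilt uv base ---
# FROM my-registry/uv-base:0.8.4
# COPY . /helm
# WORKDIR /helm
# RUN uv venv /opt/venv && uv pip install --python /opt/venv/bin/python -e .
```

The trade-off is exactly as described: helm.dockerfile shrinks to HELM-specific steps, but you now have to build (or pull) the base image first.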
In relation to this issue: #2019
I think having a docker image that can recreate reported benchmarks is critical for scientific reproducibility.
This dockerfile is a start for that. In its current form it simply creates an image where the basic helm package is installed in development mode.
It uses uv for efficiency. The idea is that we first install basic apt packages, then set up a pinned version of uv, and then use that to create a base Python virtual environment that behaves similarly to how a developer would work on a host machine. I update the bashrc and profile so the virtual environment auto-activates when running tasks in the container.
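The venv creation and auto-activation steps might look something like this Dockerfile fragment (a sketch under assumptions: the venv path, pinned versions, and uv distribution image are mine, not necessarily this PR's):

```
ARG PYTHON_VERSION=3.10
# Install a pinned uv from the official distribution image
COPY --from=ghcr.io/astral-sh/uv:0.8.4 /uv /uvx /usr/local/bin/
# Create the base virtual environment, as a developer would on a host machine
RUN uv venv /opt/venv --python ${PYTHON_VERSION}
# Auto-activate it for both interactive and login shells
RUN echo 'source /opt/venv/bin/activate' >> /root/.bashrc && \
    echo '[ -f /root/.bashrc ] && source /root/.bashrc' >> /root/.profile
```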
To get HELM into the image, I assume you have a local checkout; the build copies over the entire git folder (which makes the image easier to update and develop off of), then checks out a specified version and does a basic install of dependencies.
Lastly, I make an entrypoint script that ensures any command you send to a docker run will be executed in the context of the .bashrc environment.

I'm using Docker BuildKit to get the heredoc-style Dockerfile syntax, which lets me write a RUN step over multiple lines; IMO this makes it much easier to read, copy, and paste into a test container for development and debugging. The last RUN is basically an echo that I use to bundle some documentation with the docker image on how to build it and how to run the basic tests.
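The entrypoint idea can be demonstrated outside docker with a few lines of shell (a sketch, not the actual script from this PR; the fake HOME and variable names are just for the demo):

```shell
# Write an entrypoint that sources ~/.bashrc before exec'ing the requested
# command, then try it against an isolated fake HOME.
workdir=$(mktemp -d)
export HOME="$workdir"
echo 'export FROM_BASHRC=yes' > "$HOME/.bashrc"

cat > "$workdir/entrypoint.sh" <<'EOF'
#!/usr/bin/env bash
set -e
# Pick up the environment a developer would get in an interactive shell
source "$HOME/.bashrc"
# Replace this process with whatever command `docker run` passed in
exec "$@"
EOF
chmod +x "$workdir/entrypoint.sh"

# Any command now runs with the .bashrc environment applied
"$workdir/entrypoint.sh" printenv FROM_BASHRC   # prints "yes"
```

Because the script ends with `exec "$@"`, the container's main process is the user's command itself, so signals and exit codes pass through unchanged.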
I've verified that with this setup I can run the example in the README and run a server so I can see results on my host machine.
--
Where I want to take this is getting the HEIM benchmarks in an easy state that can be reproduced. Currently I'm having trouble with this, but writing this base image is the first step.