Description
Summary
Docker image builds have unstable hashes when the image depends on
another Docker image whose image_tags field contains dynamic values
(e.g., timestamps from environment variables). This causes unnecessary
rebuilds and breaks caching.
The Problem
When building Docker images in Pants, the build context hash is
calculated from (build_args, build_env, snapshot.digest). This hash is
used for:
- Cache keys to determine if an image needs rebuilding
- The {pants.hash} interpolation in image tags
- Docker build cache optimization
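To make the dependency on those three inputs concrete, here is a minimal Python sketch. It is not Pants's actual implementation: the function name build_context_hash, the JSON serialization, and the choice of SHA-256 are all assumptions made for illustration. It only shows that a hash derived from (build_args, build_env, snapshot.digest) stays fixed while all three inputs stay fixed, and changes as soon as the snapshot digest does.

```python
import hashlib
import json


def build_context_hash(build_args, build_env, snapshot_digest):
    """Hash the three inputs named in the issue (hypothetical stand-in, not Pants's code)."""
    payload = json.dumps(
        {
            "build_args": sorted(build_args),
            "build_env": build_env,
            "snapshot_digest": snapshot_digest,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Identical inputs give an identical hash.
same_a = build_context_hash(["VERSION=1"], {"DOCKER_BUILDKIT": "1"}, "digest-of-context-files")
same_b = build_context_hash(["VERSION=1"], {"DOCKER_BUILDKIT": "1"}, "digest-of-context-files")

# Any change to the snapshot digest changes the hash.
changed = build_context_hash(["VERSION=1"], {"DOCKER_BUILDKIT": "1"}, "digest-after-an-edit")

print(same_a == same_b)   # True
print(same_a == changed)  # False
```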
The issue occurs in this scenario:
- You have a base Docker image (e.g., //base:image)
- You have another Docker image that depends on it via its Dockerfile
(e.g., FROM base:image)
- The base image has image_tags that include dynamic values
Example configuration that triggers the problem:
BUILD (root defaults)
__defaults__(
    {
        "docker_image": dict(
            image_tags=[
                "{pants.hash}",
                env("BRANCH_NAME"),
                env("COMMIT_SHA"),
                env("TIMESTAMP"),  # e.g., "main-20240116-123456"
            ],
        ),
    }
)
base/BUILD
docker_image(name="image", source="Dockerfile")
app/BUILD
docker_image(name="image", source="Dockerfile")
app/Dockerfile
FROM base:image
# ... rest of dockerfile
What Happens
- When building app:image, Pants builds the upstream base:image first
- The base:image is packaged as a BuiltPackage which includes metadata
containing all the image tags
- This metadata (including the timestamp) becomes part of the package
digest
- The package digest is included in the build context snapshot for
app:image
- Since the timestamp changes on every build, the hash for app:image
changes even if nothing else has changed
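The chain above can be simulated in a few lines of Python. This is a sketch under stated assumptions, not the Pants code path: the helper names (digest, upstream_package_digest, downstream_context_hash), the metadata layout, and the placeholder image ID are hypothetical. It models an upstream package whose metadata embeds its tags, and a downstream build context whose snapshot includes that package digest.

```python
import hashlib
import json


def digest(obj) -> str:
    """Deterministic SHA-256 over a JSON-serializable object (illustrative only)."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def upstream_package_digest(image_id: str, tags: list) -> str:
    # Hypothetical stand-in for the BuiltPackage metadata: the tags are part of it.
    return digest({"image_id": image_id, "tags": tags})


def downstream_context_hash(dockerfile: str, upstream_digest: str) -> str:
    # The upstream package digest lands in the downstream build context snapshot.
    snapshot_digest = digest({"Dockerfile": dockerfile, "upstream": upstream_digest})
    return digest({"build_args": [], "build_env": {}, "snapshot_digest": snapshot_digest})


dockerfile = "FROM base:image\n"
image_id = "sha256:abc"  # placeholder: the base image content is the same in both runs

first = downstream_context_hash(
    dockerfile, upstream_package_digest(image_id, ["main-20240116-123456"])
)
second = downstream_context_hash(
    dockerfile, upstream_package_digest(image_id, ["main-20240116-123500"])
)

print(first == second)  # False: only the timestamp tag differs
```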
Impact
- Broken caching: Images are rebuilt unnecessarily because the hash
changes on every run
- Circular dependency with {pants.hash}: The hash depends on tags, but
{pants.hash} tag depends on the hash
- CI/CD inefficiency: Builds take longer and use more resources due to
unnecessary rebuilds
- Inconsistent image tags: The {pants.hash} value changes even when the
actual image content hasn't changed
Reproduction Steps
- Create two Docker targets where one depends on the other
- Configure image_tags with dynamic values (e.g., using env() with
timestamps)
- Run pants package //app:image multiple times
- Observe that the hash changes each time (visible in the {pants.hash}
tag value)
Expected Behavior
The hash should remain stable when:
- No source files have changed
- No dependencies have changed
- No build configuration has changed
The image_tags field should not affect the hash calculation because
tags are metadata about how to publish the image, not part of the image
content itself.
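One way to picture the expected behavior is to drop the tags from the upstream metadata before it is digested. The Python sketch below is only an illustration of that idea, not a proposed patch to Pants: the function name content_only_digest and the metadata keys "image_id" and "tags" are hypothetical.

```python
import hashlib
import json


def content_only_digest(metadata: dict) -> str:
    """Digest the upstream metadata with publishing tags excluded (hypothetical helper)."""
    content_only = {key: value for key, value in metadata.items() if key != "tags"}
    return hashlib.sha256(
        json.dumps(content_only, sort_keys=True).encode("utf-8")
    ).hexdigest()


run_1 = content_only_digest({"image_id": "sha256:abc", "tags": ["main-20240116-123456"]})
run_2 = content_only_digest({"image_id": "sha256:abc", "tags": ["main-20240116-123500"]})
rebuilt = content_only_digest({"image_id": "sha256:def", "tags": ["main-20240116-123500"]})

print(run_1 == run_2)    # True: differing tags no longer affect the digest
print(run_1 == rebuilt)  # False: a real content change still does
```

With tags excluded, a downstream hash built on this digest would change only when the base image content changes, which matches the three stability conditions listed above.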
Actual Behavior
The hash changes on every build when upstream Docker images have
dynamic tags, even when nothing else has changed.
Environment
- Pants version: 2.26.1 (and likely affects earlier/later versions)
- Occurs when using Docker image dependencies with dynamic tags