Changing pants.hash with dynamic tags #22575

@ademariag

Summary

Docker image builds have unstable hashes when the image depends on
another Docker image whose image_tags field contains dynamic values
(e.g., timestamps from environment variables). This causes unnecessary
rebuilds and breaks caching.

The Problem

When building Docker images in Pants, the build context hash is
calculated from (build_args, build_env, snapshot.digest). This hash is
used for:

  1. Cache keys to determine if an image needs rebuilding
  2. The {pants.hash} interpolation in image tags
  3. Docker build cache optimization
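
To make the role of the three inputs concrete, here is a minimal sketch of a content hash over (build_args, build_env, snapshot digest). It is not Pants' actual implementation: the function name `context_hash`, the JSON serialization, and the choice of SHA-256 are all assumptions made for illustration. The point is only that the result is stable when all inputs are identical and changes when any input changes.

```python
import hashlib
import json


def context_hash(build_args, build_env, snapshot_digest):
    """Hypothetical stand-in for the build context hash (not Pants' real code)."""
    payload = json.dumps(
        {
            "build_args": sorted(build_args),
            "build_env": dict(sorted(build_env.items())),
            "snapshot_digest": snapshot_digest,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Identical inputs give an identical hash.
same_a = context_hash(["VERSION=1"], {"HOME": "/root"}, "abc123")
same_b = context_hash(["VERSION=1"], {"HOME": "/root"}, "abc123")

# A different snapshot digest gives a different hash.
changed = context_hash(["VERSION=1"], {"HOME": "/root"}, "def456")
```

Anything that leaks into the snapshot digest therefore changes the cache key, the {pants.hash} tag value, and the Docker build cache behaviour all at once.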

The issue occurs in this scenario:

  1. You have a base Docker image (e.g., //base:image)
  2. You have another Docker image that depends on it via its Dockerfile
    (e.g., FROM base:image)
  3. The base image has image_tags that include dynamic values

Example configuration that triggers the problem:

BUILD (root defaults)

  __defaults__(
      {
          "docker_image": dict(
              image_tags=[
                  "{pants.hash}",
                  env("BRANCH_NAME"),
                  env("COMMIT_SHA"),
                  env("TIMESTAMP"),  # e.g., "main-20240116-123456"
              ],
          ),
      }
  )

base/BUILD

  docker_image(name="image", source="Dockerfile")

app/BUILD

  docker_image(name="image", source="Dockerfile")

app/Dockerfile

  FROM base:image
  # ... rest of dockerfile

What Happens

  1. When building app:image, Pants builds the upstream base:image first
  2. The base:image is packaged as a BuiltPackage which includes metadata
    containing all the image tags
  3. This metadata (including the timestamp) becomes part of the package
    digest
  4. The package digest is included in the build context snapshot for
    app:image
  5. Since the timestamp changes on every build, the hash for app:image
    changes even if nothing else has changed
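
The chain above can be simulated in a few lines. This is a sketch under stated assumptions, not Pants code: `upstream_package_digest` and `downstream_hash` are hypothetical helpers, and the package metadata is modelled as a plain dict containing the tag list.

```python
import hashlib
import json


def digest(obj):
    """Deterministic SHA-256 over a JSON-serializable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def upstream_package_digest(image_content, tags):
    # Assumption: the tag list is stored in metadata next to the built artifact,
    # so it contributes to the package digest.
    return digest({"content": image_content, "metadata": {"tags": tags}})


def downstream_hash(dockerfile, upstream_digest):
    # The upstream package digest is part of the downstream build context.
    return digest({"dockerfile": dockerfile, "upstream": upstream_digest})


dockerfile = "FROM base:image\n"

# Same base image content, but the timestamp tag differs between runs.
run_1 = downstream_hash(
    dockerfile, upstream_package_digest("layers-v1", ["main-20240116-123456"])
)
run_2 = downstream_hash(
    dockerfile, upstream_package_digest("layers-v1", ["main-20240116-123999"])
)
```

Here `run_1` and `run_2` differ even though the Dockerfile and the base image content are unchanged, which mirrors the behaviour described in this issue.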

Impact

  • Broken caching: Images are rebuilt unnecessarily because the hash
    changes on every run
  • Circular dependency with {pants.hash}: The hash depends on the tags,
    but the {pants.hash} tag itself depends on the hash
  • CI/CD inefficiency: Builds take longer and use more resources due to
    unnecessary rebuilds
  • Inconsistent image tags: The {pants.hash} value changes even when the
    actual image content hasn't changed

Reproduction Steps

  1. Create two Docker targets where one depends on the other
  2. Configure image_tags with dynamic values (e.g., using env() with
    timestamps)
  3. Run pants package //app:image multiple times
  4. Observe that the hash changes each time (visible in the {pants.hash}
    tag value)

Expected Behavior

The hash should remain stable when:

  • No source files have changed
  • No dependencies have changed
  • No build configuration has changed

The image_tags field should not affect the hash calculation because
tags are metadata about how to publish the image, not part of the image
content itself.
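
One way to picture the expected behaviour is a digest that deliberately ignores tags. The sketch below is illustrative only and does not propose a specific change to Pants: `stable_upstream_digest` is a hypothetical helper, and the image content is modelled as an opaque string.

```python
import hashlib
import json


def digest(obj):
    """Deterministic SHA-256 over a JSON-serializable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def stable_upstream_digest(image_content, tags):
    # Tags are accepted but intentionally left out of the hashed payload:
    # they describe how the image is published, not what it contains.
    return digest({"content": image_content})


tagged_monday = stable_upstream_digest("layers-v1", ["main-20240115-090000"])
tagged_tuesday = stable_upstream_digest("layers-v1", ["main-20240116-123456"])
new_content = stable_upstream_digest("layers-v2", ["main-20240116-123456"])
```

With this shape, `tagged_monday` and `tagged_tuesday` are equal because only the tags differ, while `new_content` differs because the image content changed.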

Actual Behavior

The hash changes on every build when upstream Docker images have
dynamic tags, even when nothing else has changed.

Environment

  • Pants version: 2.26.1 (likely also affects earlier and later versions)
  • Occurs when using Docker image dependencies with dynamic tags
