Skip to content

ref(snapshots): Key manifest entries by relative path instead of hash#3241

Merged
NicoHinderling merged 9 commits intomasterfrom
nhi/path-keyed-snapshot-manifest
Mar 25, 2026
Merged

ref(snapshots): Key manifest entries by relative path instead of hash#3241
NicoHinderling merged 9 commits intomasterfrom
nhi/path-keyed-snapshot-manifest

Conversation

@NicoHinderling
Copy link
Contributor

@NicoHinderling NicoHinderling commented Mar 24, 2026

Switch the snapshot manifest dictionary key from content hash to image
file name (basename). This enables the backend to use the key directly
as the image identifier, removing the redundant image_file_name field
from manifest values.

Changes:

  • Manifest entries are now keyed by file name instead of content hash
  • image_file_name removed from ImageMetadata — derived from key
  • content_hash stored in image metadata extra for objectstore lookups
  • Collision detection skips duplicate file names before upload, with a
    warning listing all excluded paths
  • Path separators normalized via path_as_url for cross-platform consistency

Companion backend PR: getsentry/sentry#111467

Switch the snapshot manifest dictionary key from content hash to
relative file path. The content hash is preserved in the image metadata
extra field, enabling the backend to support both key formats during
the transition.

Co-Authored-By: Claude <[email protected]>
@NicoHinderling NicoHinderling marked this pull request as ready for review March 24, 2026 21:34
@NicoHinderling NicoHinderling requested review from a team as code owners March 24, 2026 21:34
NicoHinderling and others added 3 commits March 24, 2026 14:44
Use path_as_url to normalize backslashes to forward slashes on
Windows, ensuring manifest keys are platform-independent.

Co-Authored-By: Claude <[email protected]>
Fixes clippy::str_to_string lint.

Co-Authored-By: Claude <[email protected]>
Switch manifest key from relative path to file name (basename). When
multiple images share the same file name from different directories,
keep the first and warn with the full relative paths of all collisions
so the user can debug and resolve duplicates.

Co-Authored-By: Claude <[email protected]>
The manifest key is now the image file name, so storing it again inside
the value is redundant. Remove the field from ImageMetadata and let
the backend derive it from the key instead.

Co-Authored-By: Claude <[email protected]>
Move the collision check before the file open and upload queue so
colliding images are not uploaded unnecessarily.

Co-Authored-By: Claude <[email protected]>
Use manifest_entries.len() instead of the initial images count so
the success and error messages reflect skipped collisions.

Co-Authored-By: Claude <[email protected]>
The collision warning now lists all paths sharing a filename (kept and
excluded) on one line per group, making it clear which image was retained
and which were skipped.

Co-Authored-By: Claude <[email protected]>
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

.or_default()
.push(relative_path);
continue;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Uploading" count includes collision-skipped images

Low Severity

The new collision detection in upload_images() can skip images, making the actual upload count (manifest_entries.len()) smaller than images.len(). The "Uploaded" and error messages inside upload_images() were updated to use manifest_entries.len(), but the "Uploading N image files" message in execute() (line 118) still uses images.len(). When collisions occur, users see e.g. "Uploading 5 image files" followed by "Uploaded 3 image files" — a confusing mismatch that didn't exist before this PR since hash-based keying had no collisions.

Fix in Cursor Fix in Web

@NicoHinderling NicoHinderling merged commit 06cbbda into master Mar 25, 2026
26 checks passed
@NicoHinderling NicoHinderling deleted the nhi/path-keyed-snapshot-manifest branch March 25, 2026 17:38
NicoHinderling added a commit to getsentry/sentry that referenced this pull request Mar 25, 2026
…on (#111467)

Switch the snapshot manifest to use the dict key as the image file name,
removing the redundant `image_file_name` field from `ImageMetadata`.
All consumers now derive the image name from the manifest key instead
of reading it from the value.

Changes:
- Remove `image_file_name` from `ImageMetadata` model
- `categorize_image_diff` uses dict key as the image name
- `_build_base_images` and API endpoint derive file name from key
- `content_hash` field used for objectstore lookups (falls back to key
  for old hash-keyed manifests)
- Tests rewritten to use filename-keyed manifests

Companion CLI PR: getsentry/sentry-cli#3241

Co-authored-by: Claude Opus 4.6 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants