Contributor

@DmitriyLewen DmitriyLewen commented Nov 6, 2024

Description

There are cases where a report contains Packages with the same GAV (GroupID, ArtifactID, Version)
that are nevertheless different packages (see #7824 (comment)).

To avoid confusion and build the dependency graph correctly, we need to use a UUID as the ID of each Package from pom.xml files.

This solution also fixes the problem with relationships in SBOM formats for this case (see #7824 (comment)).
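
As a rough illustration of the problem (not code from this PR), the sketch below builds a dependency graph twice: once keyed by GAV and once keyed by a unique per-package ID. The `Package` type, the `id-1`/`id-2` values (stand-ins for UUIDs), the `org.example:example-api` dependency and all version numbers are hypothetical and exist only for this example.

```go
package main

import "fmt"

// Package is a hypothetical, simplified stand-in for a package parsed from pom.xml.
type Package struct {
	ID        string   // unique per parsed package (stand-in for a UUID)
	GAV       string   // "groupID:artifactID:version"
	DependsOn []string // GAVs of direct dependencies
}

// graphByGAV keys the graph by GAV, so packages sharing a GAV collide.
func graphByGAV(pkgs []Package) map[string][]string {
	g := make(map[string][]string)
	for _, p := range pkgs {
		g[p.GAV] = p.DependsOn // a later package overwrites an earlier one
	}
	return g
}

// graphByID keys the graph by the unique ID, so every package keeps its own edges.
func graphByID(pkgs []Package) map[string][]string {
	g := make(map[string][]string)
	for _, p := range pkgs {
		g[p.ID] = p.DependsOn
	}
	return g
}

// samplePackages returns two packages with the same GAV but different dependencies.
func samplePackages() []Package {
	return []Package{
		{ID: "id-1", GAV: "org.example:example-dependency:1.0.0", DependsOn: []string{"org.example:example-api:1.0.0"}},
		{ID: "id-2", GAV: "org.example:example-dependency:1.0.0", DependsOn: []string{"org.example:example-api:2.0.0"}},
	}
}

func main() {
	pkgs := samplePackages()
	fmt.Println(len(graphByGAV(pkgs))) // 1: the two packages collapsed into one node
	fmt.Println(len(graphByID(pkgs)))  // 2: both packages are kept
}
```

With GAV as the key, the second package silently replaces the first, which is the kind of confusion the description refers to.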

PR blocker - #7889

Related issues

Related PR

Checklist

  • I've read the guidelines for contributing to this repository.
  • I've followed the conventions in the PR title.
  • I've added tests that prove my fix is effective or that my feature works.
  • I've updated the documentation with the relevant information (if needed).
  • I've added usage information (if the PR introduces new options).
  • I've included a "before" and "after" example to the description (if the PR is a user interface change).

@DmitriyLewen DmitriyLewen self-assigned this Nov 6, 2024
@DmitriyLewen DmitriyLewen marked this pull request as ready for review December 2, 2024 09:42
@DmitriyLewen DmitriyLewen changed the title from "refactor: use UUID for Packages from pom.xml files." to "refactor: use UUID for Packages IDs from pom.xml files." Dec 3, 2024
@@ -148,7 +148,8 @@ func (p *Parser) parseRoot(root artifact, uniqModules map[string]struct{}) ([]ft
	if _, ok := uniqModules[art.String()]; ok {
		continue
	}
	uniqModules[art.String()] = struct{}{}
	art.ID = uuid.New()
Collaborator

I want the output reports to be as reproducible as possible. I'm considering another approach, but I have not yet come up with...

Contributor Author

Do you mean using something like a hash instead of a UUID?

Collaborator

That's one of the options. Regardless of the method, I want scanning the same object to produce the same value wherever possible.
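
A minimal sketch of the hash option, for illustration only: unlike a random UUID, an ID derived from a hash is identical on every scan of the same input. The helper name `hashID`, the choice of SHA-256, and the decision to hash the GAV together with the path of the importing module are assumptions of this example, not what the PR implements.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// hashID derives a deterministic ID from the given parts.
// Which parts to include is an assumption of this sketch.
func hashID(parts ...string) string {
	// Join with a NUL separator, which is not expected to occur in the parts,
	// so that ("ab", "c") and ("a", "bc") do not hash to the same value.
	sum := sha256.Sum256([]byte(strings.Join(parts, "\x00")))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Hypothetical inputs: the same GAV reached from two different modules.
	a := hashID("org.example:example-dependency:1.0.0", "module-a/pom.xml")
	b := hashID("org.example:example-dependency:1.0.0", "module-b/pom.xml")
	again := hashID("org.example:example-dependency:1.0.0", "module-a/pom.xml")

	fmt.Println(a == again) // true: same input, same ID on every run
	fmt.Println(a == b)     // false: same GAV, different context
}
```

The reproducibility comes entirely from the inputs, so anything that distinguishes two same-GAV packages has to be part of what is hashed.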

This PR is stale because it has been labeled with inactivity.

@github-actions github-actions bot added the lifecycle/stale label Feb 24, 2025
@DmitriyLewen DmitriyLewen removed the lifecycle/stale label Feb 24, 2025
		},
	},
	{
		name: "multi module with similar deps, but different children",
Contributor Author

@knqyf263 I updated the logic to use a hash instead of a UUID and found one issue.

I added this test case to demonstrate it:

Based on our logic (#7824 (comment)), we should use different IDs for all dependencies (because they can use different properties, dependencyManagement sections, etc.) - see org.example:example-dependency in this example.

But the cache key consists of the GAV only, which leads to artifacts being reused and wrong packages in the result.

I see two options:

  • abandon the cache
  • cache the pom.xml files before analysis instead of the analysis results (pom.xml instead of analysisResult). In this case we would still save the time spent reading pom.xml files, but I'm not sure that it will help much.
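
The cache issue described above can be sketched as follows. This is a simplified model, not the parser's real code: the `resolver` type, its `resolve` method, the `depVersion` parameter (standing in for a property inherited from the importing module) and the `org.example:example-api` dependency are all hypothetical.

```go
package main

import "fmt"

// analysisResult is a hypothetical, simplified stand-in for a cached analysis result.
type analysisResult struct {
	Deps []string
}

// resolver caches results by GAV only.
type resolver struct {
	cache map[string]analysisResult
}

// resolve pretends to analyze a pom.xml. depVersion models a property that
// depends on the importing module and changes the resolved dependencies.
func (r *resolver) resolve(gav, depVersion string) analysisResult {
	if res, ok := r.cache[gav]; ok {
		return res // cache hit: depVersion is ignored
	}
	res := analysisResult{Deps: []string{"org.example:example-api:" + depVersion}}
	r.cache[gav] = res
	return res
}

func main() {
	r := &resolver{cache: make(map[string]analysisResult)}
	first := r.resolve("org.example:example-dependency:1.0.0", "1.0.0")
	second := r.resolve("org.example:example-dependency:1.0.0", "2.0.0")

	fmt.Println(first.Deps[0])  // org.example:example-api:1.0.0
	fmt.Println(second.Deps[0]) // org.example:example-api:1.0.0 (reused from the cache)
}
```

In this model the second lookup should have produced version 2.0.0, but the GAV-only key hides the difference. Dropping the cache, or caching only the raw pom.xml content and re-running the analysis per context, would both avoid that.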

@DmitriyLewen DmitriyLewen changed the title from "refactor: use UUID for Packages IDs from pom.xml files." to "refactor: use UUID/hash for Packages IDs from pom.xml files." Mar 4, 2025

github-actions bot commented May 4, 2025

This PR is stale because it has been labeled with inactivity.

@github-actions github-actions bot added the lifecycle/stale label May 4, 2025
@DmitriyLewen DmitriyLewen removed the lifecycle/stale label May 5, 2025

github-actions bot commented Jul 5, 2025

This PR is stale because it has been labeled with inactivity.

@github-actions github-actions bot added the lifecycle/stale label Jul 5, 2025
@github-actions github-actions bot closed this Jul 25, 2025
@DmitriyLewen DmitriyLewen reopened this Jul 25, 2025
@github-actions github-actions bot removed the lifecycle/stale label Jul 26, 2025
Development

Successfully merging this pull request may close these issues.

bug(sbom): Duplicate SBOM packages for multi-module pom.xml files