Skip to content

Conversation

adammcclenaghan
Copy link
Owner

@adammcclenaghan adammcclenaghan commented Feb 13, 2025

Description

In anchore#1809 both ifPackageNameApplies and ifUpstreamPackageNameApplies were updated to use regular expressions for all rule names/upstream names. This was recently updated in anchore#2320 too.

So far as I can see from the codebase only the linux kernel headers ignore rules require regular expression matching.

In my testing I've found that this is significantly slowing down the time FindMatches takes to complete. For one of my test scans with about ~60k packages, the scan is 2x faster with this change and there is ~8GB less cumulative allocations during the scan

Before

Screenshot 2025-02-13 at 11 16 29

After

Screenshot 2025-02-13 at 11 21 11

For now, I've modified ifPackageNameApplies and ifUpstreamPackageNameApplies to handle the linux-headers as special cases.

In future if we wish to support regular expressions more broadly perhaps there should be a nicer interface passed into the likes of ifPackageNameApplies and ifUpstreamPackageNameApplies rather than a string, and this interface could define whether or not to treat the match type as a regex etc...

I've had to make some changes to the unit tests.
I've removed two tests that don't seem to be testing behaviour that Grype currently doesn't have configuration for, namely:

  • there are no existing ignore rules which define kernel-headers.* so I've removed the test ignore on name regex
  • there are no existing ignore rules which define regular expresions that start and end with anchor tags, so I've removed the test ignore on name regex, line termination test match

I've also modified some of the tests to match the actual patterns set out in Grype, namely:
-- linux-.*-headers-.* is now declared linux(-.*)?-headers-.*
-- linux-.* is now defined as linux.* .. as a result of fixing this I've also updated the upstream name from linux to notalinux for the test case. The ignore rules defined today would actually ignore linux, the only reason the test was passing before is because it defined the regex as linux-.*

In lieue of a larger refactor to have a nicer interface for ifPackageNameApplies/ifUpstreamPackageNameApplies which I've previously mentioned, it would at least perhaps be nice to declare the linux(-.*)?-headers-.* and linux.* as a const in some shared place. If the maintainers can suggest somewhere that would be a good fit, I'd be happy to move it.

It looks like only the linux kernel headers ignore rules
make use of pattern matching.

Create and use regex patterns for these cases only to reduce
memory allocations from pattern.compile and improve FindMatches performance.

In future if we wish to support regular expressions more broadly perhaps
then perhaps ignore rules should have some way to define whether they
should be treated as a regular expression or not.

Signed-off-by: Adam McClenaghan <[email protected]>
Signed-off-by: Adam McClenaghan <[email protected]>
Comment on lines 209 to 230
pattern, err := packageNameRegex(name)
if err != nil {
return func(Match) bool { return false }
var (
pattern *regexp.Regexp
err error
)
if strings.HasPrefix(name, "linux") && strings.Contains(name, "-headers-") {
pattern, err = packageNameRegex(name)
}

return func(match Match) bool {
return pattern.MatchString(match.Package.Name)
if err != nil {
// Error compiling pattern
return false
}

if pattern != nil {
// Do regex for linux-headers
return pattern.MatchString(match.Package.Name)
}

// Exact string match for everything else
return name == match.Package.Name

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit but this feels cleaner if we just invert the strings match logic:

func ifPackageNameApplies(name string) ignoreCondition {
	if !(strings.HasPrefix(name, "linux") && strings.Contains(name, "-headers-")) {
		return func(match Match) bool {
			return name == match.Package.Name
		}
	}
	pattern, err := packageNameRegex(name)
	if err != nil {
		return func(Match) bool { return false }
	}
	return func(match Match) bool {
		return pattern.MatchString(match.Package.Name)
	}
}

pattern *regexp.Regexp
err error
)
if strings.HasPrefix(name, "linux") {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how come we dont care about -headers- here?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/anchore/grype/blob/main/cmd/grype/cli/commands/root.go#L106C1-L106C161

	{Package: match.IgnoreRulePackage{Name: "linux(-.*)?-headers-.*", [ Type: string(syftPkg.DebPkg)}, MatchType: match.ExactIndirectMatch},

Upstream cares about this part:

UpstreamName: "linux.*",

@@ -677,33 +677,6 @@ func TestApplyIgnoreRules(t *testing.T) {
},
},
},
{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reasoning for removing these are fine, but perhaps thats a separate PR?

@peter-mcconnell
Copy link

LGTM 👍

Signed-off-by: Adam McClenaghan <[email protected]>
@adammcclenaghan
Copy link
Owner Author

Opened against upstream:
anchore#2445

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants