Description
Context
Similar to the example problem mentioned in #277, version streams can be the reason advisory data becomes unlinked from relevant distro packages in various contexts. #277 showed one example of this, where converting an existing distro package to a version stream results in a name change that breaks the link between distro package and advisory data. (And package names can change, and have changed, for other reasons unrelated to version streams.)
However, version streams are an interesting case — more generally than what's accounted for in #277 — because they typically imply we'll keep making more of them. For example, right now for the go "virtual package", we have the version streams go-1.19, go-1.20, and go-1.21. In the case of Go, new minor releases come out roughly every February and August, so in a few more months we'll have go-1.22, and so on into the future.
The separation of a distro package into various version streams is a Wolfi-ism — the upstream project typically doesn't reflect this same splitting of a package's identity. Moreover, it's common for vulnerability data and the analysis we encode in advisory data to relate more to the (virtual) package on the whole, rather than to one or two individual version streams.
In these cases, keeping up with new version streams can quickly become tedious and error-prone. We might have advisory data for existing version streams of a virtual package that applies to all future version streams as well. Today, we need to notice when new version streams are created and remember to copy the relevant data into new advisory documents. Doing so creates redundancy, and forgetting to do so creates gaps in the data that quickly affect downstream users and vulnerability scan results.
Proposal
We could add a new field to the advisory document schema indicating that the .package.name field refers to a virtual package, not a literal package name.
For example, in the case of Go, instead of:

    schema-version: "2"
    package:
      name: go-1.19
    advisories:
      - id: CVE-2020-29509
        # ...

...and

    schema-version: "2"
    package:
      name: go-1.20
    advisories:
      - id: CVE-2020-29509
        # ...

...and

    schema-version: "2"
    package:
      name: go-1.21
    advisories:
      - id: CVE-2020-29509
        # ...

... we could just have:

    schema-version: 2.0.1
    package:
      name: go
      virtual: true
    advisories:
      - id: CVE-2020-29509
        # ...

Then, this data, in combination with data taken from distro package definitions, could result in downstream artifacts where our concise advisory data representation gets "inflated" to instead describe all literal packages when needed. For example, in the Go example above, we could still produce a secdb with entries for each of go-1.19, go-1.20, and go-1.21.
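To make the "inflation" step concrete, here is a minimal sketch in Python. It is not how any existing tool works; it assumes the advisory document has already been parsed into a dict, that the list of distro package names comes from the package definitions, and that version streams follow a `<name>-<numeric version>` naming pattern. The function name `inflate` and the matching rule are hypothetical.

```python
import re


def inflate(doc, distro_packages):
    """Expand one advisory document into per-package advisory lists.

    Assumption: a virtual package `go` covers every distro package named
    `go-<digits[.digits...]>`; non-virtual documents map to themselves.
    """
    pkg = doc["package"]
    name = pkg["name"]
    if not pkg.get("virtual", False):
        return {name: list(doc["advisories"])}

    # Match only numeric version-stream suffixes, so that unrelated
    # packages such as "go-bindata" or "gopls" are not picked up.
    stream = re.compile(rf"^{re.escape(name)}-\d+(\.\d+)*$")
    matches = sorted(p for p in distro_packages if stream.match(p))
    return {p: list(doc["advisories"]) for p in matches}


# Hypothetical inputs for illustration.
virtual_doc = {
    "schema-version": "2.0.1",
    "package": {"name": "go", "virtual": True},
    "advisories": [{"id": "CVE-2020-29509"}],
}
distro = ["go-1.19", "go-1.20", "go-1.21", "gopls", "go-bindata"]

inflated = inflate(virtual_doc, distro)
for package_name, advisories in inflated.items():
    print(package_name, [a["id"] for a in advisories])
```

Because the expansion runs against the current set of distro packages, a newly added go-1.22 would be covered on the next run without anyone copying advisory data by hand.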
Alternatively, we could skip the new field and instead leverage the APK semantics of package search, where a .package.name value of go would link to any matching package that provides go=....
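The provides-based alternative could be sketched roughly as follows, again in Python. This assumes each distro package's `provides` entries are available as strings of the form `name` or `name=version`; the function name `packages_providing`, the input shape, and the sample version values are hypothetical, and real APK dependency resolution has more rules than this simple name comparison.

```python
def packages_providing(name, packages):
    """Return distro packages whose own name or `provides` entries match `name`.

    `packages` maps a package name to its list of provides strings
    (assumed format: "name" or "name=version").
    """
    matched = []
    for pkg_name, provides in packages.items():
        provided_names = {entry.partition("=")[0].strip() for entry in provides}
        if pkg_name == name or name in provided_names:
            matched.append(pkg_name)
    return sorted(matched)


# Hypothetical package definitions for illustration.
definitions = {
    "go-1.20": ["go=1.20.99"],
    "go-1.21": ["go=1.21.99"],
    "gopls": [],
}

linked = packages_providing("go", definitions)
print(linked)
```

With this approach the advisory document needs no schema change, but the link depends on every version stream declaring the expected `provides` entry.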