Draft
2 changes: 2 additions & 0 deletions docs/source/api.rst
@@ -28,6 +28,8 @@ Package Downloader
.. automodule:: debsbom.download.download
:members:

.. _package-resolving-label:

Package Resolving
-----------------

1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -9,6 +9,7 @@ debsbom documentation
design-decisions
commands
examples
plugins
api

Indices and tables
48 changes: 48 additions & 0 deletions docs/source/plugins.rst
@@ -0,0 +1,48 @@
Plugins
=======

``debsbom`` provides plugin capability for select functionality.

Resolver Plugin
---------------

The ``download`` command downloads the packages described by an SBOM. To do so, ``debsbom`` has to resolve each package to a download location. The ``--resolver`` flag controls which resolver is used. By default, ``debsbom`` provides the ``debian-snapshot`` resolver for the Debian snapshot mirror (snapshot.debian.org).

Builders of custom Debian distributions might host their packages in different repositories. Some of these repositories might not be publicly available, or a resolver for them might not be relevant to the general public for some other reason. In these cases the resolver code should not land in ``debsbom`` proper, but it should still be usable as a fully integrated part of ``debsbom``.

A resolver plugin provides an additional choice for the ``--resolver`` option, which can be selected in the CLI once the plugin is loaded.

Implementing a Resolver Plugin
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Plugins are discovered through entry points. ``debsbom`` looks for entry points in the ``debsbom.download.resolver`` group. The name of the entry point is the name of the resolver, and it points to a setup function for that resolver. The signature of the setup function looks like this:

.. code-block:: python

    from requests import Session
    from debsbom.download.plugin import Resolver

    def setup_resolver(session: Session) -> Resolver:
        pass

The passed-in ``requests.Session`` is the one ``debsbom`` later uses to download the packages. A resolver is not required to use it, but consider reusing it instead of opening a new session.

The resolver itself needs to inherit from the ``Resolver`` class, which is documented in :ref:`package-resolving-label`. The important part is implementing the ``resolve`` function, which takes a package representation and returns a list of ``RemoteFile`` objects, the locations from which the files associated with the package can be downloaded. A minimal implementation could look like this:

.. code-block:: python

    from requests import Session
    from debsbom.download.plugin import Package, RemoteFile, Resolver, ResolveError

    class MyResolver(Resolver):

        def resolve(self, pkg: Package) -> list[RemoteFile]:
            try:
                # get_remotefile() is a placeholder for the repository-specific lookup
                my_remotefile = get_remotefile(pkg)
            except Exception as e:
                # chain the original exception so the cause stays visible
                raise ResolveError from e
            return [my_remotefile]

All functionality required for implementing a plugin is exposed in the ``debsbom.download.plugin`` module.
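
The following sketch shows how the setup function and the resolver could fit together. It uses stand-in classes so that it runs without ``debsbom`` installed; in a real plugin ``Package``, ``RemoteFile``, ``Resolver`` and ``ResolveError`` come from ``debsbom.download.plugin``, and their constructors and fields may differ from the assumptions made here. ``StaticIndexResolver``, the ``url`` field and the example URL are hypothetical.

```python
from dataclasses import dataclass, field


# Stand-ins so the sketch is self-contained. The real classes live in
# debsbom.download.plugin and may have different fields.
class ResolveError(Exception):
    pass


@dataclass
class Package:
    name: str
    version: str


@dataclass
class RemoteFile:
    filename: str
    url: str  # hypothetical field name
    checksums: dict = field(default_factory=dict)


class Resolver:
    def resolve(self, pkg):
        raise NotImplementedError


class StaticIndexResolver(Resolver):
    """Looks packages up in an in-memory index (hypothetical example)."""

    def __init__(self, index):
        self.index = index

    def resolve(self, pkg):
        try:
            url = self.index[(pkg.name, pkg.version)]
        except KeyError as e:
            # Lookup failures are reported as ResolveError.
            raise ResolveError(f"no location for {pkg.name} {pkg.version}") from e
        return [RemoteFile(filename=url.rsplit("/", 1)[-1], url=url)]


def setup_resolver(session):
    # The session is accepted to match the setup signature; this sketch
    # does not need network access and therefore ignores it.
    return StaticIndexResolver(
        {("hello", "2.10-3"): "https://repo.example.org/pool/hello_2.10-3_amd64.deb"}
    )


resolver = setup_resolver(session=None)
files = resolver.resolve(Package("hello", "2.10-3"))
print(files[0].filename)
```

Returning a list even for a single file keeps the return value consistent with the ``list[RemoteFile]`` annotation.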

A full example implementation can be found in the ``debsbom_plugin_examples`` repository, which is kept up to date for all releases: TODO
43 changes: 33 additions & 10 deletions src/debsbom/commands/download.py
@@ -2,7 +2,8 @@
#
# SPDX-License-Identifier: MIT

from importlib.metadata import version
from importlib.metadata import entry_points, version

from io import BytesIO
import logging
from pathlib import Path
@@ -19,16 +20,28 @@
import requests
from ..snapshot import client as sdlclient
from ..download.adapters import LocalFileAdapter
from ..download.download import PackageDownloader
from ..download.resolver import PersistentResolverCache, UpstreamResolver
from ..download.download import DownloadStatus, DownloadResult
from ..download.download import PackageDownloader, DownloadStatus, DownloadResult
from ..download.resolver import PackageResolverCache, PersistentResolverCache, ResolveError
except ModuleNotFoundError:
pass


logger = logging.getLogger(__name__)


def setup_snapshot_resolver(session):
sdl = sdlclient.SnapshotDataLake(session=session)
return sdlclient.UpstreamResolver(sdl)


RESOLVERS = {"debian-snapshot": setup_snapshot_resolver}

resolver_endpoints = entry_points(group="debsbom.download.resolver")
for ep in resolver_endpoints:
setup_fn = ep.load()
RESOLVERS[ep.name] = setup_fn


class DownloadCmd(SbomInput, PkgStreamInput):
"""
Processes a SBOM and downloads the referenced packages.
@@ -72,16 +85,19 @@ def _filter_pkg(
def run(cls, args):
outdir = Path(args.outdir)
outdir.mkdir(exist_ok=True)
cache = PersistentResolverCache(outdir / ".cache")
if cls.has_bomin(args):
resolver = cls.get_sbom_resolver(args)
else:
resolver = cls.get_pkgstream_resolver()
rs = requests.Session()
rs.mount("file:///", LocalFileAdapter())
rs.headers.update({"User-Agent": f"debsbom/{version('debsbom')}"})
sdl = sdlclient.SnapshotDataLake(session=rs)
u_resolver = UpstreamResolver(sdl, cache)
u_resolver = RESOLVERS[args.resolver](rs)
if type(u_resolver.cache) is PackageResolverCache:
cachedir = outdir / ".cache"
cachedir.mkdir(exist_ok=True)
cache = PersistentResolverCache(cachedir / args.resolver)
u_resolver.cache = cache
downloader = PackageDownloader(args.outdir, session=rs)

if args.skip_pkgs:
@@ -97,11 +113,12 @@ def run(cls, args):
if args.progress:
progress_cb(idx, len(pkgs), pkg.name)
try:
files = list(u_resolver.resolve(pkg))
files = list(u_resolver._resolve_pkg(pkg))
DownloadCmd._check_for_dsc(pkg, files)
downloader.register(files, pkg)
except sdlclient.NotFoundOnSnapshotError:
logger.warning(f"not found upstream: {pkg}")
except ResolveError:
pkg_type = "source" if pkg.is_source() else "binary"
logger.warning(f"failed to resolve {pkg_type} package: {pkg}")
if args.json:
print(
DownloadResult(
@@ -136,3 +153,9 @@ def setup_parser(cls, parser):
metavar="SKIP",
help="packages to exclude from the download, in package-list format",
)
parser.add_argument(
"--resolver",
choices=RESOLVERS.keys(),
default="debian-snapshot",
help="resolver to use to find upstream packages (default: %(default)s)",
)
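
As a standalone illustration of the ``--resolver`` option, the sketch below wires a registry of setup functions into ``argparse`` so that only registered names are accepted. The registry content is hypothetical (``my-mirror`` and the lambda setup functions are made up for the example); only the option name and the default come from this change.

```python
import argparse

# Hypothetical registry; in the change above it is filled from entry points.
RESOLVERS = {
    "debian-snapshot": lambda session: "snapshot resolver",
    "my-mirror": lambda session: "mirror resolver",
}

parser = argparse.ArgumentParser(prog="download")
parser.add_argument(
    "--resolver",
    choices=sorted(RESOLVERS),
    default="debian-snapshot",
    help="resolver to use to find upstream packages (default: %(default)s)",
)

# Selecting a plugin-provided resolver by name.
args = parser.parse_args(["--resolver", "my-mirror"])
resolver = RESOLVERS[args.resolver](None)
print(resolver)

# Without the flag, the default resolver name is used.
default_args = parser.parse_args([])
print(default_args.resolver)
```

Because ``choices`` is derived from the registry, a plugin only becomes selectable once it has been discovered, and an unknown name is rejected by the argument parser before any download starts.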
1 change: 0 additions & 1 deletion src/debsbom/download/__init__.py
@@ -5,6 +5,5 @@
from .resolver import (
PackageResolverCache,
PersistentResolverCache,
UpstreamResolver,
)
from .download import PackageDownloader
18 changes: 10 additions & 8 deletions src/debsbom/download/download.py
@@ -15,9 +15,9 @@
import os

from ..util.checksum import check_hash_from_path
from .resolver import RemoteFile
from ..dpkg import package
from ..dpkg.package import Package
from ..snapshot.client import RemoteFile

import requests

@@ -80,8 +80,8 @@ def __init__(
for p in [self.sources_dir, self.binaries_dir]:
p.mkdir(exist_ok=True)

def _target_path(self, f: RemoteFile):
if f.architecture == "source":
def _target_path(self, pkg: package.Package, f: RemoteFile):
if pkg.is_source():
return Path(self.sources_dir / f.archive_name / f.filename)
else:
return Path(self.binaries_dir / f.archive_name / f.filename)
@@ -94,10 +94,12 @@ def stat(self) -> StatisticsType:
"""
Returns a tuple (files to download, total size, cached files, cached bytes)
"""
unique_dl = list({frozenset(v.checksums.items()): v for _, v in self.to_download}.values())
nbytes = reduce(lambda acc, x: acc + x.size, unique_dl, 0)
cfiles = list(filter(lambda f: self._target_path(f).is_file(), unique_dl))
cbytes = reduce(lambda acc, x: acc + x.size, cfiles, 0)
unique_dl = list(
{frozenset(v.checksums.items()): (pkg, v) for pkg, v in self.to_download}.values()
)
nbytes = reduce(lambda acc, x: acc + (x[1].size or 0), unique_dl, 0)
cfiles = list(filter(lambda x: self._target_path(x[0], x[1]).is_file(), unique_dl))
cbytes = reduce(lambda acc, x: acc + (x[1].size or 0), cfiles, 0)
return StatisticsType(len(unique_dl), nbytes, len(cfiles), cbytes)

def download(self, progress_cb=None) -> Iterable[DownloadResult]:
@@ -110,7 +112,7 @@ def download(self, progress_cb=None) -> Iterable[DownloadResult]:
for idx, (pkg, f) in enumerate(self.to_download):
if progress_cb:
progress_cb(idx, len(self.to_download), f.filename)
target = self._target_path(f)
target = self._target_path(pkg, f)
if not target.parent.is_dir():
target.parent.mkdir()
hashable_file_checksums = frozenset(f.checksums.items())
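
The ``frozenset(f.checksums.items())`` key used in this file can be tried out in isolation. The sketch below deduplicates files by their checksums and sums their sizes while tolerating a missing size; ``FileInfo`` is a hypothetical stand-in that only carries the two attributes read here, and the sample data is made up.

```python
from collections import namedtuple
from functools import reduce

# Hypothetical stand-in with the two attributes the statistics code reads.
FileInfo = namedtuple("FileInfo", ["checksums", "size"])

to_download = [
    ("pkg-a", FileInfo({"sha256": "aa"}, 100)),
    ("pkg-b", FileInfo({"sha256": "aa"}, 100)),  # same file, other package
    ("pkg-c", FileInfo({"sha256": "bb"}, None)),  # size unknown
    ("pkg-d", FileInfo({"sha256": "cc"}, 50)),
]

# Dicts are unhashable, so a frozenset of their items serves as the key.
unique = list(
    {frozenset(f.checksums.items()): (pkg, f) for pkg, f in to_download}.values()
)

# The fallback is parenthesised on purpose: "acc + size if size else 0"
# parses as "(acc + size) if size else 0" and would reset the running
# total whenever a size is missing.
nbytes = reduce(lambda acc, x: acc + (x[1].size or 0), unique, 0)
print(len(unique), nbytes)
```

With this data the two entries sharing the ``aa`` checksum collapse into one, and the file without a size contributes nothing to the total.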
6 changes: 6 additions & 0 deletions src/debsbom/download/plugin.py
@@ -0,0 +1,6 @@
# Copyright (C) 2025 Siemens
#
# SPDX-License-Identifier: MIT

from .resolver import RemoteFile, ResolveError, Resolver
from ..dpkg.package import Package