Draft
35 changes: 35 additions & 0 deletions dissect/target/loader.py
@@ -1,5 +1,6 @@
from __future__ import annotations

+import argparse
import os
import urllib.parse
from pathlib import Path
@@ -82,6 +83,26 @@ def __init__(
dict(urllib.parse.parse_qsl(parsed_path.query, keep_blank_values=True)) if parsed_path else {}
)

+        self._parser = self._create_parser(self.__class__)
Contributor

I'm still not quite sure we want to create the parser in the __init__ of the loader. It feels (to me) that the argument parser should be created and handled before the loader is instantiated.

Contributor Author

@twiggler Jan 19, 2026

I think the previous objection was that showing the help message required instantiation of the loader, which is no longer the case?

From an architectural standpoint, argument parsing is typically done in a different (outermost) layer of a program. Is that the reason you want it done before the loader is instantiated?

I don't necessarily disagree with that, but plugin arguments are also processed pretty "late":

known_args, _ = parser.parse_known_args(args)

Moreover, originally the parsing of URI arguments was also done "late":
target.parsed_path = urllib.parse.urlparse(str(target.path))

TL;DR: I think the PoC is in line with what is already there, while I agree that it is architecturally unsound. If we change it, I propose moving the parsing of arguments to the outermost layer of the program for all cases.
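
As a minimal illustration of the "late" parsing pattern mentioned above: argparse's parse_known_args returns the arguments it does not recognise instead of failing, so several layers can each pick what they understand from the same argv. The two parsers and the -f/--function option below are made up for this sketch; only the --force-directory-fs flag comes from this PR.

```python
import argparse

# Hypothetical outer parser for the tool itself.
tool_parser = argparse.ArgumentParser(prog="tool", add_help=False)
tool_parser.add_argument("-f", "--function")

# Hypothetical loader parser, consulted later on the leftovers.
loader_parser = argparse.ArgumentParser(prog="loader", add_help=False)
loader_parser.add_argument("--force-directory-fs", action="store_true")

argv = ["-f", "hostname", "--force-directory-fs", "local"]

# Each layer consumes what it knows and hands the rest on.
tool_args, rest = tool_parser.parse_known_args(argv)
loader_args, leftover = loader_parser.parse_known_args(rest)
```

The cost of this pattern is that a typo in a flag is silently passed along rather than rejected, which is one argument for parsing everything in the outermost layer.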

Contributor Author

I am moving the argument parsing out of the loader. One of the challenges is that to parse the loader arguments earlier, I need to know which loader will be used. However, the code that selects a loader is pretty deep in Target:

adjusted_path, parsed_path = parse_path_uri(spec)
# We always need a path to work with, so convert the spec into one if it's not one already
path = Path(spec) if not isinstance(spec, os.PathLike) else spec
if parsed_path is not None and (loader_cls := loader.find_loader_by_scheme(parsed_path.scheme)):
    # If we find a loader by URI scheme, use the adjusted path (path component of the URI)
    found_path = adjusted_path
elif loader_cls := loader.find_loader(path, fallbacks=[loader.DirLoader, loader.RawLoader]):
    # Otherwise try to find a loader for the "raw" path
    # If we succeed, upgrade the "spec" to the path
    spec = path
    found_path = path
    parsed_path = None
else:
    # Couldn't find a loader
    return

We want to ensure that we are using the same loader when parsing arguments and when opening the target. So this code needs to be refactored.

I see two options; there might be others.

  1. Create a factory class for building targets. Given a uri_string, this class would select the correct loader class, parse the loader arguments, instantiate the loader, and inject it into the target.

  2. Extract the loader detection logic into a function detect_loader and let the outer layer do the orchestration. Target no longer needs to guess the loader; instead, the loader is injected into the target.
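
A rough sketch of what option 2 could look like, using toy classes rather than the real dissect.target ones. Everything here is an assumption made for illustration: the Loader stand-ins, the LOADERS list, the scheme attribute, and the open_with_args helper are hypothetical; only the name detect_loader and the __args__ convention are taken from this thread and diff.

```python
from __future__ import annotations

import argparse
import urllib.parse
from pathlib import Path


class Loader:
    """Toy stand-in for a loader base class (not the dissect.target one)."""

    scheme = ""
    __args__: list[tuple[tuple, dict]] = []

    def __init__(self, path: Path, **options):
        self.path = path
        self.options = options

    @staticmethod
    def detect(path: Path) -> bool:
        return False


class LocalLoader(Loader):
    scheme = "local"
    __args__ = [(("--force-directory-fs",), {"action": "store_true"})]

    @staticmethod
    def detect(path: Path) -> bool:
        return str(path) == "local"


class RawLoader(Loader):
    """Fallback that accepts any path."""

    @staticmethod
    def detect(path: Path) -> bool:
        return True


LOADERS = [LocalLoader, RawLoader]


def detect_loader(spec: str) -> tuple[type[Loader], Path]:
    """Pick a loader class once: by URI scheme first, by path detection second."""
    parsed = urllib.parse.urlparse(spec)
    for loader_cls in LOADERS:
        if parsed.scheme and loader_cls.scheme == parsed.scheme:
            return loader_cls, Path(parsed.path)
    path = Path(spec)
    for loader_cls in LOADERS:
        if loader_cls.detect(path):
            return loader_cls, path
    raise ValueError(f"no loader for {spec!r}")


def open_with_args(spec: str, argv: list[str]) -> Loader:
    """Outer layer: detect, parse that loader's arguments, then instantiate."""
    loader_cls, path = detect_loader(spec)
    parser = argparse.ArgumentParser(add_help=False)
    for args, kwargs in loader_cls.__args__:
        parser.add_argument(*args, **kwargs)
    options, _ = parser.parse_known_args(argv)
    return loader_cls(path, **vars(options))
```

Because detection happens exactly once, the loader class used for argument parsing is by construction the same one that is instantiated, which is the invariant described above.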


+    @classmethod
+    def print_help(cls) -> None:
+        """Prints the help message for this loader's specific arguments."""
+        cls._create_parser(cls).print_help()
+
+    @staticmethod
+    def _create_parser(cls: type[Loader]) -> argparse.ArgumentParser:
+        """Creates the argument parser for this loader."""
+        # Do like generate_argparse_for_method in cli.py
+        parser = argparse.ArgumentParser(
+            prog=f"loader:{cls.__name__.lower()}",
Contributor

It might be better to keep it in line with how the help is done for functions.

E.g.:

$ target-query -f activitiescache -h
usage: target-query -f activitiescache [-h]

Using -h for the loader currently gives no information on how to use it with the tool:

$ target-query -L local -h
usage: loader:localloader [--force-directory-fs] [--fallback-to-directory-fs]

which is a bit odd from a user's perspective.

Contributor Author

I added a comment; IMO this can be deferred until we implement it properly.
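
One possible way to address the usage line, assuming the only change needed is the prog passed to argparse: build prog from the tool name and the -L option so the output mirrors the function help. The helper name create_loader_parser and its parameters are hypothetical and not part of this PR.

```python
import argparse


def create_loader_parser(
    tool: str, scheme: str, loader_args: list[tuple[tuple, dict]]
) -> argparse.ArgumentParser:
    """Build a loader parser whose usage line shows how to invoke the tool."""
    parser = argparse.ArgumentParser(prog=f"{tool} -L {scheme}", add_help=False)
    for args, kwargs in loader_args:
        parser.add_argument(*args, **kwargs)
    return parser


parser = create_loader_parser(
    "target-query",
    "local",
    [(("--force-directory-fs",), {"action": "store_true"})],
)
usage = parser.format_usage()
```

The tool name has to be supplied by the caller here, since a loader does not know which target-* tool invoked it.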

+            description=f"Options for the '{cls.__name__}' loader.",
+            add_help=False,
+        )
+        for args, kwargs in getattr(cls, "__args__", []):
+            parser.add_argument(*args, **kwargs)
+        return parser
+
     def __repr__(self) -> str:
         return f"{self.__class__.__name__}({str(self.path)!r})"

@@ -115,6 +136,20 @@ def find_all(path: Path, parsed_path: urllib.parse.ParseResult | None = None) ->
yield path

     def map(self, target: Target) -> None:
+        """Wrapper around the _map function that handles argument passing."""
+        # The loader parses its own arguments from argv
+        loader_options, _ = self._parser.parse_known_args()
+
+        # The query string can act as a default for any arguments not provided on the command line
+        for key, value in self.parsed_query.items():
+            # argparse sets missing optional arguments to None. We only want to override if it's None.
+            if getattr(loader_options, key, None) is None:
+                setattr(loader_options, key, value)
+
+        # Pass the parsed options directly to the implementation map function
+        return self._map(target, **vars(loader_options))
+
+    def _map(self, target: Target, **kwargs) -> None:
         """Maps the loaded path into a ``Target``.

         Args:
16 changes: 10 additions & 6 deletions dissect/target/loaders/local.py
@@ -15,6 +15,7 @@
from dissect.target.exceptions import LoaderError
from dissect.target.filesystems.dir import DirectoryFilesystem
from dissect.target.loader import Loader
+from dissect.target.plugin import arg

if TYPE_CHECKING:
from collections.abc import Iterator
@@ -38,6 +39,12 @@
WINDOWS_DRIVE_FIXED = 3


+@arg("--force-directory-fs", action="store_true", help="Force the use of DirectoryFilesystem on all drives.")
+@arg(
+    "--fallback-to-directory-fs",
+    action="store_true",
+    help="Fallback to DirectoryFilesystem if a filesystem cannot be opened.",
+)
 class LocalLoader(Loader):
     """Load local filesystem."""

@@ -49,21 +56,18 @@ def __init__(self, path: Path, **kwargs):
     def detect(path: Path) -> bool:
         return urllib.parse.urlparse(str(path)).path == "local"

-    def map(self, target: Target) -> None:
+    def _map(self, target: Target, force_directory_fs: bool = False, fallback_to_directory_fs: bool = False) -> None:
         # For the local loader we abuse the path/URI parsing a bit, so fix it up here
         target.parsed_path = urllib.parse.urlparse(str(target.path))
         target.path_query = self.parsed_query
         target.path = Path("local")

         os_name = _get_os_name()

-        force_dirfs = "force-directory-fs" in self.parsed_query
-        fallback_to_dirfs = "fallback-to-directory-fs" in self.parsed_query
Comment on lines -60 to -61
Contributor

This would force the new command-line argument approach, and acquire would stop functioning properly because it uses these query parameters: https://github.com/fox-it/acquire/blob/c109fe4f5e642078931fc8c3d80ab7f39e1c496f/acquire/acquire.py#L2433

Contributor Author

I think we can keep both approaches if we combine the force_directory_fs argument with the one from self.parsed_query.

However, I also think it is a bit weird to apply "stringified" arguments when using target from acquire. Perhaps we can get rid of this URI scheme altogether. The question then becomes how to pass arguments to the loader programmatically.
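
A minimal sketch of how the two sources could be combined, written against plain argparse rather than the real loader classes. Two details matter for the merge: store_true flags default to False rather than None, and the query keys are hyphenated (force-directory-fs) while argparse destinations use underscores (force_directory_fs). The helper name merge_loader_options is hypothetical.

```python
import argparse
import urllib.parse


def merge_loader_options(parser: argparse.ArgumentParser, argv: list[str], query: str) -> dict:
    """Combine command-line options with URI query parameters (hypothetical helper)."""
    options, _ = parser.parse_known_args(argv)
    parsed_query = dict(urllib.parse.parse_qsl(query, keep_blank_values=True))

    merged = dict(vars(options))
    for dest, value in vars(options).items():
        key = dest.replace("_", "-")  # query keys are hyphenated, argparse dests are not
        if key not in parsed_query:
            continue
        if isinstance(value, bool):
            # Presence of a flag in the query string switches it on.
            merged[dest] = True
        elif value is None:
            # Only fill in values that the command line left unset.
            merged[dest] = parsed_query[key]
    return merged


parser = argparse.ArgumentParser(add_help=False)
parser.add_argument("--force-directory-fs", action="store_true")
parser.add_argument("--fallback-to-directory-fs", action="store_true")

from_query = merge_loader_options(parser, [], "force-directory-fs")
from_cli = merge_loader_options(parser, ["--fallback-to-directory-fs"], "")
```

Query keys that do not correspond to a declared argument are ignored in this sketch, so the result can be passed as keyword arguments without introducing unexpected names.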


         if os_name == "windows":
-            map_windows_mounted_drives(target, force_dirfs=force_dirfs, fallback_to_dirfs=fallback_to_dirfs)
+            map_windows_mounted_drives(target, force_dirfs=force_directory_fs, fallback_to_dirfs=fallback_to_directory_fs)
         else:
-            if fallback_to_dirfs or force_dirfs:
+            if fallback_to_directory_fs or force_directory_fs:
                 # Where Windows does some sophisticated fallback, for other
                 # operating systems we don't know anything yet about the
                 # relation between disks and mount points.
13 changes: 12 additions & 1 deletion dissect/target/tools/query.py
@@ -18,6 +18,7 @@
)
from dissect.target.helpers import cache, record_modifier
from dissect.target.helpers.logging import get_logger
+from dissect.target.loader import LOADERS_BY_SCHEME
from dissect.target.plugin import (
PLUGINS,
FunctionDescriptor,
@@ -105,7 +106,16 @@ def main() -> int:

# Show help for target-query
     if not args.function and ("-h" in rest or "--help" in rest):
-        parser.print_help()
+        if not args.loader:
+            parser.print_help()
+            return 0
+
+        if (loader_cls := LOADERS_BY_SCHEME.get(args.loader, None)) is None:
+            print(f"Error: Loader '{args.loader}' not found.")
+            return 1
Comment on lines +109 to +115
Contributor

Moving this check to, e.g., the process_generic_arguments function in tools/utils/cli.py would add this behaviour to all target-* tooling instead of just target-query.

Contributor Author

We can do this in the proper implementation


+        # The loader now knows how to print its own help.
+        loader_cls.print_help()
         return 0

process_generic_arguments(parser, args)
@@ -126,6 +136,7 @@ def main() -> int:
)

# Process plugin arguments after host and child args are checked
+    # This is for plugins, not loaders. We pass `rest` to let it find plugin args.
different_output_types = process_plugin_arguments(parser, args, rest)

if not args.targets: