-
Notifications
You must be signed in to change notification settings - Fork 75
[for discussion, POC]: make loader arguments discoverable #1419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
029742d
40fb8be
95dba81
ea03372
b36ba54
beb1b9a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,6 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import argparse | ||
| import os | ||
| import urllib.parse | ||
| from pathlib import Path | ||
|
|
@@ -82,6 +83,26 @@ def __init__( | |
| dict(urllib.parse.parse_qsl(parsed_path.query, keep_blank_values=True)) if parsed_path else {} | ||
| ) | ||
|
|
||
| self._parser = self._create_parser(self.__class__) | ||
|
|
||
| @classmethod | ||
| def print_help(cls) -> None: | ||
| """Prints the help message for this loader's specific arguments.""" | ||
| cls._create_parser(cls).print_help() | ||
|
|
||
| @staticmethod | ||
| def _create_parser(cls: type[Loader]) -> argparse.ArgumentParser: | ||
| """Creates the argument parser for this loader.""" | ||
| # Do like generate_argparse_for_method in cli.py | ||
| parser = argparse.ArgumentParser( | ||
| prog=f"loader:{cls.__name__.lower()}", | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It might be better to keep it in line with how the help is done for functions E.g, like: as using which is a bit odd for a user perspective
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added comment, can be deferred when we implement it IMO |
||
| description=f"Options for the '{cls.__name__}' loader.", | ||
| add_help=False, | ||
| ) | ||
| for args, kwargs in getattr(cls, "__args__", []): | ||
| parser.add_argument(*args, **kwargs) | ||
| return parser | ||
|
|
||
| def __repr__(self) -> str: | ||
| return f"{self.__class__.__name__}({str(self.path)!r})" | ||
|
|
||
|
|
@@ -115,6 +136,20 @@ def find_all(path: Path, parsed_path: urllib.parse.ParseResult | None = None) -> | |
| yield path | ||
|
|
||
| def map(self, target: Target) -> None: | ||
| """Wrapper around the _map function that handles argument passing.""" | ||
| # The loader parses its own arguments from argv | ||
| loader_options, _ = self._parser.parse_known_args() | ||
|
|
||
| # The query string can act as a default for any arguments not provided on the command line | ||
| for key, value in self.parsed_query.items(): | ||
| # argparse sets missing optional arguments to None. We only want to override if it's None. | ||
| if getattr(loader_options, key, None) is None: | ||
| setattr(loader_options, key, value) | ||
|
|
||
| # Pass the parsed options directly to the implementation map function | ||
| return self._map(target, **vars(loader_options)) | ||
|
|
||
| def _map(self, target: Target, **kwargs) -> None: | ||
| """Maps the loaded path into a ``Target``. | ||
|
|
||
| Args: | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,6 +15,7 @@ | |
| from dissect.target.exceptions import LoaderError | ||
| from dissect.target.filesystems.dir import DirectoryFilesystem | ||
| from dissect.target.loader import Loader | ||
| from dissect.target.plugin import arg | ||
|
|
||
| if TYPE_CHECKING: | ||
| from collections.abc import Iterator | ||
|
|
@@ -38,6 +39,12 @@ | |
| WINDOWS_DRIVE_FIXED = 3 | ||
|
|
||
|
|
||
| @arg("--force-directory-fs", action="store_true", help="Force the use of DirectoryFilesystem on all drives.") | ||
| @arg( | ||
| "--fallback-to-directory-fs", | ||
| action="store_true", | ||
| help="Fallback to DirectoryFilesystem if a filesystem cannot be opened.", | ||
| ) | ||
| class LocalLoader(Loader): | ||
| """Load local filesystem.""" | ||
|
|
||
|
|
@@ -49,21 +56,18 @@ def __init__(self, path: Path, **kwargs): | |
| def detect(path: Path) -> bool: | ||
| return urllib.parse.urlparse(str(path)).path == "local" | ||
|
|
||
| def map(self, target: Target) -> None: | ||
| def _map(self, target: Target, force_directory_fs: bool = False, fallback_to_directory_fs: bool = False) -> None: | ||
| # For the local loader we abuse the path/URI parsing a bit, so fix it up here | ||
| target.parsed_path = urllib.parse.urlparse(str(target.path)) | ||
| target.path_query = self.parsed_query | ||
| target.path = Path("local") | ||
|
|
||
| os_name = _get_os_name() | ||
|
|
||
| force_dirfs = "force-directory-fs" in self.parsed_query | ||
| fallback_to_dirfs = "fallback-to-directory-fs" in self.parsed_query | ||
|
Comment on lines
-60
to
-61
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this would force this new method of doing these command line arguments where acquire will stop functioning properly as it uses these query parameters. https://github.com/fox-it/acquire/blob/c109fe4f5e642078931fc8c3d80ab7f39e1c496f/acquire/acquire.py#L2433
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we can keep both approaches if we combine the However, I also think it is a bit weird to apply "stringified" arguments when using target from acquire. Perhaps we can get rid of this URI scheme altogether. The question then becomes how to pass arguments to the loader programmatically. |
||
|
|
||
| if os_name == "windows": | ||
| map_windows_mounted_drives(target, force_dirfs=force_dirfs, fallback_to_dirfs=fallback_to_dirfs) | ||
| map_windows_mounted_drives(target, force_dirfs=force_directory_fs, fallback_to_dirfs=fallback_to_directory_fs) | ||
| else: | ||
| if fallback_to_dirfs or force_dirfs: | ||
| if fallback_to_directory_fs or force_directory_fs: | ||
| # Where Windows does some sophisticated fallback, for other | ||
| # operating systems we don't know anything yet about the | ||
| # relation between disks and mount points. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -18,6 +18,7 @@ | |
| ) | ||
| from dissect.target.helpers import cache, record_modifier | ||
| from dissect.target.helpers.logging import get_logger | ||
| from dissect.target.loader import LOADERS_BY_SCHEME | ||
| from dissect.target.plugin import ( | ||
| PLUGINS, | ||
| FunctionDescriptor, | ||
|
|
@@ -105,7 +106,16 @@ def main() -> int: | |
|
|
||
| # Show help for target-query | ||
| if not args.function and ("-h" in rest or "--help" in rest): | ||
| parser.print_help() | ||
| if not args.loader: | ||
| parser.print_help() | ||
| return 0 | ||
|
|
||
| if (loader_cls := LOADERS_BY_SCHEME.get(args.loader, None)) is None: | ||
| print(f"Error: Loader '{args.loader}' not found.") | ||
| return 1 | ||
|
Comment on lines
+109
to
+115
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. moving this check to e.g.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can do this in the proper implementation |
||
|
|
||
| # The loader now knows how to print its own help. | ||
| loader_cls.print_help() | ||
| return 0 | ||
|
|
||
| process_generic_arguments(parser, args) | ||
|
|
@@ -126,6 +136,7 @@ def main() -> int: | |
| ) | ||
|
|
||
| # Process plugin arguments after host and child args are checked | ||
| # This is for plugins, not loaders. We pass `rest` to let it find plugin args. | ||
| different_output_types = process_plugin_arguments(parser, args, rest) | ||
Miauwkeru marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| if not args.targets: | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still not quite sure we want to create the parser in the
__init__of the loader. As it feels (to me) that the argument parser should be created / handled before the loader is initiated.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the previous objection was that showing the help message required instantiation of the loader, which is no longer the case?
From an architectural standpoint, parsing arguments typically is done in a different (outermost) layer of a program. Is that the reason you want it done before the loader is initiated?
I don't necessarily disagree with that, but plugin arguments are also processed pretty "late":
dissect.target/dissect/target/tools/utils/cli.py
Line 424 in 95dba81
Moreover, originally the parsing of URI arguments was also done "late":
[target.parsed_path = urllib.parse.urlparse(str(target.path))](
dissect.target/dissect/target/loaders/local.py
Line 54 in f3bf586
TLDR I think the poc is in line what is already there, while simultaneously agreeing it is architecturally unsound. If we change it, I propose we moving the parsing of arguments to the outermost layer of the program for all cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am moving the argument parsing out of the loader. One of the challenges is that to parse the loader arguments earlier, I need to know which loader will be used. However, the code that selects a loader is pretty deep in
Target:dissect.target/dissect/target/target.py
Lines 343 to 358 in beb1b9a
We want to ensure that we are using the same loader when parsing arguments and when opening the target. So this code needs to be refactored.
I see these two options, there might be others.
Create a factory class for building targets. This class would given a uri_string select the correct loader class, parse the loader arguments, instantiate the loader, and inject it into the target.
Extract the loader detection logic into a function
detect_loaderand let the outer layer do the orchestration.Targetno longer needs to guess the loader, but the loader is injected into the target.