Add Acquire collection loader #935

Open
wants to merge 13 commits into
base: main

Matthijsy
Contributor

This PR adds a loader for ZIP files, which is mainly useful for ZIP based acquire collects.

I don't have experience with creating loaders and mainly took the logic from the TarLoader, so if anything can be done better, I would be happy to improve it!

fixes #934

@Horofic Horofic self-assigned this Nov 6, 2024
Contributor

Horofic commented Nov 7, 2024

First of all, thanks for your contribution @Matthijsy. I think this is a good first attempt :)!

As you kind of pointed out, the ZIPloader mainly took the logic from the TarLoader with some minor tweaks. I think a better way forward would be to create an AcquireLoader that works on Zip, Tar, and Dir based Acquire collections! Kind of like the VelociraptorLoader found here, which does some specific checks to see if a collection is a Velociraptor collection.

Let me know what you think about this!

@@ -0,0 +1,97 @@
import logging
Contributor

Could you add tests for this new functionality? This could be done later, if you decide to go on the AcquireLoader route.

@Matthijsy
Contributor Author

@Horofic Thank you for the feedback, that indeed sounds like a good plan. I have now changed it to a dedicated AcquireLoader. This way it seems to work for both the zip and directory types. However, the directory mapping for the tar version does not work yet. When I test this using target-shell, the root directories look like this:

c:
fs0
fs1

I suspect that find_and_map_dirs does not handle the tar format correctly, but I cannot figure out how to add support for that. Could you give me some pointers on how to do that?

Furthermore, I suspect this will not work anymore for the legacy acquire format as described in the TarLoader logic. Is this still something that should be supported?

Horofic changed the title from "Add ZIP loader" to "Add Acquire collection loader" on Nov 19, 2024
Contributor

Horofic commented Dec 4, 2024

Posting this comment for documentation purposes. @Matthijsy and I had a discussion outside of this PR regarding possible approaches for this AcquireLoader.

@Matthijsy one possible approach we haven’t gone over is described and implemented in PR #700 by @Zawadidone. I believe some changes have to be made to the current TarFilesystem implementation to be able to go this route.

@Matthijsy
Contributor Author

I have now made the find_and_map_dirs function work for TarFilesystem as well, and it works! I tested the legacy Acquire format, and this function already handles it. Furthermore, I believe that running it against directories containing Acquire collects is already covered by the DirLoader.

@Horofic If you have time, could you have a look? It would be great to see whether this now covers all the different cases.

@Matthijsy
Contributor Author

I have also added (or mainly re-used) a few test cases now. Only test_tar_sensitive_drive_letter and test_tar_anonymous_filesystems don't work anymore. I don't understand the cases they try to test. Are they related to Acquire or to the TarLoader?

Contributor

@Horofic Horofic left a comment

Took an initial look. Some of the uncommented tests still seem to fail, could you fix those? Additionally, the tests that are commented out should preferably stay enabled. I provided some context per test, so adding those cases to the AcquireLoader logic might be easier. I'll take another look later this week.

Again, if you have any questions, please shoot!

dissect/target/loaders/acquire.py (outdated)
if not root:
    return False

return root.joinpath(FILESYSTEMS_ROOT).exists() or root.joinpath(FILESYSTEMS_LEGACY_ROOT).exists()
Member

This will be very slow on large tar files, where there will be a double performance penalty. First the entire tar file will be parsed just for this check, then if it exists, it will be parsed again in __init__. However, if it doesn't exist, it will still be parsed again in the actual tar loader.

Contributor Author

Yes, I agree, but is there a way to prevent that? We do need to check whether the fs directory exists in the file, since otherwise the tar is not an Acquire collect and thus needs to be handled by the TarLoader.

Member

"Ideally" (not really, since it's only for performance and the code path itself will be more confusing) it's a subpath in the zip and tar loaders. So you piggyback on the detection logic of those, and upon mapping you do a fast check to see if you need to divert to the Acquire logic (which could exist in loaders/acquire.py).

At least, that's the "best" idea I can come up with right now.
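
To illustrate the idea in the comment above, here is a minimal sketch that uses only Python's standard `tarfile` module rather than dissect's own TarLoader or TarFilesystem. It walks the archive once, collects the members, and notes along the way whether any path sits under an Acquire root, so detection does not need a second pass. The names `looks_like_acquire` and `scan_tar` are hypothetical, `"fs"` is taken from the discussion above, and using `"sysvol"` as the legacy root is an assumption.

```python
import io
import tarfile
from pathlib import PurePosixPath

# "fs" is the directory named in this thread; "sysvol" as the legacy
# root is an assumption made for this sketch.
FILESYSTEMS_ROOT = "fs"
FILESYSTEMS_LEGACY_ROOT = "sysvol"


def looks_like_acquire(member_name: str) -> bool:
    """Return True if a tar member path lives under an assumed Acquire root."""
    # PurePosixPath collapses "./" prefixes; drop a leading "/" if present.
    parts = [p for p in PurePosixPath(member_name).parts if p != "/"]
    return bool(parts) and parts[0] in (FILESYSTEMS_ROOT, FILESYSTEMS_LEGACY_ROOT)


def scan_tar(fh) -> tuple[list[tarfile.TarInfo], bool]:
    """Walk the archive once, returning its members and an Acquire flag."""
    members = []
    is_acquire = False
    with tarfile.open(fileobj=fh, mode="r:*") as tar:
        for member in tar:
            members.append(member)
            # The check is a cheap string test on a name we already have.
            if not is_acquire and looks_like_acquire(member.name):
                is_acquire = True
    return members, is_acquire
```

A loader built this way could map the collected members as a regular tar when the flag is false and hand them to the Acquire-specific mapping when it is true, at the cost of the more tangled code path mentioned above.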

Contributor

It seems this file has not been added to Git LFS. Could you add it, please?

Comment on lines +142 to +145
if p.name == "sysvol":
    dirs.append(('c', p))
else:
    dirs.append((p.name[0], p))
Contributor

This changes what the function returns. Not bad necessarily, but it might break some other loaders that use this function. So those would need to be changed.
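
The snippet under review derives a drive letter for each directory and returns it together with the path. Below is a minimal sketch of that mapping and of the kind of adjustment a caller expecting plain paths would need, assuming directory names such as `sysvol` or `d:`. The names `drive_letter_for`, `map_dirs`, and `paths_only` are hypothetical and do not reflect the actual `find_and_map_dirs` signature.

```python
from pathlib import PurePosixPath


def drive_letter_for(path: PurePosixPath) -> str:
    """Mirror the reviewed logic: sysvol maps to 'c', otherwise the first character."""
    if path.name == "sysvol":
        return "c"
    # Assumes a non-empty directory name such as "d:" or "e".
    return path.name[0]


def map_dirs(paths):
    """Return (drive_letter, path) tuples, the new return shape in the diff."""
    return [(drive_letter_for(p), p) for p in paths]


def paths_only(mapped):
    """Adapter for callers that still expect a plain list of paths."""
    return [p for _, p in mapped]
```

Any loader that iterates over the old return value as a list of paths would either need to unpack the tuples or go through an adapter like `paths_only`.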

Successfully merging this pull request may close these issues.

Cannot process ZIP based acquire collects
3 participants