Changes for BoefjeScheduler to support deduplication #4309
base: main
# Is the ooi in other organisations? When the ooi is in other
# organisations we need to create tasks for those organisations as well.
if boefje_task.input_ooi is not None:
    # FIXME: is the ooi shared between organisations, or is it a copy?
OOI information is not shared between organisations. However, since the OOIs are the same (by primary key), most of their information is exactly the same. Small differences might still exist in properties that are present on the OOI but are not used in the primary_key.
# FIXME: what if boefje is disabled in the other org? Are we sure
# that in that org the boefje is allowed to scan the ooi?
# FIXME: what status do we give it do we give it completed, we
I'd say a new state Deduplicated would fit the bill? We could then change it to 'complete' or 'error' once the actual running job completes/errors.
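The suggested state could be sketched roughly as follows. This is only an illustration of the idea in the comment above, not the scheduler's actual model: `TaskStatus`, `Task`, and `resolve_deduplicated` are hypothetical names, and the status values are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class TaskStatus(Enum):
    """Hypothetical status values; the scheduler's real enum may differ."""

    QUEUED = "queued"
    RUNNING = "running"
    DEDUPLICATED = "deduplicated"  # the new state proposed in this comment
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Task:
    """Hypothetical, stripped-down task record used only for this sketch."""

    id: str
    organisation: str
    status: TaskStatus


def resolve_deduplicated(primary: Task, duplicates: list[Task]) -> None:
    """Copy the terminal status of the task that actually ran onto its duplicates."""
    if primary.status not in (TaskStatus.COMPLETED, TaskStatus.FAILED):
        return  # the running job has not finished yet; leave duplicates untouched
    for task in duplicates:
        if task.status is TaskStatus.DEDUPLICATED:
            task.status = primary.status
```

Because a `DEDUPLICATED` task is never `QUEUED`, the task runner would not pick it up, which addresses the concern in the FIXME.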
# FIXME: what status do we give it do we give it completed, we
# can't make it queued since it shouldn't be picked up by
# the task runner.
# FIXME: how to these tasks get associated with the other raw files?
I think we discussed uploading the raw files for each job, and as such we'd create normalizer jobs in each subsequent organization (which we might deduplicate later on).
Uploaded raw files with the same content can transparently be deduplicated (by content hash) in Bytes on disk.
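A minimal sketch of what deduplication by content hash could look like. The source does not describe how Bytes actually stores files, so the function name `store_raw`, the use of SHA-256, and the flat directory layout are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def store_raw(directory: Path, content: bytes) -> Path:
    """Write content under its SHA-256 digest; identical content maps to one file.

    Hypothetical helper: not the Bytes API, only an illustration of the idea.
    """
    digest = hashlib.sha256(content).hexdigest()
    path = directory / digest
    if not path.exists():
        # First upload of this content: write it once.
        path.write_bytes(content)
    # Later uploads with the same bytes reuse the existing file.
    return path
```

With this layout, each organisation can keep its own raw file record pointing at the same path, so uploading per job costs no extra disk space for identical output.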
* main: (23 commits)
  Updated some packages (#4364)
  Update URL to docs in makefile (#4346)
  Fix/catch information source errors when filling/updating the rocky knowledge base (#4347)
  Translations update from Hosted Weblate (#4363)
  Styling changes to meet the design (#4263)
  Fix scheduled reports view showing reports for all organizations (#4351)
  remove unneeded task statistics for generic task showing pages (#4344)
  Translations update from Hosted Weblate (#4353)
  Fix weblate by merging all pending translations (#4348)
  Shows the current plugin state to users who cannot enable/disable plugins themselves. (#4326)
  Fix broken normaliser list view link in plugins.html (#4331)
  Fixes toc layout on the docs (#4341)
  Updated `django_compressor` (#4342)
  Update QA testplan to add multiple organizations (#4338)
  Ignore incorrect type assumption from mypy (#4337)
  Change OOI types for findings report (#4184)
  Findings dashboard for all organizations (#4007)
  Update kat_finding_types.json, add more in dept details (#4316)
  Add changes from #4312 (#4319)
  Python 3.10 compatibility for datetime parsing in report flow (#4302)
  ...
Warning
This is a pre-review and should not be merged
Changes
Before we push a task on the queue, we create additional tasks for organisations that have the same ooi present. We batch those tasks based on the environment settings hash so that the task runner can pick them up in one go and de-duplicate additional similar scans on the same ooi.
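The batching step described above might look something like the sketch below. It is not the code in this PR: the `BoefjeTask` fields other than `input_ooi`, the way `env_hash` is computed (SHA-256 over sorted JSON), and the `batch_tasks` helper are assumptions made for illustration.

```python
import hashlib
import json
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BoefjeTask:
    """Hypothetical, simplified task; the real model has more fields."""

    organisation: str
    boefje_id: str
    input_ooi: str
    environment: dict[str, str] = field(default_factory=dict)


def env_hash(environment: dict[str, str]) -> str:
    """Hash the environment settings; sorted keys make the hash order-independent."""
    payload = json.dumps(environment, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def batch_tasks(
    tasks: list[BoefjeTask],
) -> dict[tuple[str, str, str], list[BoefjeTask]]:
    """Group tasks that run the same boefje on the same ooi with the same settings."""
    batches = defaultdict(list)
    for task in tasks:
        key = (task.boefje_id, task.input_ooi, env_hash(task.environment))
        batches[key].append(task)
    return dict(batches)
```

Tasks that land in the same batch can be executed once by the task runner. Tasks whose settings differ get a different hash and are scanned separately.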
env_hash match, if so task should be created and is a de-duplicated task
Issue link
Closes #4359
QA notes
Please add some information for QA on how to test the newly created code.
Code Checklist
.env changes files if required and changed the .env-dist accordingly.
Checklist for code reviewers:
Copy-paste the checklist from the docs/source/templates folder into your comment.
Checklist for QA:
Copy-paste the checklist from the docs/source/templates folder into your comment.