Closed
Changes from 2 commits
55 commits
8c01058
Initial
jpbruinsslot Apr 10, 2025
ca006ce
Update
jpbruinsslot Apr 10, 2025
2d9008d
Update
jpbruinsslot Apr 16, 2025
c2531ad
Make test work
jpbruinsslot Apr 17, 2025
ea2250c
Restructure method and update tests
jpbruinsslot Apr 22, 2025
164e4e2
Update tests
jpbruinsslot Apr 23, 2025
d2913dd
Merge branch 'main' into feature/mula/dedup
jpbruinsslot Apr 23, 2025
3bd4c0e
Precommit
jpbruinsslot Apr 23, 2025
8171700
Fix tests
jpbruinsslot May 7, 2025
923cd43
Add additional tests
jpbruinsslot May 8, 2025
8a9ed33
Add changes to support changes in katalogus
jpbruinsslot May 8, 2025
1a9abf4
Update
jpbruinsslot May 8, 2025
bd7f34b
Naive implementation
jpbruinsslot May 13, 2025
687a9b2
Rewriting tests
jpbruinsslot May 13, 2025
7f8d5a2
Update tests
jpbruinsslot May 13, 2025
ef52396
Fix tests
jpbruinsslot May 13, 2025
b5308f7
Fix tests
jpbruinsslot May 14, 2025
abb00eb
Fix precommit
jpbruinsslot May 14, 2025
3cf78ad
Bump django from 5.0.13 to 5.0.14 in /rocky (#4281)
dependabot[bot] Apr 23, 2025
fb83669
Translations update from Hosted Weblate (#4374)
weblate Apr 23, 2025
f40d7c0
fix permissions on report_overview.py (#4264)
underdarknl Apr 24, 2025
26898be
Add quick start to docs.openkat.nl (#4349)
stephanie0x00 Apr 25, 2025
4181f30
add observed_at to links on finding_list.html (#4367)
underdarknl Apr 28, 2025
215f995
Update packages (#4399)
ammar92 Apr 29, 2025
35d7d64
Remove unused scan profile increment queues (#4383)
dekkers Apr 29, 2025
a2fbbc5
Add organisation queryparam for schedules endpoint (#4396)
jpbruinsslot Apr 29, 2025
e17d619
Upgrade jaeger and prometheus, and enable spm (#4282)
jpbruinsslot Apr 29, 2025
8f9dad9
Add all organization report task page (#4394)
dekkers Apr 29, 2025
17d65d6
Make the list of boefjes unqiue when querying the KATalogus for info …
underdarknl Apr 29, 2025
6502e8a
Feat/cleaner set scan profile form (#4345)
underdarknl Apr 30, 2025
fda531d
Hotfix for NoReverseMatch in Crisis Room (#4405)
madelondohmen May 1, 2025
edf9c48
(temp) fix time parsing in report_overview.py (#4402)
underdarknl May 1, 2025
da55ba4
Fixed link in tree view (#4404)
ammar92 May 1, 2025
ef1f689
Use Python 3.13 as default Python version in container images and CI …
dekkers May 1, 2025
e031f60
Update plugin tiles when user has no permission to enable/disable (#4…
madelondohmen May 1, 2025
176195f
Remove leftover debug logging (#4418)
dekkers May 1, 2025
bb57fa2
Add grafana pyroscope continuous profiling (#4297)
jpbruinsslot May 2, 2025
ad74040
Updated packages (#4433)
ammar92 May 6, 2025
299db50
Update GitHub actions (#4434)
ammar92 May 6, 2025
28f4400
Update 1.18.rst, add links to issues / bugs (#4419)
underdarknl May 6, 2025
0cc91b7
Fix weblate (#4437)
dekkers May 7, 2025
e848223
Translations update from Hosted Weblate (#4438)
weblate May 7, 2025
76e0b76
Call gc.collect() after execution of task (#4432)
dekkers May 8, 2025
e8f9b76
Fix broken image link in README.rst (#4444)
Potherca May 10, 2025
ef4b382
Translations update from Hosted Weblate (#4439)
weblate May 10, 2025
6e65e9f
Fixes for disable/enable schedule modal (#4400)
madelondohmen May 13, 2025
078d2ec
Fix boefje detail page for client member (#4409)
madelondohmen May 13, 2025
3bb3101
Open asset report from within report (#4435)
madelondohmen May 13, 2025
ff7338a
Docs - add description of origin types (#4289)
stephanie0x00 May 13, 2025
7b6ad15
Updated packages (#4453)
ammar92 May 13, 2025
744d02e
Translations update from Hosted Weblate (#4458)
weblate May 14, 2025
1535c37
Updated Django and other packages (#4441)
ammar92 May 14, 2025
7b21ac8
Integrate new octopoes endpoint
jpbruinsslot May 15, 2025
beddd9a
Update tests
jpbruinsslot May 15, 2025
b7c8c66
Update to new deduplication key
jpbruinsslot May 15, 2025
1 change: 1 addition & 0 deletions mula/scheduler/models/task.py
@@ -110,6 +110,7 @@ class BoefjeTask(BaseModel):
    boefje: Boefje
    input_ooi: str | None = None
    organization: str
    organizations: list[str] | None = None

    dispatches: list[Normalizer] = Field(default_factory=list)

61 changes: 61 additions & 0 deletions mula/scheduler/schedulers/schedulers/boefje.py
@@ -402,6 +402,35 @@ def push_boefje_task(
            task_db.status = models.TaskStatus.FAILED
            self.ctx.datastores.task_store.update_task(task_db)

        # Is the ooi in other organisations? When the ooi is in other
        # organisations we need to create tasks for those organisations as well.
        if boefje_task.input_ooi is not None:
            # FIXME: is the ooi shared between organisations, or is it a copy?
Review comment (Collaborator): OOI information is not shared between organisations. However, since the copies are the same OOI (by primary key), most of their information is identical. Small differences might still exist in properties that are present on the OOI but are not used in the primary_key.
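
To illustrate the point about primary keys, here is a minimal sketch. It assumes a simplified, hypothetical `Hostname` dataclass with an invented `description` property; it is not the actual Octopoes model. Two copies of the same OOI, held by different organisations, agree on their primary key while differing in a property that is not part of it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hostname:
    """Hypothetical stand-in for an OOI; not the real Octopoes class."""

    network: str
    name: str
    # Assumed example of a property that is NOT part of the primary key.
    description: str = ""

    @property
    def primary_key(self) -> str:
        # Only the identifying fields contribute to the key.
        return f"Hostname|{self.network}|{self.name}"


# The same OOI as stored separately by two organisations.
copy_org_a = Hostname(network="internet", name="example.com", description="main site")
copy_org_b = Hostname(network="internet", name="example.com")

# Same identity by primary key ...
same_key = copy_org_a.primary_key == copy_org_b.primary_key
# ... but the full objects are not identical, because dataclass equality
# compares every field, including the non-key property.
identical = copy_org_a == copy_org_b
```

Under these assumptions, deduplicating on the primary key treats both copies as one scan target, while any logic that reads non-key properties still has to look at each organisation's own copy.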

            # If it is a copy we need to use the ooi id in the boefje task.
            orgs = self.is_ooi_in_other_organisations(boefje_task.input_ooi)
            if orgs:
                boefje_task.organizations = orgs

                # FIXME: what if boefje is disabled in the other org? Are we sure
                # that in that org the boefje is allowed to scan the ooi?
                # FIXME: what status do we give it? Do we give it completed? We
Review comment (Collaborator): I'd say a new state, Deduplicated, would fit the bill. We could then change it to 'complete' or 'error' once the actual running job completes or errors.
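
A rough sketch of how such a state could behave, assuming a hypothetical `DEDUPLICATED` member and a hypothetical helper `resolve_deduplicated`; neither exists in the scheduler as shown in this diff. Tasks parked as deduplicated take over the final status of the one job that actually runs.

```python
from enum import Enum


class TaskStatus(str, Enum):
    """Hypothetical subset of task states; DEDUPLICATED is the proposed addition."""

    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    DEDUPLICATED = "deduplicated"


def resolve_deduplicated(
    statuses: dict[str, TaskStatus], final: TaskStatus
) -> dict[str, TaskStatus]:
    """Return a copy in which every DEDUPLICATED task takes the final status."""
    if final not in (TaskStatus.COMPLETED, TaskStatus.FAILED):
        raise ValueError(f"not a terminal status: {final!r}")
    return {
        task_id: final if status is TaskStatus.DEDUPLICATED else status
        for task_id, status in statuses.items()
    }


# One task actually runs; the tasks for the other organisations are parked.
statuses = {
    "task-org-a": TaskStatus.RUNNING,
    "task-org-b": TaskStatus.DEDUPLICATED,
    "task-org-c": TaskStatus.DEDUPLICATED,
}

# When the running job finishes, the parked tasks follow its outcome.
resolved = resolve_deduplicated(statuses, TaskStatus.COMPLETED)
```

A separate state keeps these tasks out of the runner's queue without claiming they completed before the real job has finished, which is the concern raised in the FIXME.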

                # can't make it queued since it shouldn't be picked up by
                # the task runner.
                # FIXME: how do these tasks get associated with the other raw files?
Review comment (Collaborator): I think we discussed uploading the raw files for each job, so that we'd create normalizer jobs in each subsequent organization (which we might deduplicate later on). Uploaded raw files with the same content can transparently be deduplicated (by content hash) in Bytes on disk.
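
Deduplication by content hash can be sketched as follows. This assumes a hypothetical in-memory `ContentAddressedStore` and SHA-256 as the hash; it is not the Bytes API and says nothing about how Bytes stores files on disk. Every job uploads its raw file, but identical content is kept only once.

```python
import hashlib


class ContentAddressedStore:
    """Hypothetical in-memory stand-in for raw-file storage; not the Bytes API."""

    def __init__(self) -> None:
        # content hash -> content
        self._blobs: dict[str, bytes] = {}
        # (organisation, task id) -> content hash
        self._index: dict[tuple[str, str], str] = {}

    def upload(self, organisation: str, task_id: str, raw: bytes) -> str:
        digest = hashlib.sha256(raw).hexdigest()
        # Identical content is stored once, however many jobs upload it.
        self._blobs.setdefault(digest, raw)
        self._index[(organisation, task_id)] = digest
        return digest

    def get(self, organisation: str, task_id: str) -> bytes:
        return self._blobs[self._index[(organisation, task_id)]]

    def blob_count(self) -> int:
        return len(self._blobs)


store = ContentAddressedStore()
hash_a = store.upload("org-a", "task-1", b"raw boefje output")
hash_b = store.upload("org-b", "task-2", b"raw boefje output")
```

Both organisations keep their own reference to the raw file, so normalizer jobs can be created per organisation, while the store holds a single copy of the bytes.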

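                for org in orgs:
                    # Work on a copy: the task created after this block still
                    # needs the original boefje_task.id and organization.
                    org_boefje_task = boefje_task.model_copy(
                        update={"id": uuid.uuid4(), "organization": org}
                    )

                    task = models.Task(
                        id=org_boefje_task.id,
                        scheduler_id=self.scheduler_id,
                        organisation=org,
                        status=models.TaskStatus.COMPLETED,
                        hash=org_boefje_task.hash,
                        data=org_boefje_task.model_dump(),
                    )
                    self.ctx.datastores.task_store.create_task(task)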

        task = models.Task(
            id=boefje_task.id,
            scheduler_id=self.scheduler_id,
@@ -685,6 +714,38 @@ def get_oois_for_boefje(self, boefje: models.Plugin, organisation: str) -> list[

        return oois

    def is_ooi_in_other_organisations(self, ooi: str) -> list[str] | None:
        """Check if the OOI is in other organisations.

        Args:
            ooi: The OOI to check, as the string held in BoefjeTask.input_ooi.

        Returns:
            A list of organisations that have the same OOI, or None if the
            lookup failed.
        """
        try:
            return self.ctx.services.octopoes.get_organisations_by_ooi(ooi)
        except ExternalServiceError:
            # The caller passes a string, so log it directly; a str has no
            # primary_key attribute.
            self.logger.exception(
                "Error occurred while checking if OOI is in other organisations",
                ooi_primary_key=ooi,
                scheduler_id=self.scheduler_id,
            )
            return None

    # TODO: implement this method
    def get_organisations_for_same_task(self) -> list[str] | None:
        """Get the organisations that have the same task.

        Returns:
            A list of organisations that have the same task.
        """
        organisations = None
        return organisations

    def calculate_deadline(self, schedule: models.Schedule) -> models.Schedule:
        """Override Scheduler.calculate_deadline() to calculate the deadline
        for a task based on the boefje interval."""