Container with Elasticsearch #2550
Conversation
I converted the PR to draft because the doc is missing.
Restored to real PR; the doc is in another repo.
It is important to state in the docs which Elasticsearch versions are supported / have been tested.
intel_owl/tasks.py
Outdated
def _convert_report_to_elastic_document(_class: AbstractReport) -> List[Dict]:
    upper_threshold = now().replace(second=0, microsecond=0)
    lower_threshold = upper_threshold - datetime.timedelta(minutes=5)
Timedeltas should not be calculated inside async tasks; they should be calculated beforehand. That avoids this value changing in case of congestion.
I think it's correct to calculate it inside the task. The alternative is to put it in the beat schedule, but that doesn't work: the function is called once, when the schedule is defined, so the time range would be the same for all the scheduled tasks. Am I wrong?
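The point about beat-schedule arguments can be shown with a minimal, self-contained sketch (function names are illustrative, not from this PR): anything computed where the schedule is defined is evaluated once, at load time, while anything computed inside the task body is evaluated on every run.

```python
from datetime import datetime, timedelta

# Simulates putting the time range in the beat schedule: this tuple is
# built once, when the schedule module is loaded, and never changes.
frozen_args = (datetime.now() - timedelta(minutes=5), datetime.now())

def run_task_with_frozen_args():
    # Every "scheduled run" would receive the exact same window.
    return frozen_args

def run_task_computing_inside():
    # Computed at execution time, so each run gets a fresh 5-minute window.
    upper = datetime.now().replace(second=0, microsecond=0)
    return (upper - timedelta(minutes=5), upper)
```

Calling `run_task_with_frozen_args()` twice returns an identical window, which is why the range has to be computed inside the task (or derived from persisted state, as discussed below in the thread).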
No, you're right. If I remember correctly, on another occasion we handled this case by calculating the time from an element of the database. That way there is no chance of getting this value wrong, because the task would change it only at execution time; so, if there are any downtimes, there would be no loss of data. (I am afraid of the sync becoming misaligned and losing some data from time to time. That would make data analysis really bad.)
(I would still get the time from "now" first, but instead of subtracting 5 minutes I would use the last update time stored in the database as the lower_threshold.)
I'm not sure I remember what you are talking about, and I didn't find such collections: I found some capped collections used to repeat a task in case of failure, which is similar but not the same. However, I found a way to do it with Postgres, so I'll proceed with the merge.
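The database-backed approach proposed above can be sketched as follows. This is a hypothetical stand-in, not the PR's implementation: a dict simulates the Postgres row holding the last successful sync time, the upper bound still comes from "now", and the lower bound comes from the stored value so that a skipped run leaves no gap.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for a Postgres row (in Django this could be a
# one-row model); it persists the last successfully synced upper bound.
last_sync_store = {"timestamp": None}

def compute_window(now=None):
    """Upper bound from 'now', lower bound from the stored last sync.

    Falls back to a 5-minute window on the very first run.
    """
    upper = (now or datetime.now(timezone.utc)).replace(second=0, microsecond=0)
    lower = last_sync_store["timestamp"] or upper - timedelta(minutes=5)
    return lower, upper

def mark_synced(upper):
    # Persist the upper bound only after documents were sent successfully,
    # so a failed or skipped run is retried from the same lower bound.
    last_sync_store["timestamp"] = upper
```

If a scheduled run is missed, the next call to `compute_window` starts from the last persisted timestamp, so the window simply grows to cover the downtime instead of dropping data.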
Looks good. Worth considering, though.
(Please add to the PR name the issue/s that this PR would close if merged, by using a GitHub keyword. Example: <feature name>. Closes #999. If your PR is made of a single commit, please add that clause in the commit too. This is all required to automate the closure of related issues.)

Description

Please include a summary of the change and link to the related issue.

Type of change

Please delete options that are not relevant.

Checklist

- … develop
- … dumpplugin command and added it in the project as a data migration. ("How to share a plugin with the community")
- … test_files.zip and you added the default tests for that mimetype in test_classes.py.
- … FREE_TO_USE_ANALYZERS playbook by following this guide.
- … url that contains this information. This is required for Health Checks.
- … _monkeypatch() was used in its class to apply the necessary decorators.
- … MockUpResponse of the _monkeypatch() method. This serves us to provide a valid sample for testing.
- … (Black, Flake, Isort) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.
- … tests folder). All the tests (new and old ones) gave 0 errors.
- If DeepSource, Django Doctors or other third-party linters have triggered any alerts during the CI checks, I have solved those alerts.

Important Rules