-
Notifications
You must be signed in to change notification settings - Fork 32
Splits the diskio provider into reads and writes #1189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10 file(s) reviewed, 4 comment(s)
Edit PR Review Bot Settings | Greptile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the implementation. Suprisingly I think I would have made it through overloading the detail_name
. This would have been far easier but would garble the column.
So although more work this implementation feels cleaner to me.
What is however not so future proof is the mixin I feel. It is very undynamic referencing self._sub_metrics_name[0]
as implied knowledge and having many hardcoded names.
I would like to bring the functionality also to the NetworkIO reporter. Then it would have to be copied almost in full or a refactoring is needed.
Also: No tests?
All in all I would say good for a merge as we need the feature. But tests and refactoring should be prioritized for the next weeks to not have lingering legacy code here.
Also nice to see that you do not write descriptions for PRs anymore, as Greptile does it :) |
Good catch with the indexes. That was a remnant from when I was trying out different things. Fixed most things. Regarding the Mixin. This is functionality that is only for the DiskIO providers. It is the same for both of them but can not be used for the network provider as it is disk specific. How would you share a function between two classes. I was thinking of creating a new class that only has this method and then extend from this. But because of the init I found this to be more confusing. Otherwise I could copy the code. Or I could import the function and then use it as a shared lib. But that also didn't look very "clean" |
* main: (22 commits) Hotfix: Compare values were 3 orders of magnitude to low due to double division (#1191) Sampling rate rework (#1194) Phase padding can now be turned on and off (#1193) User 0 should have flow_process_duration and total duration only at 30 minutes and no data in json 'measurement' AI-Tests can now activated and deactivated in tests (Testing QoL): JS errors in frontend tests are now reported typo added no-else-raise Checking in more cases now if github detected even if path broken AI Optimisations Frontend added to FOSS version as appetizer (#1192) Allow repo URLs with unknown schemes but issues warning Revert "Test fix\nwe changed from failing on unknowns to allowing them due to allowing other vendors or private repos with reduced capbility tokens that might be cloneable but do not expose the API" general wording Runtime phase reconstruction only when runtime phase is present (fix): shutdown_on_job_no must only be non false (fix): Null check for resolution must also be in system_checks (fix): Providers without resolution must also be mappable to _sampling_interval_padding Test fix\nwe changed from failing on unknowns to allowing them due to allowing other vendors or private repos with reduced capbility tokens that might be cloneable but do not expose the API Phase end cutoff mitigation (#1161) Guard clause that runner.run_until may never be used without a context ...
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10 file(s) reviewed, 4 comment(s)
Edit PR Review Bot Settings | Greptile
* main: (73 commits) Forcing int64 in pandas to be safe Splits the diskio provider into reads and writes (#1189) Sorting by created_at now Hotfix: Compare values were 3 orders of magnitude to low due to double division (#1191) Sampling rate rework (#1194) Phase padding can now be turned on and off (#1193) User 0 should have flow_process_duration and total duration only at 30 minutes and no data in json 'measurement' AI-Tests can now activated and deactivated in tests (Testing QoL): JS errors in frontend tests are now reported typo added no-else-raise Checking in more cases now if github detected even if path broken AI Optimisations Frontend added to FOSS version as appetizer (#1192) Allow repo URLs with unknown schemes but issues warning Revert "Test fix\nwe changed from failing on unknowns to allowing them due to allowing other vendors or private repos with reduced capbility tokens that might be cloneable but do not expose the API" general wording Runtime phase reconstruction only when runtime phase is present (fix): shutdown_on_job_no must only be non false (fix): Null check for resolution must also be in system_checks (fix): Providers without resolution must also be mappable to _sampling_interval_padding ...
Greptile Summary
This PR splits disk I/O metrics into separate read and write measurements, requiring significant changes across the codebase to support multiple metrics per timestamp.
_parse_metrics()
methods in/metric_providers/disk/io/procfs/system/provider.py
that need to be resolvedDiskIoParseMixin
in/metric_providers/disk/io/disk_io_parse.py
is tightly coupled with hardcoded names and needs refactoring for better reusabilitymeasurement_values
needs careful consideration for data integrity