-
Notifications
You must be signed in to change notification settings - Fork 21
feat: add context to state diffing for troubleshooting, de-asyncify State Handler to hopefuly resolve some paralel execution cases (SOFIE-3889) #374
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release51
Are you sure you want to change the base?
Conversation
… the logged command.context This is a step towards troubleshooting resolving issues
…that states are executed synchronously in order.
…ngle entryway for executing state changes Co-authored-by: Johan Nyman <[email protected]>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## release51 #374 +/- ##
=============================================
+ Coverage 51.75% 51.84% +0.09%
=============================================
Files 127 127
Lines 10204 10238 +34
Branches 2537 2542 +5
=============================================
+ Hits 5281 5308 +27
- Misses 4497 4504 +7
Partials 426 426 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
I have my doubts about the increased level of complexity in the State Handler class and I have my doubts about whether this addresses the core problem. I would strongly advise to keep the StateHandler class a simple control loop for ease of understanding and debugging. I think the core issue you may be experiencing is with an integration that needs to have commands sent sequentially but is also slow at processing. The StateHandler from before this PR (i.e. release51) will not await all commands before considering a state change complete. This is fine in general so long as commands resolve fast and correctly. In the scenario I think you are experiencing the commands from the previous state change have not all been sent yet (I am thinking about a scenario where 3 - 10 commands are sent, each potentially taking 80 - 150ms to complete) when the next state change is beginning. The new commands will be sent into the Command Executor immediately and executed immediately. Now the obvious and theoretically correct thing to do is to always ensure commands have executed entirely before marking the state change as completed. This ensures only 1 state change at a time can be executed which is the only correct thing to do. However, the reason it doesn't work that way today is because that means that 1 command not resolving at all can pause the control loop indefinitely. When I wrote this I considered that to be a worse evil than executing state changes in parallel and I more or less still stand behind that opinion (although I don't like it). The complicated solution is to await all commands and write a timeout logic around the state changes, take into account that some state changes that occurred during the timeout may have to be skipped (this is hopefully OK because we can recalculate the diffs) and just live with the fact that we're not quite sure how the timed out command affected the device in question. The slightly simpler solution which should solve your issue with the slow devices executing sequentially is to ensure that the Command Executor will not schedule sequential command sequences in parallel. This would probably still want some timeout logic to ensure no 1 command can stop the execution but would then simply execute the rest of the commands in sequence in an attempt to catch up. Probably good enough? |
…StateChange` operates on the next-next state change, not the one at hand
About the Contributor
This pull request is posted on behalf of the NRK.
Type of Contribution
This is a:
Feature & Code improvement
Current Behavior
The State Handler has some functions that are async, which don't really have to be. We're seeing some strange effects on the output of TSR that seem to indicate some sort of parallel execution. This change aims to resolve that as well as adding the ability to trace context through the state diffing.
New Behavior
State Handler is more synchronous.
Testing Instructions
Other Information
Status