Skip to content

feat: add context to state diffing for troubleshooting, de-asyncify State Handler to hopefuly resolve some paralel execution cases (SOFIE-3889) #374

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 6 commits into
base: release51
Choose a base branch
from

Conversation

jstarpl
Copy link
Contributor

@jstarpl jstarpl commented Apr 16, 2025

About the Contributor

This pull request is posted on behalf of the NRK.

Type of Contribution

This is a:

Feature & Code improvement

Current Behavior

The State Handler has some functions that are async, which don't really have to be. We're seeing some strange effects on the output of TSR that seem to indicate some sort of parallel execution. This change aims to resolve that as well as adding the ability to trace context through the state diffing.

New Behavior

State Handler is more synchronous.

Testing Instructions

  • Verify that playback works predictably.

Other Information

Status

  • PR is ready to be reviewed.
  • The functionality has been tested by the author.
  • Relevant unit tests has been added / updated.
  • Relevant documentation (code comments, system documentation) has been added / updated.

nytamin and others added 5 commits January 29, 2025 08:18
@jstarpl jstarpl requested a review from a team as a code owner April 16, 2025 15:50
@codecov-commenter
Copy link

codecov-commenter commented Apr 16, 2025

Codecov Report

Attention: Patch coverage is 77.08333% with 33 lines in your changes missing coverage. Please review.

Project coverage is 51.84%. Comparing base (b96c410) to head (7e159fc).

Files with missing lines Patch % Lines
...imeline-state-resolver/src/service/stateHandler.ts 73.07% 21 Missing ⚠️
...eline-state-resolver/src/service/DeviceInstance.ts 33.33% 6 Missing ⚠️
...state-resolver/src/integrations/hyperdeck/index.ts 0.00% 4 Missing ⚠️
...state-resolver/src/integrations/sofieChef/index.ts 0.00% 2 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##           release51     #374      +/-   ##
=============================================
+ Coverage      51.75%   51.84%   +0.09%     
=============================================
  Files            127      127              
  Lines          10204    10238      +34     
  Branches        2537     2542       +5     
=============================================
+ Hits            5281     5308      +27     
- Misses          4497     4504       +7     
  Partials         426      426              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@jstarpl jstarpl changed the title feat: add context to state diffing for troubleshooting, de-asyncify State Handler to hopefuly resolve some paralel execution cases. feat: add context to state diffing for troubleshooting, de-asyncify State Handler to hopefuly resolve some paralel execution cases (SOFIE-3889) Apr 18, 2025
@mint-dewit
Copy link
Contributor

I have my doubts about the increased level of complexity in the State Handler class and I have my doubts about whether this addresses the core problem. I would strongly advise to keep the StateHandler class a simple control loop for ease of understanding and debugging.

I think the core issue you may be experiencing is with an integration that needs to have commands sent sequentially but is also slow at processing. The StateHandler from before this PR (i.e. release51) will not await all commands before considering a state change complete. This is fine in general so long as commands resolve fast and correctly. In the scenario I think you are experiencing the commands from the previous state change have not all been sent yet (I am thinking about a scenario where 3 - 10 commands are sent, each potentially taking 80 - 150ms to complete) when the next state change is beginning. The new commands will be sent into the Command Executor immediately and executed immediately.

Now the obvious and theoretically correct thing to do is to always ensure commands have executed entirely before marking the state change as completed. This ensures only 1 state change at a time can be executed which is the only correct thing to do. However, the reason it doesn't work that way today is because that means that 1 command not resolving at all can pause the control loop indefinitely. When I wrote this I considered that to be a worse evil than executing state changes in parallel and I more or less still stand behind that opinion (although I don't like it).

The complicated solution is to await all commands and write a timeout logic around the state changes, take into account that some state changes that occurred during the timeout may have to be skipped (this is hopefully OK because we can recalculate the diffs) and just live with the fact that we're not quite sure how the timed out command affected the device in question.

The slightly simpler solution which should solve your issue with the slow devices executing sequentially is to ensure that the Command Executor will not schedule sequential command sequences in parallel. This would probably still want some timeout logic to ensure no 1 command can stop the execution but would then simply execute the rest of the commands in sequence in an attempt to catch up. Probably good enough?

…StateChange` operates on the next-next state change, not the one at hand
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants