
Orchestrator Prod Readiness : Optimise stopping on Failed jobs. #590

@heemankv

Currently, when the orchestrator detects a Failed or VerificationTimeout job, it stops the creation of ANY new job.
ANY being the key word here.

Example :

  1. Block 10 : About to create state transition job.
  2. Block 15 : Processing SnosRun.

Let's say the SnosRun for block 15 failed. This will also halt the state transition job for block 10,
which is completely unrelated.

Ideally, we would only want new jobs after the failed one to be stopped, not jobs for blocks before the one that failed.

Note : If multiple jobs have failed, we need to ensure that the minimum (earliest) of them is taken as the reference.
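
A rough sketch of the proposed gating, in Python for illustration only and not the orchestrator's actual code. The names `Job`, `JobStatus`, `earliest_blocking_block` and `may_create_job` are hypothetical, and it assumes that "minimum job" means the failed job with the lowest block number, and that jobs for the failed block itself are also held back:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class JobStatus(Enum):
    CREATED = "Created"
    COMPLETED = "Completed"
    FAILED = "Failed"
    VERIFICATION_TIMEOUT = "VerificationTimeout"


# The two statuses named in this issue as the ones that stop job creation.
BLOCKING_STATUSES = {JobStatus.FAILED, JobStatus.VERIFICATION_TIMEOUT}


@dataclass(frozen=True)
class Job:
    # Hypothetical, simplified job record: one job per block and job type.
    block: int
    job_type: str
    status: JobStatus


def earliest_blocking_block(jobs: Iterable[Job]) -> Optional[int]:
    """Return the lowest block number among failed/timed-out jobs, or None."""
    blocks = [job.block for job in jobs if job.status in BLOCKING_STATUSES]
    return min(blocks) if blocks else None


def may_create_job(block: int, jobs: Iterable[Job]) -> bool:
    """Allow new jobs only for blocks strictly before the earliest failure.

    Assumption: a new job for the failed block itself is also held back.
    """
    cutoff = earliest_blocking_block(jobs)
    return cutoff is None or block < cutoff


existing_jobs = [
    Job(block=15, job_type="SnosRun", status=JobStatus.FAILED),
    Job(block=20, job_type="SnosRun", status=JobStatus.VERIFICATION_TIMEOUT),
]

# Block 10 is before the earliest failure (15), so its job can still be created.
block_10_allowed = may_create_job(10, existing_jobs)
# Block 16 comes after the earliest failure, so it is held back.
block_16_allowed = may_create_job(16, existing_jobs)
```

With the example from above, the state transition job for block 10 would still be created, while anything at or after block 15 waits until the failure is resolved.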
