Proposal: Plan Validation mechanism for TaskRunner API #1541
Replies: 4 comments 3 replies
As discussed offline, my suggestion would be to move |
@theakshaypant please also add more details regarding the need for the |
Thanks for putting together this proposal @theakshaypant. I get a feeling that plan validation is a disproportionately involved activity. I suggest limiting what we deliver first for the following reasons:
As of today, very few OpenFL features have (*) asterisks on them. It is OK in my opinion, to limit the scope of The current Overall, my recommendation is for us to be cautious of the scope of verification, and keep it as simple as possible (it is already a great start with one function that user need not worry about). We have to be cautious as a team, that |
Thanks for the detailed clarification @theakshaypant
Overall, we both do agree on having a thin verification check. As for implementation (yaml or not), I am OK with the consensus approach here |
Summary
In OpenFL TaskRunner API 1.8, secure aggregation was introduced with a limitation that it only worked with the WaitForAllPolicy straggler handling policy, and any misconfiguration wasn't detected until runtime, leading to unnecessary reinitialization. To address this, a validation mechanism was introduced during the fx plan initialize step, and this proposal extends that mechanism using a structured validation manifest. This file defines compatibility rules and constraints for FL plans using the triggers, requires, allowed, and forbidden keys, enabling early detection of invalid configurations. The system supports both single-file and modular multi-file formats for maintainability, ensuring that incompatible features are flagged upfront. While it supports only equality-based and AND-composite validations at initialization, it improves plan reliability and user experience by enforcing constraints early in the federated learning workflow.
Motivation
In OpenFL TaskRunner API 1.8, secure aggregation was introduced with the constraint that it could only be used alongside the WaitForAllPolicy straggler handling policy. Because the plan was not validated or verified, enabling secure aggregation with any other straggler handling policy raised an error only after the experiment had started; the model owner then had to re-initialize and redistribute the plan with the appropriate straggler handling policy. As a stop-gap measure, a plan verify method was introduced: when the fx plan initialize command is run, an error is raised if an incompatible straggler handling policy is used with secure aggregation.
Extending the same idea to other parameters in the FL plan, a need was identified to validate the plan before its distribution, so that incompatible features are not enabled together and any similar constraints are checked early in the FL process.
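The stop-gap check described above can be sketched roughly as follows. This is only an illustration: the plan layout, the key names (aggregator.settings.secure_aggregation, straggler_handling_policy.template), and the PlanValidationError class are assumptions made for the example and may not match the actual OpenFL plan schema or verify method.

```python
# Hypothetical plan layout and key names; the real OpenFL schema may differ.
REQUIRED_POLICY = "WaitForAllPolicy"


class PlanValidationError(ValueError):
    """Raised when a plan enables incompatible features."""


def verify_secure_aggregation(plan: dict) -> None:
    """Fail fast if secure aggregation is paired with another straggler policy."""
    settings = plan.get("aggregator", {}).get("settings", {})
    if not settings.get("secure_aggregation", False):
        return  # feature disabled: nothing to check

    template = plan.get("straggler_handling_policy", {}).get("template", "")
    policy = template.rsplit(".", 1)[-1]  # keep only the class name
    if policy != REQUIRED_POLICY:
        raise PlanValidationError(
            f"secure aggregation requires {REQUIRED_POLICY}, got {policy or 'no policy'}"
        )
```

Running such a check at plan initialization surfaces the error before the plan is distributed, rather than after the experiment starts.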
High-Level Design
Technical Details
In the current implementation, when fx plan initialize (Ref.) is called, the plan is validated while it is being parsed. The proposal retains the existing flow and targets changes solely in its implementation.
validation.yaml
Given that it is hard (and long) to define all the constraints within a Python script, we propose to add a validation.yaml file which is used to define and reference all the constraints for an FL plan.
This validation manifest serves two purposes:
Similar to the defaults directory, we introduce an openfl-workspace/workspace/plan/validations directory which would store all the constraints and compatibility checks for a plan.
File definition
The file is written in YAML and contains a map of all validations that need to be performed; each top-level key is an arbitrary string that only names the validation.
For each feature, we propose two ways to define the constraints.
A uses keyword points to a reference file that defines the constraints for that feature.
For that feature, anything defined inline in validation.yaml is ignored if the uses keyword is present.
This proposal permits the use of both definitions.
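A minimal sketch of how the two definition styles might be resolved once the manifest has been parsed into a dict. The function name resolve_manifest and the injected load_file callable are hypothetical, and the sketch assumes that inline keys next to uses are simply ignored.

```python
from typing import Callable, Dict


def resolve_manifest(
    manifest: Dict[str, dict],
    load_file: Callable[[str], dict],
) -> Dict[str, dict]:
    """Return validation name -> constraint definition, following `uses`."""
    resolved: Dict[str, dict] = {}
    for name, definition in manifest.items():
        if "uses" in definition:
            # Assumption: inline keys next to `uses` are ignored entirely.
            resolved[name] = load_file(definition["uses"])
        else:
            resolved[name] = definition
    return resolved


# Usage with an in-memory stand-in for the referenced files (hypothetical paths).
files = {"validations/feature_x.yaml": {"triggers": {"a.b": True}}}
manifest = {
    "feature_x": {"uses": "validations/feature_x.yaml", "allowed": {"ignored": 1}},
    "feature_y": {"triggers": {"c.d": "on"}},
}
resolved = resolve_manifest(manifest, files.__getitem__)
```

Passing the loader in as a callable keeps the sketch independent of any particular YAML library or directory layout.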
Constraint definition
Apart from the uses key (for a separate file), we introduce the following keys for constraint definition.
triggers: Map of triggers. If the triggers are met, the constraints are checked; otherwise no validation is required for that feature.
For example: super_key.nested_key1.nested_key2.key1 == value2 && key2 > value3 && key2 < value4 && key3 >= value5 && key3 <= value6
A . in a key indicates a nested key in the plan, so super_key.nested_key1.nested_key2.key1: refers to key1 nested under super_key, nested_key1, and nested_key2 in the FL plan.
A . in a value has no special meaning; the value is matched as-is against the value in plan.yaml.
trigger_value indicates the values for which the constraint check is done. It can be a string or a list.
range defines a range of values within which the constraint would be valid.
requires: List of other constraints that need to be validated when the current definition is triggered.
The required features, for example feature_x and feature_y, should be set to their trigger values, and their respective constraints must also be met.
If feature_a requires feature_x and feature_y to be enabled, the constraints defined for x and y can be referenced here instead of being duplicated under feature_a.
allowed: Defines the list of keys that need to be set to certain values in order for the plan to be valid.
For example: super_key.nested_key.key1 == value1 AND (key2 == value2 OR key2 == value3.value4)
forbidden: Defines the list of keys that cannot be set to certain values in order for the plan to be valid.
It is the inverse of allowed in terms of validation.
For example: super_key.nested_key.key3 != value1 AND (key4 != value6 AND key4 != value7.value8)
Range example: super_key.key1 > value1 && super_key.key1 < value2 && super_key.key1 >= value3 && super_key.key1 <= value4
Combining all the elements
For a single constraint, this is what the validation would look like with dependencies on the constraints feature_x and feature_y.
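As a rough illustration of the requires dependency, the sketch below expands a constraint into the ordered list of definitions that would need to be validated, dependencies first. The manifest is a hypothetical Python dict standing in for a parsed validation.yaml, and expand_requires is an assumed helper name, not part of the proposal.

```python
from typing import Dict, List, Set


def expand_requires(name: str, manifest: Dict[str, dict]) -> List[str]:
    """List `name` and everything it transitively requires, dependencies first."""
    order: List[str] = []
    visiting: Set[str] = set()

    def visit(current: str) -> None:
        if current in order:
            return  # already handled via another path
        if current in visiting:
            raise ValueError(f"circular requires involving {current!r}")
        if current not in manifest:
            raise KeyError(f"unknown constraint {current!r}")
        visiting.add(current)
        for dependency in manifest[current].get("requires", []):
            visit(dependency)
        visiting.discard(current)
        order.append(current)

    visit(name)
    return order


# Hypothetical manifest: feature_a depends on feature_x and feature_y.
manifest = {
    "feature_a": {"requires": ["feature_x", "feature_y"]},
    "feature_x": {},
    "feature_y": {"requires": ["feature_x"]},
}
validation_order = expand_requires("feature_a", manifest)
```

Detecting cycles up front avoids infinite recursion if two definitions list each other under requires.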
Incompatible features
Secure aggregation requires the straggler handling policy to be WaitForAllPolicy.
db_store_rounds should be set to greater than 1.
Scope
Limitations
Open Questions
validation.yaml file.
Next Steps
verify method to make it pure and add other incompatible features there.