[WIP] DSL in Flyte 2.0 #6
Open
This PR adds a new module to Flyte that enables compiling Python functions into executable DAGs (Directed Acyclic Graphs) using a simple, intuitive interface. The core of this system is the use of Promises to symbolically represent the flow of data between tasks, allowing for automatic dependency tracking and parallelization.
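To make the parallelization claim concrete, here is a minimal sketch of how tracked dependencies reveal which nodes can run side by side. It is not the code in this PR: the task names (`load_a`, `load_b`, `merge`) and the helper `parallel_levels` are hypothetical, and the standard-library `graphlib` module stands in for whatever scheduling the module actually uses.

```python
from graphlib import TopologicalSorter


def parallel_levels(dependencies):
    """Group nodes into levels; nodes in the same level do not depend on each other.

    `dependencies` maps each node name to the set of nodes it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    levels = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every node whose inputs are done
        levels.append(ready)
        sorter.done(*ready)
    return levels


# Hypothetical workflow: two independent tasks feeding a third.
dependencies = {
    "load_a": set(),
    "load_b": set(),
    "merge": {"load_a", "load_b"},
}
levels = parallel_levels(dependencies)
```

In this sketch `load_a` and `load_b` land in the same level because neither consumes the other's output, so a scheduler would be free to run them concurrently.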
Key Components
Promises:
A Promise object represents a future value produced by a task. As the user writes their workflow as a Python function, function parameters and intermediate results become Promise objects. These are tracked during compilation to construct the DAG and its data dependencies.
DAG Compilation (dsl.compile):
The compile function takes a standard Python function (written as a workflow) and replaces its parameters with Promise objects. As the function executes (with Promises instead of actual values), every task call is intercepted, nodes and edges are added, and the final DAG structure is constructed using networkx.
DAG, Nodes, and Edges:
Testing:
Comprehensive tests demonstrate correct compilation, error detection (such as calling non-task functions), and expected node/edge counts in various workflow structures.
How Compilation Works
TaskTemplate instances (i.e., tasks) are monkey-patched so that instead of executing, they register themselves in the DAG and return Promises for their outputs.
Example Usage
Suppose you have two tasks, add and double, and want to define a workflow that chains them. During compilation:
dsl.compile replaces x and y with Promises.
Each call to add or double registers a node in the DAG, with dependencies based on Promises.
compiled_dag contains the entire workflow structure, which can be visualized or executed.
Error Detection
If the workflow calls a function that is not a decorated Flyte task, compilation fails with an informative error—ensuring full compatibility and reusability.
Visualization
The DAG supports conversion to DOT/Graphviz for visualization, so users can inspect their workflow structure.
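As a rough picture of what a DOT export involves, the sketch below renders a node list and edge list as Graphviz DOT text by hand. The function `to_dot` and the node names are hypothetical; the module's actual conversion is not shown in this description and may go through networkx instead.

```python
def to_dot(nodes, edges, name="workflow"):
    """Render nodes and (upstream, downstream) edges as a DOT digraph string."""
    lines = [f"digraph {name} {{"]
    for node in nodes:
        lines.append(f'    "{node}";')
    for upstream, downstream in edges:
        lines.append(f'    "{upstream}" -> "{downstream}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical two-node chain: the output of add feeds double.
dot_text = to_dot(["add-0", "double-1"], [("add-0", "double-1")])
```

The resulting text can be saved to a file and passed to the Graphviz `dot` command to produce an image.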