[WIP] DSL in Flyte 2.0 #6
Open
This PR adds a new module to Flyte that enables compiling Python functions into executable DAGs (Directed Acyclic Graphs) using a simple, intuitive interface. The core of this system is the use of Promises to symbolically represent the flow of data between tasks, allowing for automatic dependency tracking and parallelization.
Key Components
Promises: A `Promise` object represents a future value produced by a task. As the user writes their workflow as a Python function, function parameters and intermediate results become `Promise` objects. These are tracked during compilation to construct the DAG and its data dependencies.
DAG Compilation (`dsl.compile`): The `compile` function takes a standard Python function (written as a workflow) and replaces its parameters with `Promise` objects. As the function executes (with Promises instead of actual values), every task call is intercepted, nodes and edges are added, and the final DAG structure is constructed using `networkx`.
DAG, Nodes, and Edges:
Testing:
Comprehensive tests demonstrate correct compilation, error detection (such as calling non-task functions), and expected node/edge counts in various workflow structures.
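The Promise-based tracing described under Key Components can be illustrated with a small, self-contained sketch. This is not the code in this PR: `ToyDAG`, `toy_task`, `toy_compile`, `my_workflow`, and the node-id format are hypothetical names chosen for illustration. The sketch uses a decorator and plain lists, whereas the PR intercepts `TaskTemplate` calls and builds the graph with `networkx`.

```python
import functools
import inspect
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Promise:
    """Symbolic placeholder for a value that a node produces at run time."""
    source: str  # id of the producing node, or "input:<param>" for workflow inputs


@dataclass
class ToyDAG:
    """Hypothetical stand-in for the PR's DAG type."""
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (upstream, downstream) pairs


_active_dag = None  # the DAG being traced, or None outside compilation


def toy_task(fn):
    """Hypothetical decorator: record a node while tracing, run normally otherwise."""
    @functools.wraps(fn)
    def wrapper(*args):
        if _active_dag is None:
            return fn(*args)  # ordinary execution
        node_id = f"{fn.__name__}-{len(_active_dag.nodes)}"
        _active_dag.nodes.append(node_id)
        for arg in args:
            if isinstance(arg, Promise):
                # A Promise argument means this node depends on its producer.
                _active_dag.edges.append((arg.source, node_id))
        return Promise(source=node_id)
    return wrapper


def toy_compile(workflow):
    """Call the workflow with Promises in place of its parameters."""
    global _active_dag
    dag = ToyDAG()
    _active_dag = dag
    try:
        params = inspect.signature(workflow).parameters
        workflow(*(Promise(f"input:{name}") for name in params))
    finally:
        _active_dag = None  # always leave tracing mode
    return dag


@toy_task
def add(a, b):
    return a + b


@toy_task
def double(a):
    return a * 2


def my_workflow(x, y):
    return double(add(x, y))


compiled_dag = toy_compile(my_workflow)
```

In this sketch `compiled_dag.nodes` holds one entry per task call and `compiled_dag.edges` holds one entry per Promise passed into a task, so the edge from `add` to `double` is recovered without the workflow author declaring it.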
How Compilation Works
`TaskTemplate` instances (i.e., tasks) are monkey-patched so that instead of executing, they register themselves in the DAG and return Promises for their outputs.
Example Usage
Suppose you have two tasks, `add` and `double`, and want to define a workflow with parameters `x` and `y` that chains them. Compiling that workflow proceeds as follows:
`dsl.compile` replaces `x` and `y` with Promises.
Each call to `add` or `double` registers a node in the DAG, with dependencies based on Promises.
The resulting `compiled_dag` contains the entire workflow structure, which can be visualized or executed.
Error Detection
If the workflow calls a function that is not a decorated Flyte task, compilation fails with an informative error—ensuring full compatibility and reusability.
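One way such a failure could surface is sketched below. It rests on an assumption: the `Promise` rejects any attempt to compute on it, so a plain function that touches its argument fails immediately. The PR may detect non-task calls differently; `NotATaskError`, `plain_helper`, and `bad_workflow` are hypothetical names.

```python
class NotATaskError(TypeError):
    """Hypothetical error raised when a Promise is used like a concrete value."""


class Promise:
    def __init__(self, source):
        self.source = source

    def _reject(self, *_args):
        raise NotATaskError(
            f"Promise from '{self.source}' was used as a concrete value; "
            "only decorated tasks may consume Promises during compilation."
        )

    # Arithmetic or truth-testing on a Promise means untraced code touched it.
    __add__ = __radd__ = __mul__ = __rmul__ = __bool__ = _reject


def plain_helper(value):
    # Not a task: it tries to compute on its argument right away.
    return value + 1


def bad_workflow(x):
    return plain_helper(x)


try:
    bad_workflow(Promise("input:x"))
    error_message = None
except NotATaskError as exc:
    error_message = str(exc)

print(error_message)
```

This approach only catches functions that operate on the Promise. A helper that merely passes its argument through would go unnoticed, so a real implementation would likely need an additional check.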
Visualization
The DAG supports conversion to DOT/Graphviz for visualization, so users can inspect their workflow structure.
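As a rough illustration of what a DOT export involves, the sketch below turns a node list and an edge list into DOT source using plain Python. The function `to_dot` and the node names are hypothetical; the PR's own conversion works on its DAG type and may produce different output.

```python
def to_dot(nodes, edges, name="workflow"):
    """Render nodes and (upstream, downstream) edges as a DOT digraph string."""
    lines = [f"digraph {name} {{"]
    for node in nodes:
        lines.append(f'    "{node}";')
    for upstream, downstream in edges:
        lines.append(f'    "{upstream}" -> "{downstream}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical graph for a workflow that feeds add(x, y) into double.
nodes = ["x", "y", "add", "double"]
edges = [("x", "add"), ("y", "add"), ("add", "double")]

dot_source = to_dot(nodes, edges)
print(dot_source)
```

Saving the string to a file such as `workflow.dot` and running `dot -Tpng workflow.dot -o workflow.png` renders it with Graphviz.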