Opinion on module-based publishing

In the last podcast episode, you debated whether to allow output publishing to be defined at the module/process level.

My clear opinion is that this is a bad idea:

* It mixes concerns. Where a process's output is stored should not be the concern of the process (function) itself. You are giving it too much responsibility.
* Additionally, restricting publishing to the workflow level keeps processes as modular as possible. In my mind, the same applies to sub-workflows: I would honor only the publishing instructions of the highest-level workflow and ignore any publishing set in sub-workflows. That way, modules and sub-workflows can easily be reused in different pipelines without requiring overrides.
* In my view, processes should be as close as possible to pure functions, so that the output is reproducible given a deterministic environment (container hash), the same input (hash of the data), and the same operations (hash of the code/git commit). With module-level publishing, changing the publishing behavior requires a code change, even though the operation itself does not change.
* Flexibility in storage backends: if a module assumes output path capabilities that my storage solution does not support, I have to redefine all of those publishing options.
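
To make the separation concrete, here is a minimal sketch in plain Python rather than in workflow syntax. Everything in it is hypothetical and only for illustration: `cache_key`, `count_lines`, `run_workflow`, and the destination paths are made-up names, not part of any existing API. The idea is that a process returns named outputs and knows nothing about storage, while only the top-level workflow maps those names to destinations.

```python
import hashlib


def cache_key(container_hash: str, input_hash: str, code_hash: str) -> str:
    """Identity of a process run: environment, input data, and code only.

    The publishing destination is deliberately not part of the key.
    """
    joined = "\n".join([container_hash, input_hash, code_hash])
    return hashlib.sha256(joined.encode()).hexdigest()


def count_lines(text: str) -> dict:
    """A hypothetical 'process' written as a pure function.

    It returns named outputs and has no idea where they will be stored.
    """
    return {"line_count": str(len(text.splitlines()))}


def run_workflow(text: str, publish_to: dict) -> dict:
    """Hypothetical top-level workflow: the only place that maps
    output names to storage destinations."""
    outputs = count_lines(text)
    return {publish_to[name]: value
            for name, value in outputs.items()
            if name in publish_to}


# Same process, same inputs, two different storage layouts (made-up paths).
local = run_workflow("a\nb\n", {"line_count": "results/count.txt"})
remote = run_workflow("a\nb\n", {"line_count": "s3://bucket/count.txt"})

# The run identity is unchanged by where the outputs end up.
key = cache_key("sha256:abc", "data123", "commit456")
```

Switching from the local layout to the remote one touches only the mapping passed in at the top level. The process code, and therefore its hash, stays the same.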

Opinion on module-based publishing #5128
