Description
Problem Statement
Currently, when a pipeline has a dictionary parameter, the entire dictionary must be passed to every component, even if the component only needs a single value or subset of values. This leads to:
- Unnecessary data exposure: Components receive more data than they need
- Reduced code clarity: It's unclear which parts of the config each component uses
- Security concerns: Sensitive data may be unnecessarily exposed to components
- Tight coupling: Components become dependent on the entire config structure
Proposed Solution
Add support for extracting individual values from dictionary pipeline parameters using Pythonic dict-style syntax:
@dsl.pipeline
def my_pipeline(config: dict):
    # Single-level access
    component1(db_host=config['db_host'])
    # Nested access
    component2(host=config['database']['host'])
    # Sub-dict passing
    component3(db_config=config['database'])
Use Cases
- Configuration Management: Pass a large config dict to the pipeline, but only extract specific values for each component
- Security: Ensure components only receive the data they need
- Code Organization: Improve clarity about what data each component uses
- Nested Configs: Handle complex nested configuration structures
Expected Behavior
- Support single-level dict access: config['key']
- Support nested dict access: config['level1']['level2']
- Support passing sub-dictionaries: config['subdict']
- Generate appropriate CEL expressions at compile time
- Runtime evaluation by existing backend CEL evaluator (no backend changes needed)
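To illustrate the compile-time side of the behavior listed above, here is a minimal sketch in plain Python. It is not the KFP SDK's implementation: the class name DictParamRef and its to_cel() method are hypothetical, and chaining several selectors after parseJson(string_value) is an assumption extrapolated from the single-key form the backend is said to support.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DictParamRef:
    """Hypothetical stand-in for a dict pipeline parameter seen at compile time."""

    name: str
    path: Tuple[str, ...] = ()

    def __getitem__(self, key: str) -> "DictParamRef":
        # Record the key instead of looking it up: the real value
        # is only known at runtime.
        if not isinstance(key, str):
            raise TypeError(f"dict keys must be strings, got {type(key).__name__}")
        return DictParamRef(self.name, self.path + (key,))

    def to_cel(self) -> str:
        # Assumed expression shape: one ["key"] selector per recorded access.
        selectors = "".join(f"[{json.dumps(key)}]" for key in self.path)
        return f"parseJson(string_value){selectors}"


config = DictParamRef("config")
single = config["db_host"].to_cel()
nested = config["database"]["host"].to_cel()
subdict = config["database"].to_cel()
```

Each subscript returns a new reference rather than mutating the original, so config itself can be reused for several components, as in the proposed pipeline syntax.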
Alternatives Considered
- Manual extraction in pipeline: Create intermediate variables - verbose and error-prone
- Component-level filtering: Components filter what they need - still exposes all data
- Separate parameters: Split config into many parameters - breaks encapsulation
Additional Context
This feature would leverage existing backend CEL (Common Expression Language) expression evaluation capabilities. The backend already supports parseJson(string_value)["key"] expressions, so this would be an SDK-side enhancement that generates the appropriate CEL expressions at compile time.
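As a rough picture of what an expression such as parseJson(string_value)["key"] would yield at runtime, the following sketch emulates it in plain Python. It is not CEL and not backend code: evaluate_selectors is a hypothetical helper, and it assumes the dict parameter arrives as a JSON-serialized string.

```python
import json
from typing import Any, Sequence


def evaluate_selectors(string_value: str, path: Sequence[str]) -> Any:
    """Parse a JSON string and follow a sequence of keys into it."""
    value = json.loads(string_value)
    for key in path:
        if not isinstance(value, dict):
            raise TypeError(f"cannot index non-dict value with key {key!r}")
        value = value[key]
    return value


# Assumed example config, serialized the way a dict parameter might be.
serialized = json.dumps(
    {"db_host": "db.local", "database": {"host": "db.local", "port": 5432}}
)

host = evaluate_selectors(serialized, ["database", "host"])
database = evaluate_selectors(serialized, ["database"])
```

The result type depends on what sits at the selected path (a string for the nested host, a dict for the sub-dictionary), which matches the idea that type resolution happens at runtime.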
Implementation Notes
- Changes would be SDK-side only (Python)
- No backend changes required
- Fully backward compatible
- Compile-time transformation to CEL expressions
- Runtime type resolution via CEL evaluator
uzi0espil