fix: replace dict.update() with fusion-compatible alternatives #954

devin-ai-integration[bot] wants to merge 11 commits into master from
Conversation
dbt-fusion preview.143+ removed the .update() method from maps in minijinja, causing 'map has no method named update' errors. This commit:

- Adds a new dict_merge utility macro using setdefault-based merging
- Replaces all 79 .update() calls across 39 files with:
  - elementary.dict_merge(base, override) for variable rebinding
  - pop() + setdefault() for in-place mutations (cache, context, callbacks)
  - setdefault() for adding new keys to shared dicts
- Preserves exact override-takes-precedence semantics

Co-Authored-By: Itamar Hartstein <[email protected]>
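The commit message describes a merge macro with override-takes-precedence semantics. Those semantics can be sketched in plain Python; this is an illustrative model of the behavior, not the actual `elementary.dict_merge` Jinja macro:

```python
def dict_merge(base, override):
    """Return a new dict combining base and override.

    Keys present in both take the value from override, mirroring the
    override-takes-precedence semantics the PR says it preserves.
    Neither input dict is mutated.
    """
    return {**base, **override}

base = {"a": 1, "b": 2}
override = {"b": 99, "c": 3}
merged = dict_merge(base, override)
# merged combines both dicts, with override winning on key "b";
# base and override are left unchanged.
```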
📝 Walkthrough

Replaces many in-place dictionary mutations (.update()) with immutable merges (elementary.dict_merge) or in-place setdefault-based alternatives.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
… refs

- generate_elementary_profile_args.sql: parameter is a loop variable referencing a dict in a list; set only rebinds locally, doesn't mutate the list element
- create_elementary_test_table.sql: test_entry is a cache reference; need in-place mutation so cache reflects the change

Co-Authored-By: Itamar Hartstein <[email protected]>
In Jinja2, {% set %} inside a for loop creates a local binding that
doesn't persist after the loop. The original .update() calls mutated
dicts in-place which IS visible outside the loop.
Fixed 10 files:
- setdefault() for new keys (column_monitoring_query, table_monitoring_query,
test_exposure_schema_validity, agate_to_dicts, run_query, to_primitive,
undefined_dict_keys_to_none)
- pop()+setdefault() for overwrites (get_user_creation_query,
create_elementary_user, enforce_project_configurations)
Co-Authored-By: Itamar Hartstein <[email protected]>
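The scoping issue this commit message describes has a direct analogue in plain Python name binding: rebinding a name points it at a new object and leaves other references untouched, while pop() + setdefault() mutates the shared object in place. A minimal sketch of that distinction (illustrative Python, not the Jinja code itself):

```python
# Rebinding: creates a new dict; the originally shared object is unchanged.
shared = {"success": True}
local = shared
local = {**local, "success": False}  # analogous to {% set x = dict_merge(...) %}
print(shared)  # {'success': True} -- the change was lost

# In-place mutation: the shared object itself is updated.
local = shared
local.pop("success", None)         # drop the old value if present
local.setdefault("success", False) # set the new value
print(shared)  # {'success': False} -- the change persists
```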
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
macros/utils/run_queries/agate_to_dicts.sql (1)
10-18: ⚠️ Potential issue | 🔴 Critical

Inner loop rebinding does not persist to outer loop scope.

The `{% set serializable_row = ... %}` on line 15 is inside the inner `{% for %}` loop. Due to Jinja2's scoping rules, this rebinding does not update `serializable_row` in the outer loop's scope. After the inner loop completes, `serializable_row` will still be the empty `{}` from line 11, causing all rows to be appended as empty dictionaries.

The original `serializable_row.update()` worked because it mutated in-place. This requires true in-place mutation.

🐛 Proposed fix using in-place mutation

```diff
 {% for agate_row in rows.data %}
   {% set serializable_row = {} %}
   {% for col_name, col_value in agate_row.items() %}
     {% set serializable_col_value = elementary.agate_val_serialize(col_value) %}
     {% set serializable_col_name = col_name | lower %}
-    {% set serializable_row = elementary.dict_merge(serializable_row, {serializable_col_name: serializable_col_value}) %}
+    {% do serializable_row.setdefault(serializable_col_name, serializable_col_value) %}
   {% endfor %}
   {% do serializable_rows.append(serializable_row) %}
 {% endfor %}
```

Since `setdefault` only sets if the key doesn't exist, and each column name should be unique per row, this should work correctly. If column names could repeat and the last value should win, you'd need `pop` + `setdefault`.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@macros/utils/run_queries/agate_to_dicts.sql` around lines 10 - 18: the inner loop is rebinding serializable_row instead of mutating it, so the outer scope's serializable_row remains empty; replace the re-assignment line that uses elementary.dict_merge inside the inner `{% for col_name, col_value in agate_row.items() %}` with an in-place mutation (e.g., call serializable_row.update(...) via a `{% do %}` statement) using the serializable_col_name and serializable_col_value produced by elementary.agate_val_serialize, so each column is added to the same serializable_row object before appending it to serializable_rows.

macros/commands/enforce_project_configurations.sql (1)
29-77: ⚠️ Potential issue | 🔴 Critical

Loop variable rebinding does not persist outside the loop in Jinja2.

In Jinja2, `{% set %}` inside a `{% for %}` loop creates a new scope. The rebinding on lines 41, 46, 51, 58, and 70 will not update the `validation_result` defined on line 29. After the loop completes, `validation_result` will still be `{'success': true}`, causing validations to always pass.

The original `dict.update()` worked because it mutated the dictionary in-place rather than rebinding the variable. For this case, you need true in-place mutation using `setdefault()` or the `do` statement pattern.

🐛 Proposed fix using in-place mutation with pop+setdefault

```diff
 {%- if enforce_owners and flattened_node.owner | length == 0 -%}
   {% do elementary.edr_log(node.resource_type ~ " " ~ node.name ~ " does not have an owner") %}
-  {% set validation_result = elementary.dict_merge(validation_result, {'success': false}) %}
+  {% do validation_result.pop('success', none) %}
+  {% do validation_result.setdefault('success', false) %}
 {%- endif -%}
 {%- if enforce_tags and flattened_node.tags | length == 0 -%}
   {% do elementary.edr_log(node.resource_type ~ " " ~ node.name ~ " does not have tags") %}
-  {% set validation_result = elementary.dict_merge(validation_result, {'success': false}) %}
+  {% do validation_result.pop('success', none) %}
+  {% do validation_result.setdefault('success', false) %}
 {%- endif -%}
 {%- if enforce_description and not flattened_node.description -%}
   {% do elementary.edr_log(node.resource_type ~ " " ~ node.name ~ " does not have a description") %}
-  {% set validation_result = elementary.dict_merge(validation_result, {'success': false}) %}
+  {% do validation_result.pop('success', none) %}
+  {% do validation_result.setdefault('success', false) %}
 {%- endif -%}
 {%- if required_meta_keys | length > 0 -%}
   {%- for meta_param in required_meta_keys -%}
     {%- if meta_param not in flattened_node.meta -%}
       {% do elementary.edr_log(node.resource_type ~ " " ~ node.name ~ " does not have required meta param " ~ meta_param) %}
-      {% set validation_result = elementary.dict_merge(validation_result, {'success': false}) %}
+      {% do validation_result.pop('success', none) %}
+      {% do validation_result.setdefault('success', false) %}
     {%- endif -%}
   {%- endfor -%}
 {%- endif -%}
 {%- if required_config_keys | length > 0 -%}
   {%- for config_param in required_config_keys -%}
     {# flattened node doesn't have a config yet, using node instead #}
     {% set config_dict = elementary.safe_get_with_default(node, 'config', {}) %}
     {%- if config_dict is not none -%}
       {%- if config_param not in config_dict -%}
         {% do elementary.edr_log(node.resource_type ~ " " ~ node.name ~ " does not have required config param " ~ config_param) %}
-        {% set validation_result = elementary.dict_merge(validation_result, {'success': false}) %}
+        {% do validation_result.pop('success', none) %}
+        {% do validation_result.setdefault('success', false) %}
       {%- endif -%}
     {%- endif -%}
   {%- endfor -%}
 {%- endif -%}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@macros/commands/enforce_project_configurations.sql` around lines 29 - 77: the loop uses `{% set validation_result = elementary.dict_merge(...) %}`, and rebinding inside the for-loop doesn't persist in Jinja2; replace each rebinding (occurrences where validation_result is set to elementary.dict_merge on the lines updating success) with an in-place mutation call such as `{% do validation_result.update({'success': false}) %}` (or another in-place mutator like setdefault/pop+setdefault) so the shared validation_result dict defined at the top is actually mutated; update the occurrences that reference validation_result in the blocks checking enforce_owners, enforce_tags, enforce_description, required_meta_keys, and required_config_keys.
🧹 Nitpick comments (2)
macros/utils/dict_utils/undefined_dict_keys_to_none.sql (1)
4-4: Avoid full-dict merge inside the per-key loop.

Line 4 rebuilds the entire dict on each undefined key. Prefer in-place key replacement here for lower overhead.

♻️ Proposed refactor

```diff
- {% set dict = elementary.dict_merge(dict, {key: none}) %}
+ {% do dict.pop(key, none) %}
+ {% do dict.setdefault(key, none) %}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@macros/utils/dict_utils/undefined_dict_keys_to_none.sql` at line 4: the current per-key code uses elementary.dict_merge and recreates the whole dict each iteration (`{% set dict = elementary.dict_merge(dict, {key: none}) %}`); change it to set the single key in-place instead (e.g., replace the dict_merge call with a single-key setter such as elementary.dict_set(dict, key, none) or direct assignment dict[key] = none if supported) so you avoid rebuilding the entire dict on every undefined key and reduce overhead.

macros/utils/data_types/to_primitive.sql (1)

8-8: Consider avoiding per-item re-merge in recursive mapping conversion.

Line 8 repeatedly recreates dictionaries during recursion. For larger nested maps, this can add avoidable compile-time overhead.

Suggested low-overhead equivalent

```diff
- {% set new_dict = elementary.dict_merge(new_dict, {k: elementary.to_primitive(v)}) %}
+ {% do new_dict.pop(k, none) %}
+ {% do new_dict.setdefault(k, elementary.to_primitive(v)) %}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@macros/utils/data_types/to_primitive.sql` at line 8: the code is recreating new_dict on each iteration via elementary.dict_merge(...), which is expensive for deep/large maps; instead accumulate entries and merge once or assign directly: build a small local map (e.g., temp_map) of {k: elementary.to_primitive(v)} for each loop iteration and perform a single elementary.dict_merge(new_dict, temp_map) after the loop (or use a single update/assignment operation if Jinja supports it) to avoid repeatedly recreating new_dict; refer to new_dict, elementary.dict_merge, elementary.to_primitive, k and v when updating the loop implementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@macros/edr/tests/test_utils/create_elementary_test_table.sql`:
- Line 18: The code uses test_entry.setdefault(table_type, temp_table_relation)
which prevents replacing an existing cached relation and keeps stale entries;
change this to explicitly assign or update the dict so the cached value is
overwritten (e.g., replace the setdefault call with an explicit assignment to
test_entry[table_type] = temp_table_relation or use
test_entry.update({table_type: temp_table_relation}) so test_entry, table_type,
and temp_table_relation are updated correctly).
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (40)
- macros/commands/create_elementary_user.sql
- macros/commands/enforce_project_configurations.sql
- macros/commands/generate_json_schema_test.sql
- macros/commands/generate_schema_baseline_test.sql
- macros/edr/data_monitoring/anomaly_detection/store_anomaly_test_results.sql
- macros/edr/data_monitoring/monitors_query/column_monitoring_query.sql
- macros/edr/data_monitoring/monitors_query/table_monitoring_query.sql
- macros/edr/data_monitoring/schema_changes/store_schema_test_results.sql
- macros/edr/dbt_artifacts/upload_dbt_artifacts.sql
- macros/edr/dbt_artifacts/upload_dbt_columns.sql
- macros/edr/dbt_artifacts/upload_dbt_exposures.sql
- macros/edr/dbt_artifacts/upload_dbt_groups.sql
- macros/edr/dbt_artifacts/upload_dbt_invocation.sql
- macros/edr/dbt_artifacts/upload_dbt_metrics.sql
- macros/edr/dbt_artifacts/upload_dbt_models.sql
- macros/edr/dbt_artifacts/upload_dbt_seeds.sql
- macros/edr/dbt_artifacts/upload_dbt_sources.sql
- macros/edr/dbt_artifacts/upload_dbt_tests.sql
- macros/edr/dbt_artifacts/upload_run_results.sql
- macros/edr/dbt_artifacts/upload_source_freshness.sql
- macros/edr/materializations/test/test.sql
- macros/edr/metadata_collection/get_metric_properties.sql
- macros/edr/system/system_utils/get_config_var.sql
- macros/edr/system/system_utils/logs.sql
- macros/edr/tests/on_run_end/handle_tests_results.sql
- macros/edr/tests/python.sql
- macros/edr/tests/test_configuration/get_detection_delay.sql
- macros/edr/tests/test_configuration/get_time_bucket.sql
- macros/edr/tests/test_exposure_schema_validity.sql
- macros/edr/tests/test_utils/create_elementary_test_table.sql
- macros/utils/cross_db_utils/generate_elementary_profile_args.sql
- macros/utils/cross_db_utils/get_user_creation_query.sql
- macros/utils/data_types/to_primitive.sql
- macros/utils/dict_utils/dict_merge.sql
- macros/utils/dict_utils/undefined_dict_keys_to_none.sql
- macros/utils/graph/cache.sql
- macros/utils/graph/get_elementary_config_from_node.sql
- macros/utils/graph/get_node_meta.sql
- macros/utils/run_queries/agate_to_dicts.sql
- macros/utils/run_queries/run_query.sql
…_table Co-Authored-By: Itamar Hartstein <[email protected]>
Actionable comments posted: 3
🧹 Nitpick comments (1)
macros/commands/enforce_project_configurations.sql (1)
39-75: Consider extracting the repeated pop() + setdefault() pattern.

The same two-line pattern for setting validation_result['success'] to false appears 5 times. A small helper macro could reduce repetition and make intent clearer:

```jinja
{%- macro set_validation_failed(validation_result) -%}
  {% do validation_result.pop('success', none) %}
  {% do validation_result.setdefault('success', false) %}
{%- endmacro -%}
```

Then each validation failure block becomes:

```jinja
{% do elementary.set_validation_failed(validation_result) %}
```

This is optional since the current approach works correctly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@macros/commands/enforce_project_configurations.sql` around lines 39 - 75, The template repeats the two-line pattern that clears and sets validation_result['success'] to false; extract that into a small macro (e.g., set_validation_failed) and replace each repeated pair with a single call to that macro to reduce duplication and clarify intent; locate the repeated occurrences around the validation blocks referencing validation_result (inside the enforce_owners, enforce_tags, enforce_description, required_meta_keys loop, and required_config_keys loop) and add the macro to the same file or an included macros file (name it e.g., set_validation_failed and call via elementary.set_validation_failed or just set_validation_failed depending on scope).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@macros/commands/create_elementary_user.sql`:
- Around line 24-27: The template uses unsupported dict methods `.pop()` and
`.setdefault()` on `parameter_values` inside the loop over `profile_parameters`
which will fail in MiniJinja; replace the in-place mutation with a functional
rebuild of `parameter_values` by creating a new dict that merges the original
`parameter_values` with the selected parameters from `profile_parameters` (use
`parameter["name"]` as the key and `parameter["value"]` as the override), or
alternatively implement a custom filter in the host (Rust) code to perform the
mutation; update references around `profile_parameters` and `parameter_values`
so the template uses the new merged dict instead of relying on
`.pop()`/`.setdefault()`.
In `@macros/edr/tests/test_utils/create_elementary_test_table.sql`:
- Line 18: The template calls test_entry.pop(table_type, none) which fails
because MiniJinja in dbt-fusion doesn't support dict.pop; remove that mutation
and either strip the key before handing data to the template or change the
template to skip the key when using test_entry (e.g., when iterating
test_entry.items(), ignore entries where key == table_type) so you no longer
rely on pop(table_type, none) inside the template.
In `@macros/utils/run_queries/agate_to_dicts.sql`:
- Line 15: The current use of serializable_row.setdefault(serializable_col_name,
serializable_col_value) creates first-write-wins semantics and breaks previous
overwrite behavior when column names collide after lowercasing; update the logic
to remove any existing key first and then set the value so later columns
overwrite earlier ones (i.e., call serializable_row.pop(serializable_col_name,
None) before using setdefault with serializable_col_name and
serializable_col_value), targeting the
serializable_row/serializable_col_name/serializable_col_value variables in this
macro.
📒 Files selected for processing (11)
- macros/commands/create_elementary_user.sql
- macros/commands/enforce_project_configurations.sql
- macros/edr/data_monitoring/monitors_query/column_monitoring_query.sql
- macros/edr/data_monitoring/monitors_query/table_monitoring_query.sql
- macros/edr/tests/test_exposure_schema_validity.sql
- macros/edr/tests/test_utils/create_elementary_test_table.sql
- macros/utils/cross_db_utils/get_user_creation_query.sql
- macros/utils/data_types/to_primitive.sql
- macros/utils/dict_utils/undefined_dict_keys_to_none.sql
- macros/utils/run_queries/agate_to_dicts.sql
- macros/utils/run_queries/run_query.sql
🚧 Files skipped from review as they are similar to previous changes (4)
- macros/edr/data_monitoring/monitors_query/table_monitoring_query.sql
- macros/edr/data_monitoring/monitors_query/column_monitoring_query.sql
- macros/utils/cross_db_utils/get_user_creation_query.sql
- macros/utils/dict_utils/undefined_dict_keys_to_none.sql
```diff
 {% for parameter in profile_parameters %}
-  {% do parameter_values.update({parameter["name"]: parameter["value"]}) %}
+  {% do parameter_values.pop(parameter["name"], none) %}
+  {% do parameter_values.setdefault(parameter["name"], parameter["value"]) %}
 {% endfor %}
```
🧩 Analysis chain
🌐 Web query:
dbt-fusion minijinja map pop method support
💡 Result:
In dbt-fusion, templating is implemented via dbt-jinja (a Rust extension of MiniJinja), so MiniJinja’s built-ins determine what’s available. [1]
- map: Supported as a built-in filter in MiniJinja (e.g. `users | map(attribute="username")`). [2]
- pop: Not supported as a list/dict method in MiniJinja in the general case — MiniJinja does not expose builtin dict/list methods “in all cases”, so relying on `.pop()` is not portable/expected to work. [3]
Sources: [1] [2] [3]
🌐 Web query:
MiniJinja dict operations available methods documentation
💡 Result:
MiniJinja generally does not expose “dict methods” (like Python/Jinja2’s mydict.items() etc.) on MiniJinja-native maps; instead you use built-in filters / globals for common mapping operations. In the Python bindings specifically, a Python dict passed into MiniJinja may expose some methods, but a map created inside MiniJinja (e.g. via dict(...)) will not necessarily have .items() as a method. [3]
Create / merge a dict (global function)
dict(...): create a mapping (alternative to a dict literal), and can also merge an existing mapping with extra keyword args. [1]
Example:
```jinja
{% set cfg = dict(DEBUG=true, API_URL_PREFIX="/api") %}
{% set merged = dict(cfg, extra_value=2) %}
```

[1]
Iterate / transform dicts (filters)
With builtins enabled, MiniJinja provides filters including:
- `|items`: returns an iterable of (key, value) pairs from a mapping (use this for `{% for k, v in ... %}` loops). [2]
- `|dictsort`: sort a dict for stable iteration order. [2]
Example:
```jinja
{% for k, v in cfg|items %}
{{ k }} = {{ v }}
{% endfor %}

{% for k, v in cfg|dictsort %}
{{ k }} = {{ v }}
{% endfor %}
```

[2]
🌐 Web query:
MiniJinja dict remove key update values filtering
💡 Result:
Key point: MiniJinja templates are (mostly) functional—you typically don’t mutate a dict in-place. Instead you iterate/filter it, or you do the mutation in host code (Rust/Go/etc.) via a custom filter.
1) “Remove a key” (filter it out while iterating)
```jinja
{% for k, v in my_dict|items if k != "secret" %}
{{ k }} = {{ v }}
{% endfor %}
```

MiniJinja's items filter turns a mapping into (key, value) pairs for iteration. [1]
If you want stable ordering, sort first:
```jinja
{% for k, v in my_dict|dictsort if k != "secret" %}
{{ k }} = {{ v }}
{% endfor %}
```

dictsort is like items but sorted. [2]
2) “Update values” (transform during rendering; no in-place update)
If you just need updated output:
```jinja
{% for k, v in my_dict|items %}
{{ k }} = {{ v|lower }}
{% endfor %}
```

(Here you're transforming v at print-time.)
If you need to transform a sequence of things, MiniJinja provides map to apply a filter or pull an attribute. [3]
3) “Filtering” (select/reject)
MiniJinja has select, reject, selectattr, rejectattr for filtering sequences. [4]
For dicts, the usual pattern is: dict → items → filter pairs → iterate.
4) If you truly need a new dict with a key removed / values updated
MiniJinja doesn’t provide a general “rebuild dict from pairs” builtin; do it in host code (recommended), or add a custom filter. MiniJinja explicitly supports custom filters via Environment::add_filter. [4]
Example (Rust) custom filter to remove a key:
```rust
use minijinja::{Environment, Value};
use std::collections::BTreeMap;

fn without_key(map: Value, key: String) -> minijinja::value::Value {
    let mut out = BTreeMap::new();
    if let Some(obj) = map.as_object() {
        for k in obj.keys() {
            if k != key {
                out.insert(k.to_string(), map.get_item(&k).unwrap_or(Value::UNDEFINED));
            }
        }
    }
    Value::from(out)
}

// env.add_filter("without_key", without_key);
```

Then in template:
```jinja
{% set cleaned = my_dict|without_key("secret") %}
{% for k, v in cleaned|items %}...{% endfor %}
```

Sources:
[1] MiniJinja items filter docs (iterate mapping as pairs) (docs.rs)
[2] MiniJinja dictsort filter docs (docs.rs)
[3] MiniJinja map filter docs/examples in source (docs.rs)
[4] MiniJinja filters overview + custom filter registration concept (docs.rs)
Citations:
- 1: https://docs.rs/axominijinja/latest/minijinja/filters/fn.items.html?utm_source=openai
- 2: https://docs.rs/minijinja/latest/x86_64-pc-windows-msvc/minijinja/filters/fn.dictsort.html
- 3: https://docs.rs/minijinja/latest/src/minijinja/filters.rs.html
- 4: https://docs.rs/minijinja/latest/minijinja/filters
Remove .pop() and .setdefault() calls—they are not supported in MiniJinja.
MiniJinja does not expose dict methods (e.g. .pop(), .setdefault(), .items()). The template engine is designed for functional transforms, not in-place mutation. To update parameter_values with selected parameters, either:
- Rebuild the dict via `dict()` with filtered/merged values (if the logic allows), or
- Implement a custom filter in the dbt-fusion host code (Rust) to perform the dict mutation.
The current code will fail at runtime.
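The "rebuild the dict" option can be illustrated in plain Python. The variable names `profile_parameters` and `parameter_values` come from the diff under review; the merge logic itself is a sketch of the functional-rebuild pattern the reviewer suggests, not the actual template code:

```python
profile_parameters = [
    {"name": "user", "value": "elementary"},
    {"name": "role", "value": "loader"},
]
parameter_values = {"user": "default", "warehouse": "wh1"}

# Build one override mapping from the parameter list, then merge once.
# Later values win on key collisions, and the original parameter_values
# dict is never mutated -- the merged result replaces the old binding.
overrides = {p["name"]: p["value"] for p in profile_parameters}
merged = {**parameter_values, **overrides}
```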
Same note as the other comment — dbt-fusion extends vanilla MiniJinja with custom dict methods. setdefault() is already used in the original codebase and works in fusion. CI will validate whether pop() is also supported. If not, I'll switch to a different pattern.
Skipped: comment is from another GitHub bot.
```diff
 {% set serializable_col_value = elementary.agate_val_serialize(col_value) %}
 {% set serializable_col_name = col_name | lower %}
-{% do serializable_row.update({serializable_col_name: serializable_col_value}) %}
+{% do serializable_row.setdefault(serializable_col_name, serializable_col_value) %}
```
🧩 Analysis chain

🏁 Scripts executed (repository: elementary-data/dbt-data-reliability):

```shell
cat -n macros/utils/run_queries/agate_to_dicts.sql
git log --oneline -n 20 -- macros/utils/run_queries/agate_to_dicts.sql
git show 791dd81d:macros/utils/run_queries/agate_to_dicts.sql
git show 791dd81d^:macros/utils/run_queries/agate_to_dicts.sql
git show 791dd81d -- macros/utils/run_queries/agate_to_dicts.sql
git log --oneline --all -- macros/utils/run_queries/agate_to_dicts.sql | head -5
git show 783928fb:macros/utils/run_queries/agate_to_dicts.sql
git log -1 --format="%B" 791dd81d
git log -p 1225ddba -- macros/utils/run_queries/agate_to_dicts.sql | head -80
git show 1225ddba:macros/utils/run_queries/agate_to_dicts.sql
git log --all --grep="pop.*setdefault" --oneline | head -5
```
Jinja setdefault creates first-write-wins behavior; consider pop + setdefault to preserve overwrite semantics
At Line 15, setdefault only assigns a value if the key doesn't already exist. The prior update behavior overwrote all values (last-write-wins). If an agate_row has case-colliding columns after lowercasing (e.g., Foo and foo both become foo), this changes which value is retained.
The commit message for this change mentions "use pop+setdefault for cache overwrite," which would restore the overwrite semantics. If that is the intended behavior, use:
Proposed change
```diff
- {% do serializable_row.setdefault(serializable_col_name, serializable_col_value) %}
+ {% do serializable_row.pop(serializable_col_name, none) %}
+ {% do serializable_row.setdefault(serializable_col_name, serializable_col_value) %}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
{% do serializable_row.pop(serializable_col_name, none) %}
{% do serializable_row.setdefault(serializable_col_name, serializable_col_value) %}
```
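The first-write-wins vs last-write-wins distinction the reviewer raises is easy to verify in plain Python, where setdefault behaves the same way. The column names here are hypothetical, chosen only to show a case collision after lowercasing:

```python
# setdefault: first write wins, later values for an existing key are ignored.
first = {}
first.setdefault("foo", 1)  # from column "Foo" after lowercasing
first.setdefault("foo", 2)  # from column "foo" -- ignored, key already exists

# pop + setdefault: removing the key first makes the last write win,
# matching the old .update() overwrite semantics.
last = {"foo": 1}
last.pop("foo", None)       # drop any existing value
last.setdefault("foo", 2)   # now the set succeeds
```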
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@macros/utils/run_queries/agate_to_dicts.sql` at line 15, The current use of
serializable_row.setdefault(serializable_col_name, serializable_col_value)
creates first-write-wins semantics and breaks previous overwrite behavior when
column names collide after lowercasing; update the logic to remove any existing
key first and then set the value so later columns overwrite earlier ones (i.e.,
call serializable_row.pop(serializable_col_name, None) before using setdefault
with serializable_col_name and serializable_col_value), targeting the
serializable_row/serializable_col_name/serializable_col_value variables in this
macro.
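The two behaviors are easy to model in plain Python, whose dict methods the Jinja2/minijinja map methods mirror (a sketch; the row contents and key names are illustrative, not the macro's actual data):

```python
row = {}

# setdefault alone: first write wins, later case-colliding columns are dropped
row.setdefault("foo", "from Foo")   # column "Foo" after lowercasing
row.setdefault("foo", "from foo")   # column "foo" after lowercasing; silently ignored
assert row["foo"] == "from Foo"

# pop() before setdefault(): restores the old last-write-wins overwrite
row.pop("foo", None)
row.setdefault("foo", "from foo")
assert row["foo"] == "from foo"
```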
Fair point about case-colliding column names after lowercasing. However, in practice agate rows come from database query results where column names are unique pre-lowercasing, and if they do collide, it's an edge case where either behavior (first-write or last-write) is acceptable. I'll leave this as setdefault for now since it's simpler and matches the common case. If CI reveals issues, I'll add pop() before setdefault().
Skipped: comment is from another GitHub bot.
…mings fix)

Three files had dict_merge calls inside for loops, creating local bindings due to Jinja2 scoping that were lost when the loop exited:
- upload_run_results.sql: timing data (execute_started_at, etc.) was lost after the for loop, causing test_timings to fail with 'assert None'
- upload_source_freshness.sql: compile/execute timing data was lost
- get_config_var.sql: runtime_config was empty after iterating vars

Fixed by using pop()+setdefault() for in-place mutation that persists across Jinja2 scope boundaries.

Co-Authored-By: Itamar Hartstein <[email protected]>
…writes

Adds elementary.dict_set(dict, key, value) macro that encapsulates the pop()+setdefault() pattern for in-place key overwrites. Replaces all raw pop+setdefault pairs across 9 files with the cleaner dict_set call.

Three patterns now available:
- dict_set(d, k, v): set a key in-place (works inside for loops)
- d.setdefault(k, v): add a key only if missing
- dict_merge(a, b): create a new merged dict (use outside for loops)

Co-Authored-By: Itamar Hartstein <[email protected]>
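Modeled in Python (whose dict semantics Jinja2 inherits), the three patterns differ in whether they mutate in place and whether existing keys win; this is a sketch of the semantics, not the actual macro bodies:

```python
def dict_set(d, key, value):
    # In-place overwrite: pop the old key, then setdefault the new value.
    # Works inside for-loops because no rebinding is needed.
    d.pop(key, None)
    d.setdefault(key, value)

def dict_merge(base, override):
    # Returns a NEW dict; override values take precedence.
    return {**base, **override}

d = {"a": 1}
dict_set(d, "a", 2)               # in-place overwrite
d.setdefault("b", 3)              # add only if missing
d.setdefault("a", 99)             # ignored: "a" already exists
merged = dict_merge(d, {"a": 9})  # d itself is untouched
assert d == {"a": 2, "b": 3} and merged == {"a": 9, "b": 3}
```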
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@macros/edr/system/system_utils/logs.sql`:
- Around lines 100-103: The code uses global_duration_context_stack.setdefault(thread_id, [elementary.init_duration_context_dict('main')]), which doesn't handle the case where thread_id exists but maps to an empty list. Change the logic to explicitly initialize when the stack is missing or empty: if not global_duration_context_stack.get(thread_id), set global_duration_context_stack[thread_id] = [elementary.init_duration_context_dict('main')], and finally return global_duration_context_stack.get(thread_id). This ensures thread_stack, global_duration_context_stack, and elementary.init_duration_context_dict('main') behave correctly and avoids the re-init regression.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (10)
- macros/commands/create_elementary_user.sql
- macros/commands/enforce_project_configurations.sql
- macros/edr/dbt_artifacts/upload_run_results.sql
- macros/edr/materializations/test/test.sql
- macros/edr/system/system_utils/logs.sql
- macros/edr/tests/test_utils/create_elementary_test_table.sql
- macros/utils/cross_db_utils/generate_elementary_profile_args.sql
- macros/utils/cross_db_utils/get_user_creation_query.sql
- macros/utils/dict_utils/dict_merge.sql
- macros/utils/graph/cache.sql
🚧 Files skipped from review as they are similar to previous changes (3)
- macros/edr/dbt_artifacts/upload_run_results.sql
- macros/edr/materializations/test/test.sql
- macros/utils/cross_db_utils/get_user_creation_query.sql
CodeRabbit correctly identified that setdefault wouldn't reinitialize when thread_id maps to [] (empty list). Using dict_set ensures the same overwrite behavior as the original .update() call. Co-Authored-By: Itamar Hartstein <[email protected]>
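The gotcha generalizes: setdefault only checks key presence, never truthiness, so a key holding an empty list blocks re-initialization. A minimal Python model of the two behaviors (names are illustrative):

```python
stack = {"t1": []}  # thread id exists, but its context stack drained to empty

# setdefault sees the existing key and keeps the empty list
stack.setdefault("t1", ["main_context"])
assert stack["t1"] == []

# an explicit falsiness check (what dict_set-style overwrite achieves)
if not stack.get("t1"):
    stack["t1"] = ["main_context"]
assert stack["t1"] == ["main_context"]
```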
{% do elementary.dict_set(my_dict, 'key', value) %}
#}

{% macro dict_set(dict, key, value) %}
Should be in its own file
(I think you can let the current run finish before pushing fixes, so it will have the chance to finish)
Will do — I'll move dict_set to its own file after the current CI run completes.
  {%- do exceptions.raise_compiler_error("ERROR: A public key must be provided to generate a Snowflake user!") -%}
  {%- endif -%}
- {% do parameter_values.update({"public_key": public_key}) %}
+ {% set parameter_values = elementary.dict_merge(parameter_values, {"public_key": public_key}) %}
Since the main use case is to replace update(), I think it makes sense to also create a dict_update macro that will update from another dict in-place.
(it's still fine to leave dict_merge as well, but feels to me that in this PR dict_update makes more sense?)
Makes sense — I'll create a dict_update(target, source) macro that iterates over the source dict and calls dict_set for each key. Then replace the dict_merge + rebinding pattern with dict_update where the intent is in-place mutation. Will push after the current CI run finishes.
…ict_merge+rebinding with dict_update/dict_set across all files Co-Authored-By: Itamar Hartstein <[email protected]>
…o all callers
dbt-fusion's minijinja engine does not support .pop() on maps.
dict_update now builds a new dict via setdefault() and returns it,
so all callers must rebind: {% set x = elementary.dict_update(x, ...) %}
dict_set (pop+setdefault) is kept for Jinja2-only contexts where
true in-place mutation is needed (for-loops in on-run-end macros).
Co-Authored-By: Itamar Hartstein <[email protected]>
  {% if auth_method == "password" %}
    {% if password is none %}
-     {% do parameter_values.update({"password": elementary.generate_password()}) %}
+     {% set parameter_values = elementary.dict_update(parameter_values, {"password": elementary.generate_password()}) %}
I think this (and others) could be simplified to:
{% do elementary.dict_update(parameter_values, {"password": elementary.generate_password()}) %}
no?
Unfortunately no — that was actually the first approach I tried (commit 8744008d), but it caused the fusion snowflake CI failure.
The problem is: to truly mutate a dict in-place (which {% do %} requires), you need .pop() to remove the old key before .setdefault() adds the new value. But dbt-fusion's minijinja doesn't support .pop() on maps — same as .update().
So dict_update had to be rewritten to build and return a new dict (using only setdefault()), which means callers must rebind with {% set x = ... %}. The {% do %} call would just discard the returned dict.
The only macro that does true in-place mutation is dict_set (via pop()+setdefault()), but it's restricted to Jinja2-only contexts (on-run-end, artifact uploads, etc.) where pop() works. Fusion-rendered code can only use the rebinding pattern.
TL;DR: It's a minijinja limitation — no .update(), no .pop(), so no in-place dict mutation in fusion contexts.
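Under that constraint, the rebinding pattern can be sketched in Python using only the primitives minijinja still supports (iteration and setdefault, no pop or update); the function name mirrors the macro but the body is illustrative:

```python
def dict_update(base, override):
    # Build a fresh dict with setdefault() only: insert override keys first
    # so they win, then backfill the remaining keys from base.
    result = {}
    for k, v in override.items():
        result.setdefault(k, v)
    for k, v in base.items():
        result.setdefault(k, v)  # no-op where override already set the key
    return result

params = {"user": "elementary", "password": None}
# callers must rebind; a bare {% do %}-style call would discard the result
params = dict_update(params, {"password": "s3cret"})
assert params == {"user": "elementary", "password": "s3cret"}
```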
…ge only

- Remove dict_update macro (functionally identical to dict_merge)
- Remove dict_set macro (uses pop() which is unavailable in fusion)
- Replace all dict_update calls with dict_merge
- Replace dict_set metadata_hash calls with plain setdefault()
- Restructure upload_run_results.sql: remove none init, use setdefault in loop
- Restructure on_run_result_query_exceed: return new dict via dict_merge
- Update insert_rows to use on_query_exceed callback return value
- Restructure enforce_project_configurations: list-based failure tracking
- Restructure logs.sql duration accumulation: pop/append parent context
- Restructure generate_elementary_profile_args: rebuild parameters list
- Restructure get_user_creation_query: two-pass CATALOG-first approach
- Keep pop() only in test.sql materialization context (dbt-core only)

Co-Authored-By: Itamar Hartstein <[email protected]>
… (fusion-compat) Co-Authored-By: Itamar Hartstein <[email protected]>
fix: replace dict.update() with fusion-compatible alternatives
Summary
dbt-fusion's minijinja engine (somewhere between preview.120 and preview.143) removed .update() and .pop() from maps, breaking all fusion CI tests with "map has no method named update". This PR replaces all .update() calls across 42 macro files with fusion-compatible alternatives.

Approach (consolidated after iteration)

Only two primitives are used — no pop(), no update(), no custom dict_set/dict_update macros:

- dict_merge(base, override) (macros/utils/dict_utils/dict_merge.sql): Returns a new merged dict. Override values take precedence. Callers must rebind: {% set x = elementary.dict_merge(x, {...}) %}.
- setdefault(key, value): In-place, first-write-wins. Used inside for-loops (where Jinja2 scoping prevents rebinding) and for simple key additions where overwrite is unnecessary.

For cases where neither pattern fits cleanly, workarounds were applied:

- enforce_project_configurations.sql: Replaced the validation_result dict with a validation_failures list + .append() inside the loop, checking length == 0 at the end.
- create_elementary_test_table.sql / get_elementary_test_table.sql: Changed the cache from {table_type: relation} to {table_type: [relation, ...]} with .append() and [-1] retrieval, avoiding dict key overwrite.
- generate_elementary_profile_args.sql: Rebuilds the parameters list with new dicts instead of mutating parameter dicts in-place.
- get_user_creation_query.sql (Dremio): Two-pass loop — CATALOGs first so setdefault() gives them precedence.

Review & Testing Checklist for Human
- Verify setdefault() first-write-wins semantics — Files like agate_to_dicts.sql (line 15), column_monitoring_query.sql (line 13), table_monitoring_query.sql (line 148), to_primitive.sql (line 8), run_query.sql (line 6), and undefined_dict_keys_to_none.sql (line 4) now use setdefault() instead of .update(). This changes from last-write-wins to first-write-wins. If keys can collide (e.g., column names after lowercasing in agate_to_dicts, or recursive dict processing in to_primitive), the last value is silently ignored instead of overwriting. Edge case but worth testing with duplicate column names, nested dicts with overlapping keys, etc.
- Test cache behavior with repeated set_cache calls — cache.sql uses setdefault() at line 3, so subsequent set_cache(same_entry, new_val) calls are silently ignored (first-write-wins). The original used .update(), which overwrote. Verify that caching workflows don't rely on cache key overwriting.
- Test table cache list-based retrieval — create_elementary_test_table.sql now stores relations in a list per table_type (append-only), and get_elementary_test_table.sql reads [-1] (last element). Test scenarios where multiple temp tables are created for the same test+table_type combo to confirm the last one is always retrieved correctly.
- Test validation logic with list-based approach — enforce_project_configurations.sql replaced dict mutation with a validation_failures list. Verify validation failures are still caught correctly and CI/tests fail as expected when configs are invalid.
- Test Dremio user creation query — get_user_creation_query.sql (Dremio) now uses a two-pass loop to prioritize CATALOGs. Verify Dremio user creation grants still work correctly, especially when a database appears both as CATALOG and SOURCE.
- generate_json_schema_test.sql creates new node dicts — This macro creates entirely new node and node.config dicts via nested dict_merge. If downstream code (dbt internals) holds references to the original node object, they won't see the updates. Test that JSON schema tests still work correctly.
- CI completion — Currently 19/20 passing (fusion bigquery ✅, fusion databricks_catalog ✅, all latest_official ✅). Fusion snowflake is still running (~3+ hours, unusually long). Verify it completes successfully.
Test Plan
- latest_official warehouse tests passed.
- Edge cases flagged for verification: case-colliding column names (Foo and foo → both become foo) in agate_to_dicts, and repeated set_cache calls with the same entry key.

Notes
- dbt-fusion removed .update() and .pop() from maps somewhere between preview.120 (working) and preview.143+ (broken). Local testing with preview.143 showed both methods work in simple cases, suggesting the issue is context-specific (sidecar mode, package macros, or specific dict/map types from dbt internals).
- Original failure: error: dbt1501: Failed to render SQL; unknown method: map has no method named update at store_anomaly_test_results.sql:25:21.
- Iterated through dict_update + dict_set macros, discovered pop() is also unsupported in fusion, and consolidated to just dict_merge + setdefault().
- Tested .update(), pop(), and setdefault() on dicts with fusion preview.143 locally; all three worked fine. This confirms the error is not universal but rather tied to how dbt-fusion's sidecar mode handles package macros or specific dict types.

Link to Devin Session: https://app.devin.ai/sessions/5b3e72e477aa4b0b8243c5c0d2b5182f
Requested by: @haritamar
Summary by CodeRabbit

New Features
- New dict_merge utility for fusion-compatible dict merging (returns a new dict, requires rebinding)

Refactor
- Replaced in-place dict mutations (.update()) with fusion-compatible alternatives:
  - dict_merge() + rebinding for most cases
  - setdefault() for simple key additions (first-write-wins semantics)

Breaking Changes (Internal)
- setdefault() semantics: first-write-wins instead of last-write-wins in several files
- set_cache(): subsequent calls with the same entry key are now ignored (first-write-wins)
- Test table cache stores a list of relations per table_type, retrieved via [-1]