
refactor: replace opaque CI_PROFILES_YML secret with committed Jinja2 template + Python renderer (#937)

Merged
haritamar merged 19 commits into master from devin/ele-5262-1772067478
Feb 27, 2026

Conversation

devin-ai-integration bot commented Feb 26, 2026

refactor: replace CI_PROFILES_YML with committed Jinja2 template + Python renderer

Summary

Replaces the opaque base64-encoded CI_PROFILES_YML / CI_PROFILES_YML_FUSION GitHub secrets with a committed, reviewable Jinja2 template (profiles.yml.j2) and a lean JSON secret (CI_WAREHOUSE_SECRETS), rendered by a Python CLI script using click + Jinja2.

What changed

  1. integration_tests/profiles/profiles.yml.j2 (new) — Jinja2 template with Docker targets (postgres, clickhouse, trino, dremio) in plaintext and cloud targets (snowflake, bigquery, redshift, databricks_catalog, athena) using {{ var }} placeholders. All secret-backed credential fields use the | toyaml filter for safe YAML serialization. The elementary profile uses a Jinja for-loop to alias all targets with a _elementary schema suffix.

  2. integration_tests/profiles/generate_profiles.py (new) — Python CLI (click) that:

    • Reads the Jinja2 template
    • Optionally decodes CI_WAREHOUSE_SECRETS (base64 JSON) to populate cloud credentials
    • Uses StrictUndefined when secrets are present (fail-fast on typos) and _NullUndefined for docker-only runs (cloud placeholders resolve to empty strings)
    • Custom toyaml filter: dicts → compact inline YAML {key: val, …}, Undefined → '', everything else → pass-through
  3. test-warehouse.yml — "Write dbt profiles" step now calls generate_profiles.py instead of base64 -d | sed.

  4. test-all-warehouses.yml — Split into two job groups:

    • test-docker: Runs on pull_request only. No secrets needed, no fork approval gate. Matrix: postgres, clickhouse, trino, dremio. Uses exclude pattern to limit latest_pre to postgres only.
    • test-cloud: Keeps existing check-fork-status + approve-fork pattern. Matrix: snowflake, bigquery, redshift, databricks_catalog, athena (spark removed). Fusion variants included for snowflake, bigquery, databricks_catalog. Uses secrets: inherit.
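For context, the renderer described above can be sketched roughly as follows. This is a minimal illustration based on the PR description, not the actual generate_profiles.py: the names _NullUndefined and toyaml follow the summary, and the click CLI wiring and error handling are omitted. Nested dict values are handled naively here.

```python
import base64
import json

from jinja2 import BaseLoader, Environment, StrictUndefined, Undefined


class _NullUndefined(Undefined):
    """Render undefined cloud placeholders as empty strings (docker-only mode)."""

    def __str__(self) -> str:
        return ""


def _toyaml(value):
    # 3-branch filter per the summary: dict -> compact inline YAML,
    # Undefined -> empty string, everything else -> pass-through.
    if isinstance(value, dict):
        return "{" + ", ".join(f"{k}: {v}" for k, v in value.items()) + "}"
    if isinstance(value, Undefined):
        return ""
    return value


def render_profiles(template_text, secrets_b64=None):
    """Render the profiles template, optionally populated from a base64 JSON secret."""
    context = {}
    if secrets_b64:
        decoded = json.loads(base64.b64decode(secrets_b64))
        if not isinstance(decoded, dict):
            raise ValueError("secret must decode to a JSON object")
        context.update(decoded)
    env = Environment(
        loader=BaseLoader(),
        # Fail fast on typos when secrets are present; blank out otherwise.
        undefined=StrictUndefined if secrets_b64 else _NullUndefined,
        keep_trailing_newline=True,
    )
    env.filters["toyaml"] = _toyaml
    return env.from_string(template_text).render(**context)
```

With secrets present, a misspelled placeholder raises immediately (StrictUndefined); without secrets, cloud placeholders render as empty strings so docker targets still parse.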

Updates since last revision

  • Simplified _yaml_inline / toyaml filter: Reduced from a complex type-aware implementation to a simple 3-branch function (dict → compact YAML, Undefined → empty string, everything else → pass-through). The previous version handled edge cases like YAML-ambiguous strings ("yes", "null") that don't appear in real credential fields.
  • Applied | toyaml filter to all secret-backed fields in the template, not just bigquery_keyfile. This ensures dicts are serialized correctly and Undefined values render cleanly in docker-only mode.
  • Fixed matrix duplicate: Replaced include pattern (which created duplicate postgres entries when inputs.dbt-version was provided) with exclude pattern that limits latest_pre to only postgres.
  • Fixed dremio duplicate key error: The dbt-dremio adapter defines aliases (object_storage_source → datalake, object_storage_path → root_path, dremio_space → database, dremio_space_folder → schema). Having both the alias and the canonical name caused "Got duplicate keys" errors. Resolved by keeping only the canonical names.
  • Fixed dremio profile: Added missing fields required by the dbt-dremio adapter and Docker Dremio setup: datalake, root_path, database.
  • Fixed bigquery profile: Added missing location: US and priority: interactive fields.
  • Added auth_type: oauth to the databricks_catalog profile (required when using client_id/client_secret).
  • Removed spark target from the workflow matrix.
  • Added isinstance(decoded, dict) validation and binascii.Error handling in the secret decoder.
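The exclude pattern mentioned above follows GitHub Actions matrix semantics: the full cross product of the axes is generated first, then any combination matching an exclude entry is dropped. A small Python sketch of that behavior (the axis values are assumptions taken from the test-docker description; this illustrates the semantics, it is not the workflow itself):

```python
from itertools import product


def expand_matrix(axes, exclude):
    # Cross product of all axis values, minus any combo matching an exclude entry.
    combos = [dict(zip(axes, values)) for values in product(*axes.values())]
    return [
        c for c in combos
        if not any(all(c.get(k) == v for k, v in ex.items()) for ex in exclude)
    ]


axes = {
    "dbt-version": ["latest_official", "latest_pre"],
    "warehouse-type": ["postgres", "clickhouse", "trino", "dremio"],
}
# Limit latest_pre to postgres only, as described above.
exclude = [
    {"dbt-version": "latest_pre", "warehouse-type": w}
    for w in ("clickhouse", "trino", "dremio")
]

jobs = expand_matrix(axes, exclude)
```

Unlike the previous include pattern, exclude cannot create duplicates: it only ever removes rows from the cross product.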

CI status (latest run)

16/18 jobs pass. The 2 failures are pre-existing flaky issues unrelated to this PR:

  • latest_official/bigquery — Transient BigQuery infrastructure error: "Streaming data from ... is temporarily unavailable". The profile loads correctly; the error occurs during model execution. fusion/bigquery passes on the same commit.
  • latest_official/dremio — 1 test out of 178 fails with a seed error (test_sum_failed_row_count). 177/178 dremio tests pass, confirming the profile config is correct.

All other targets pass: postgres, clickhouse, trino, snowflake (fusion + latest_official), bigquery (fusion), redshift, databricks_catalog (fusion + latest_official), athena.

Review & Testing Checklist for Human

⚠️ The profile templates were reconstructed from CI logs and docker-compose configs — NOT decoded from the actual current secret. Dremio and BigQuery were already found to have missing fields (now fixed). Other targets may still have discrepancies.

  • Verify each cloud target profile matches the actual secret — decode the current CI_PROFILES_YML and compare field-by-field against profiles.yml.j2. Pay special attention to: snowflake (role/warehouse names?), redshift (port/dbname?), athena (s3 paths?), databricks (auth_type: oauth correct?).
  • Create the CI_WAREHOUSE_SECRETS GitHub secret — must be a base64-encoded JSON object with keys matching the template placeholders (e.g. SNOWFLAKE_ACCOUNT, BIGQUERY_KEYFILE, etc.). The BigQuery keyfile should be a nested JSON object (not a string).
  • Coordinate secret migration — this PR will break CI until CI_WAREHOUSE_SECRETS is created. Consider creating the new secret first, then merging this PR, then deleting the old CI_PROFILES_YML / CI_PROFILES_YML_FUSION secrets.
  • Test the trigger split logic — verify that:
    • Internal PRs run both test-docker (via pull_request) and test-cloud (via pull_request).
    • Fork PRs run test-docker (via pull_request) and test-cloud (via pull_request_target with approval).
    • test-docker doesn't run twice for internal PRs (skipped on pull_request_target).
  • Verify docker tests pass without secrets — the test-docker job doesn't pass secrets: inherit, so CI_WAREHOUSE_SECRETS will be empty. Confirm that generate_profiles.py produces a valid profiles.yml with empty cloud target fields and that docker tests only reference docker targets.
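When creating the CI_WAREHOUSE_SECRETS secret, the expected shape can be sketched as below. The key names (SNOWFLAKE_ACCOUNT, BIGQUERY_KEYFILE) are the examples given in the checklist; the full key set must match the template's placeholders. The decoder mirrors the validation this PR describes (binascii.Error handling and the isinstance check).

```python
import base64
import binascii
import json


def encode_warehouse_secrets(secrets: dict) -> str:
    """Base64-encode a JSON object in the shape CI_WAREHOUSE_SECRETS expects."""
    return base64.b64encode(json.dumps(secrets).encode()).decode()


def decode_warehouse_secrets(secrets_b64: str) -> dict:
    """Decode the secret, failing loudly on invalid base64 or non-object JSON."""
    try:
        raw = base64.b64decode(secrets_b64, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"secret is not valid base64: {exc}") from exc
    decoded = json.loads(raw)
    if not isinstance(decoded, dict):
        raise ValueError("secret must decode to a JSON object")
    return decoded


# Example keys with hypothetical values; note BIGQUERY_KEYFILE is a
# nested JSON object, not a string.
example = {
    "SNOWFLAKE_ACCOUNT": "my-account",
    "BIGQUERY_KEYFILE": {"type": "service_account", "project_id": "my-project"},
}
```

A round trip through encode/decode should return the original dict unchanged, which is an easy local sanity check before pasting the value into GitHub.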

Notes

  • Fusion and non-fusion runs now use the same template (no separate CI_PROFILES_YML_FUSION needed).
  • Hardcoded infrastructure values: location: US, priority: interactive (BigQuery), datalake: S3Source, database: elementary_ci (Dremio). If these ever change, the template must be updated.
  • The dremio profile uses canonical dbt-dremio field names only (datalake, root_path, database, schema) — no aliases. If the adapter's field mapping changes, this may need updating.
  • The _yaml_inline function returns non-string values for pass-through cases (ints, bools, strings). This is intentional — Jinja2 handles the type coercion correctly when inserting into the YAML template.
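The pass-through behavior noted above relies on Jinja2 stringifying non-string filter results when emitting output. A tiny illustration (a stand-in pass-through filter, not the project's actual _yaml_inline):

```python
from jinja2 import Environment

env = Environment()
# Pass-through filter: returns the value unchanged; Jinja2 coerces it
# with str() when writing the rendered output.
env.filters["toyaml"] = lambda value: value

rendered = env.from_string("threads: {{ t | toyaml }}").render(t=8)
# The int 8 ends up as the text "8" in the rendered YAML line.
```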

Link to Devin run: https://app.devin.ai/sessions/fe5ef28048b84b139dfdd9c9d15b2765
Requested by: @haritamar
Resolves: ELE-5262

Summary by CodeRabbit

  • Chores
    • Added Docker-based CI job and renamed main cloud test job; added secret inheritance for cloud tests.
    • Introduced fork PR detection and an approval gate to control CI execution.
    • Narrowed warehouse test matrix to supported databases: Snowflake, BigQuery, Redshift, Databricks Catalog, Athena.
    • Reworked dbt profiles generation with a new Python CLI script using Jinja2 templates and a standardized profiles template.

…te + envsubst

- Add profiles.yml.template with plaintext docker targets and ${VAR} placeholders for cloud targets
- Update test-warehouse.yml to use CI_WAREHOUSE_SECRETS JSON secret with envsubst substitution
- Split test-all-warehouses.yml into test-docker (pull_request, no secrets/approval) and test-cloud (with fork approval gate)
- BigQuery uses keyfile path instead of inline keyfile_json
- Guard secret extraction behind 'if [ -n "$SECRETS_JSON" ]' for fork PRs

Resolves: ELE-5262
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
linear bot commented Feb 26, 2026

@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


@github-actions
Contributor

👋 @devin-ai-integration[bot]
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

coderabbitai bot commented Feb 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough


Adds a docker-based test job, renames and narrows cloud test jobs with fork detection/approval gates, and replaces inline base64 profile generation with a Jinja2-templated Python renderer that consumes CI secrets to produce dbt profiles for CI runs.

Changes

  • GitHub Actions workflow (.github/workflows/test-all-warehouses.yml): Added public test-docker job; renamed test → test-cloud; added check-fork-status (outputs is_fork, should_skip) and approve-fork gates; narrowed the test-cloud warehouse matrix and added fusion include entries; adjusted dbt-data-reliability-ref handling and added secrets: inherit.
  • CI profile step (.github/workflows/test-warehouse.yml): Replaced inline/base64 PROFILES_YML handling with an invocation of generate_profiles.py; switched to CI_WAREHOUSE_SECRETS-driven rendering and removed the manual mkdir/base64 decode steps.
  • Profile generation tooling & template (integration_tests/profiles/generate_profiles.py, integration_tests/profiles/profiles.yml.j2): Added a CLI script that renders profiles.yml from a Jinja2 template with optional base64 JSON secrets (env var default CI_WAREHOUSE_SECRETS), custom Undefined behavior, and a YAML-inlining filter; added a Jinja2 template defining local and cloud targets (postgres, clickhouse, trino, dremio, snowflake, bigquery, redshift, databricks_catalog, spark, athena) and an elementary profile variant.

Sequence Diagram(s)

sequenceDiagram
    participant GH as GitHub Actions
    participant Workflow as Workflow Jobs
    participant Script as generate_profiles.py
    participant Template as profiles.yml.j2
    participant Secrets as CI Secrets

    GH->>Workflow: trigger (push / PR / PR_target)
    Workflow->>Workflow: run check-fork-status
    alt fork approved
        Workflow->>Workflow: approve-fork gate passes
    else fork skipped
        Workflow->>GH: skip cloud jobs
    end
    Workflow->>Secrets: read CI_WAREHOUSE_SECRETS (env)
    Workflow->>Script: run generate_profiles.py --template Template --output ~/.dbt/profiles.yml --schema-name ...
    Script->>Secrets: decode base64 JSON (if present)
    Script->>Template: render Jinja2 with context
    Script->>Workflow: write rendered profiles.yml
    Workflow->>Workflow: run test-docker or test-cloud jobs using generated profiles

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble templates, secrets in a heap,
CI forks tip-toe while jobs wake from sleep,
Python stitches profiles with a Jinja2 cheer,
Docker hops in, cloud tests now appear,
A little rabbit clap — the pipeline's clear!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 40.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Title check: ✅ Passed. The title accurately reflects the main change: replacing an opaque CI_PROFILES_YML secret with a committed Jinja2 template and Python renderer, which matches the primary refactoring described across multiple files.
  • Description check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.


…ility

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
.github/workflows/test-warehouse.yml (1)

135-136: Harden BigQuery keyfile permissions.

The keyfile is written to a predictable /tmp path with default umask. Use mktemp and chmod 600 to minimize exposure.

🔐 Proposed hardening
-            echo "$DECODED" | jq -r '.BIGQUERY_KEYFILE' > /tmp/bigquery_keyfile.json
-            export BIGQUERY_KEYFILE_PATH=/tmp/bigquery_keyfile.json
+            BIGQUERY_KEYFILE_PATH="$(mktemp /tmp/bigquery_keyfile.XXXXXX.json)"
+            echo "$DECODED" | jq -r '.BIGQUERY_KEYFILE' > "$BIGQUERY_KEYFILE_PATH"
+            chmod 600 "$BIGQUERY_KEYFILE_PATH"
+            export BIGQUERY_KEYFILE_PATH
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-warehouse.yml around lines 135 - 136, Replace the
predictable /tmp path and default permissions with a secure temporary file:
create a temp file via mktemp, write the decoded BIGQUERY_KEYFILE into that temp
file (using the current jq pipeline), set strict permissions with chmod 600 on
that temp file, and then export BIGQUERY_KEYFILE_PATH pointing to the
mktemp-created file so the rest of the workflow uses the secured keyfile.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/test-all-warehouses.yml:
- Line 116: The CI matrix omits "spark" from the warehouse-type array used by
the test-all-warehouses workflow; add "spark" to the array [snowflake, bigquery,
redshift, databricks_catalog, athena] so it becomes [snowflake, bigquery,
redshift, databricks_catalog, athena, spark], ensuring Spark receives CI
coverage consistent with the conditional logic in test-warehouse.yml and the
profiles.yml.template definitions.

In @.github/workflows/test-warehouse.yml:
- Line 137: The current step uses eval on DECODED which allows shell injection
via malicious JSON keys; replace the eval pipeline so it extracts key/value
pairs from DECODED (the variable DECODED and the current jq expression),
validate each key against the identifier regex /^[A-Z_][A-Z0-9_]*$/ and only
export keys that pass, and perform the export without eval (e.g., iterate over
jq output lines and call export with the validated key and properly quoted
value). Ensure you reference DECODED and the jq extraction in your change and
reject or skip any keys that fail the regex.

---

Nitpick comments:
In @.github/workflows/test-warehouse.yml:
- Around line 135-136: Replace the predictable /tmp path and default permissions
with a secure temporary file: create a temp file via mktemp, write the decoded
BIGQUERY_KEYFILE into that temp file (using the current jq pipeline), set strict
permissions with chmod 600 on that temp file, and then export
BIGQUERY_KEYFILE_PATH pointing to the mktemp-created file so the rest of the
workflow uses the secured keyfile.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f0307f3 and 7523c80.

📒 Files selected for processing (3)
  • .github/workflows/test-all-warehouses.yml
  • .github/workflows/test-warehouse.yml
  • integration_tests/profiles/profiles.yml.template

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
.github/workflows/test-warehouse.yml (2)

126-127: Simplify SECRETS_JSON expression.

The format('{0}', secrets.CI_WAREHOUSE_SECRETS) is redundant—it just converts the secret to a string, which it already is. A direct reference with fallback is cleaner.

♻️ Proposed simplification
           # New lean JSON secret (base64 encoded)
-          SECRETS_JSON: ${{ secrets.CI_WAREHOUSE_SECRETS && format('{0}', secrets.CI_WAREHOUSE_SECRETS) || '' }}
+          SECRETS_JSON: ${{ secrets.CI_WAREHOUSE_SECRETS || '' }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-warehouse.yml around lines 126 - 127, The
SECRETS_JSON environment expression is unnecessarily using format('{0}', ...) to
coerce the secret to a string; update the SECRETS_JSON assignment to directly
reference secrets.CI_WAREHOUSE_SECRETS with a fallback (i.e., use the
OR/fallback expression) so SECRETS_JSON: ${{ secrets.CI_WAREHOUSE_SECRETS || ''
}} (replace the current format(...) usage in the SECRETS_JSON line).

163-166: Fork PR path relies on undefined environment variables.

When neither SECRETS_JSON nor PROFILES_YML is present (fork PRs), envsubst runs but only SCHEMA_NAME is exported. Cloud target placeholders (e.g., ${SNOWFLAKE_PASSWORD}) will be replaced with empty strings, producing invalid YAML for those targets. This is fine if fork PRs only run docker targets, but the comment should clarify this limitation.

Consider either:

  1. Using envsubst with an explicit variable list to only substitute known-safe variables, or
  2. Adding a comment clarifying that only docker targets are functional in this path.
📝 Option 1: Explicit variable substitution
             # Fork PRs: no secrets available, but docker targets are still runnable.
-            envsubst < "${{ github.workspace }}/dbt-data-reliability/integration_tests/profiles/profiles.yml.template" > ~/.dbt/profiles.yml
+            # Only substitute SCHEMA_NAME; leave other placeholders intact to avoid empty credentials
+            envsubst '$SCHEMA_NAME' < "${{ github.workspace }}/dbt-data-reliability/integration_tests/profiles/profiles.yml.template" > ~/.dbt/profiles.yml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-warehouse.yml around lines 163 - 166, The fork-PR
path runs envsubst against the profiles template but only exports SCHEMA_NAME,
causing cloud-target placeholders like ${SNOWFLAKE_PASSWORD} to be blank; update
the workflow to either (A) call envsubst with an explicit variable list (e.g.,
envsubst '$SCHEMA_NAME $SAFE_VAR2 ...') so only known-safe variables are
replaced, or (B) add a clear comment near the envsubst/else block stating that
this fork PR branch only produces valid docker targets and cloud targets will be
invalid because secrets (SECRETS_JSON/PROFILES_YML) are not available; change
references: envsubst, SCHEMA_NAME, SECRETS_JSON, PROFILES_YML, and the fork PR
path.
.github/workflows/test-all-warehouses.yml (1)

117-127: Verify fusion dbt-version entries don't duplicate base matrix.

Similar to test-docker, when inputs.dbt-version is "fusion", the include entries for snowflake/bigquery/databricks_catalog will match existing base matrix entries, potentially causing duplicate workflow runs.

However, if the intent is to always run fusion in addition to the base version for these warehouses, this is correct behavior. Consider adding a brief comment clarifying whether fusion is meant to be additive or conditional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-all-warehouses.yml around lines 117 - 127, The matrix
include entries use inputs.dbt-version || 'fusion' for warehouse-type values
(snowflake, bigquery, databricks_catalog) which can duplicate base matrix runs
when inputs.dbt-version == "fusion"; either (A) make the includes conditional so
they only add the fusion entries when inputs.dbt-version != 'fusion' (e.g.,
guard the include with a conditional expression or separate include for
fusion-only) or (B) if fusion is intentionally additive, add a clarifying
comment above the matrix include explaining that inputs.dbt-version "fusion"
should run in addition to the base matrix; update references to
inputs.dbt-version and the matrix include entries (warehouse-type: snowflake,
bigquery, databricks_catalog) accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/test-all-warehouses.yml:
- Around line 50-52: The include block unconditionally adds a postgres entry
which duplicates the base matrix when inputs.dbt-version is provided; remove the
include and instead set the base matrix's dbt-version value to use the fallback
expression (dbt-version: ${{ inputs.dbt-version || 'latest_pre' }}) so the
matrix only contains a single postgres entry and only uses "latest_pre" when no
input was supplied; update references to include, dbt-version, and
inputs.dbt-version accordingly.

In @.github/workflows/test-warehouse.yml:
- Around line 144-147: The script blindly writes the jq result of
'.BIGQUERY_KEYFILE' into BIGQUERY_KEYFILE_PATH which will contain the literal
"null" if the key is absent; change the flow to first extract the key into a
variable (e.g., capture jq -r '.BIGQUERY_KEYFILE' into a shell var), check that
the var is non-empty and not "null" before creating/writing the temp file and
setting BIGQUERY_KEYFILE_PATH, and only export BIGQUERY_KEYFILE_PATH when the
key is present; reference the existing symbols BIGQUERY_KEYFILE_PATH, DECODED,
and the jq invocation to locate and update the logic.

---

Nitpick comments:
In @.github/workflows/test-all-warehouses.yml:
- Around line 117-127: The matrix include entries use inputs.dbt-version ||
'fusion' for warehouse-type values (snowflake, bigquery, databricks_catalog)
which can duplicate base matrix runs when inputs.dbt-version == "fusion"; either
(A) make the includes conditional so they only add the fusion entries when
inputs.dbt-version != 'fusion' (e.g., guard the include with a conditional
expression or separate include for fusion-only) or (B) if fusion is
intentionally additive, add a clarifying comment above the matrix include
explaining that inputs.dbt-version "fusion" should run in addition to the base
matrix; update references to inputs.dbt-version and the matrix include entries
(warehouse-type: snowflake, bigquery, databricks_catalog) accordingly.

In @.github/workflows/test-warehouse.yml:
- Around line 126-127: The SECRETS_JSON environment expression is unnecessarily
using format('{0}', ...) to coerce the secret to a string; update the
SECRETS_JSON assignment to directly reference secrets.CI_WAREHOUSE_SECRETS with
a fallback (i.e., use the OR/fallback expression) so SECRETS_JSON: ${{
secrets.CI_WAREHOUSE_SECRETS || '' }} (replace the current format(...) usage in
the SECRETS_JSON line).
- Around line 163-166: The fork-PR path runs envsubst against the profiles
template but only exports SCHEMA_NAME, causing cloud-target placeholders like
${SNOWFLAKE_PASSWORD} to be blank; update the workflow to either (A) call
envsubst with an explicit variable list (e.g., envsubst '$SCHEMA_NAME $SAFE_VAR2
...') so only known-safe variables are replaced, or (B) add a clear comment near
the envsubst/else block stating that this fork PR branch only produces valid
docker targets and cloud targets will be invalid because secrets
(SECRETS_JSON/PROFILES_YML) are not available; change references: envsubst,
SCHEMA_NAME, SECRETS_JSON, PROFILES_YML, and the fork PR path.


📥 Commits

Reviewing files that changed from the base of the PR and between 7523c80 and 4810531.

📒 Files selected for processing (3)
  • .github/workflows/test-all-warehouses.yml
  • .github/workflows/test-warehouse.yml
  • integration_tests/profiles/profiles.yml.template
🚧 Files skipped from review as they are similar to previous changes (1)
  • integration_tests/profiles/profiles.yml.template

…CRETS_JSON, explicit envsubst for fork path, fusion matrix comment

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/test-all-warehouses.yml (1)

117-124: ⚠️ Potential issue | 🟡 Minor

Fusion include expression does not always add Fusion runs.

"${{ inputs.dbt-version || 'fusion' }}" resolves to the input version when provided, so these rows won’t schedule fusion in that case.

🔧 Proposed fix
-          - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          - dbt-version: "fusion"
             warehouse-type: snowflake
-          - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          - dbt-version: "fusion"
             warehouse-type: bigquery
...
-          - dbt-version: "${{ inputs.dbt-version || 'fusion' }}"
+          - dbt-version: "fusion"
             warehouse-type: databricks_catalog

Also applies to: 129-130

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-all-warehouses.yml around lines 117 - 124, The
include entries intended to always schedule Fusion runs currently use
dbt-version: "${{ inputs.dbt-version || 'fusion' }}" which picks the provided
input and thus doesn't force a 'fusion' run; update the two include entries (the
ones with warehouse-type: snowflake and warehouse-type: bigquery) to set
dbt-version to the literal 'fusion' (e.g., dbt-version: 'fusion') so Fusion runs
are always added; make the same change for the duplicate include lines
referenced later (the other pair at the same pattern).
♻️ Duplicate comments (1)
.github/workflows/test-all-warehouses.yml (1)

50-52: ⚠️ Potential issue | 🟡 Minor

Postgres include can become redundant when inputs.dbt-version is provided.

With a non-empty inputs.dbt-version, this include entry resolves to the same (dbt-version, warehouse-type) combination already produced by the base matrix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/test-all-warehouses.yml around lines 50 - 52, The matrix
include entry that sets dbt-version: "${{ inputs.dbt-version || 'latest_pre' }}"
and warehouse-type: postgres can duplicate a base matrix row when
inputs.dbt-version is provided; update the workflow so this postgres include is
only added when inputs.dbt-version is empty/unset (or otherwise differs from the
base matrix). Concretely, wrap or gate the include entry with a condition that
checks inputs.dbt-version (or refactor the matrix generation to avoid adding the
same (dbt-version, warehouse-type) pair twice) so the include for postgres is
skipped when inputs.dbt-version is supplied.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In @.github/workflows/test-all-warehouses.yml:
- Around line 117-124: The include entries intended to always schedule Fusion
runs currently use dbt-version: "${{ inputs.dbt-version || 'fusion' }}" which
picks the provided input and thus doesn't force a 'fusion' run; update the two
include entries (the ones with warehouse-type: snowflake and warehouse-type:
bigquery) to set dbt-version to the literal 'fusion' (e.g., dbt-version:
'fusion') so Fusion runs are always added; make the same change for the
duplicate include lines referenced later (the other pair at the same pattern).

---

Duplicate comments:
In @.github/workflows/test-all-warehouses.yml:
- Around line 50-52: The matrix include entry that sets dbt-version: "${{
inputs.dbt-version || 'latest_pre' }}" and warehouse-type: postgres can
duplicate a base matrix row when inputs.dbt-version is provided; update the
workflow so this postgres include is only added when inputs.dbt-version is
empty/unset (or otherwise differs from the base matrix). Concretely, wrap or
gate the include entry with a condition that checks inputs.dbt-version (or
refactor the matrix generation to avoid adding the same (dbt-version,
warehouse-type) pair twice) so the include for postgres is skipped when
inputs.dbt-version is supplied.


📥 Commits

Reviewing files that changed from the base of the PR and between 4810531 and 91826f1.

📒 Files selected for processing (2)
  • .github/workflows/test-all-warehouses.yml
  • .github/workflows/test-warehouse.yml

@devin-ai-integration
Contributor Author

The fusion include expression (${{ inputs.dbt-version || 'fusion' }}) is pre-existing code from master — this PR only moved it from the old test job into the new test-cloud job and added a clarifying comment. The behavior is unchanged.

Fixing this would be a separate concern outside the scope of this refactoring PR. Happy to address it in a follow-up if needed.

…ipt + Jinja2 template

- Add generate_profiles.py (click CLI) for rendering profiles from secrets
- Convert profiles.yml.template to profiles.yml.j2 (Jinja2)
- Use service-account-json method for BigQuery (no temp keyfile needed)
- Elementary profile schema uses _elementary suffix via YAML merge keys
- Simplify test-warehouse.yml Write dbt profiles step

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…llback

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
integration_tests/profiles/generate_profiles.py (1)

17-27: Fail fast on missing secrets in cloud mode (keep null-undefined only for docker mode).

Using _NullUndefined unconditionally can hide missing/typoed secret keys. Consider switching to StrictUndefined when secrets_b64 is present.

🔧 Suggested refactor
-from jinja2 import BaseLoader, Environment, Undefined
+from jinja2 import BaseLoader, Environment, StrictUndefined, Undefined
@@
-    env = Environment(
+    undefined_cls = _NullUndefined if not secrets_b64 else StrictUndefined
+    env = Environment(
         loader=BaseLoader(),
-        undefined=_NullUndefined,
+        undefined=undefined_cls,
         keep_trailing_newline=True,
     )

Also applies to: 118-121
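The suggested conditional can be sketched as a small standalone example. This is a sketch of the reviewer's proposal, not the PR's actual code; the `build_env` helper name and the `secrets_b64` parameter are assumptions based on the review comment:

```python
from typing import Optional

from jinja2 import BaseLoader, Environment, StrictUndefined, Undefined


class _NullUndefined(Undefined):
    """Render undefined cloud placeholders as empty strings (docker-only runs)."""

    def __str__(self) -> str:
        return ""

    def __bool__(self) -> bool:
        return False


def build_env(secrets_b64: Optional[str]) -> Environment:
    # Fail fast on missing or typoed secret keys when secrets are present;
    # otherwise let cloud placeholders resolve to empty strings.
    undefined_cls = StrictUndefined if secrets_b64 else _NullUndefined
    return Environment(
        loader=BaseLoader(),
        undefined=undefined_cls,
        keep_trailing_newline=True,
    )


# Docker-only mode: an undefined cloud credential renders as an empty string.
print(build_env(None).from_string("user: {{ snowflake_user }}").render())
```

With a non-empty `secrets_b64`, the same `{{ snowflake_user }}` reference would raise `jinja2.exceptions.UndefinedError` instead of silently emitting an empty value, which is the fail-fast behavior the review asks for.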

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@integration_tests/profiles/generate_profiles.py` around lines 17 - 27, The
template Undefined subclass _NullUndefined currently silences missing variables
for all runs; change behavior to use StrictUndefined when secrets_b64 is present
so missing/typoed secret keys fail fast. Modify the code that constructs the
Jinja environment (where _NullUndefined is used) to choose StrictUndefined if
the variable secrets_b64 (or equivalent flag) is non-empty/present, otherwise
keep using _NullUndefined for docker-only mode; also apply the same conditional
swap where _NullUndefined is used again later (the occurrences around the second
mention). Ensure you import StrictUndefined and preserve the existing
__str__/__bool__ semantics only for the docker path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration_tests/profiles/profiles.yml.j2`:
- Around line 50-55: The YAML template uses raw Jinja2 variables for
secret-backed credentials (e.g., snowflake_account, snowflake_user,
snowflake_password, snowflake_role, snowflake_database, snowflake_warehouse)
which can be mis-parsed by YAML; update the template to pipe each secret-backed
field through the toyaml filter (same approach already used for
bigquery_keyfile) so values are emitted as safe YAML scalars and avoid
boolean/number coercion—apply this change to the referenced credential blocks
and the other listed secret fields (lines for bigquery_keyfile and the ranges
noted) to ensure consistent use of | toyaml across all secrets.
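A minimal filter matching the behavior this comment asks for (every secret-backed value emitted as a safe YAML node) could look like the sketch below. Note this is an illustration, not the PR's implementation: per the PR summary, the actual `toyaml` passes non-dict scalars through unchanged, while this version JSON-quotes them to avoid the boolean/number coercion the reviewer warns about:

```python
import json

from jinja2 import Undefined


def toyaml(value):
    """Serialize a template value as a safe inline YAML node.

    Dicts become compact flow-style YAML ({key: val, ...}); undefined
    placeholders become empty strings; other scalars are JSON-encoded,
    which is also valid YAML and keeps values like "no" or "007" from
    being coerced into booleans or numbers.
    """
    if isinstance(value, Undefined):
        return ""
    if isinstance(value, dict):
        inner = ", ".join(f"{k}: {json.dumps(v)}" for k, v in value.items())
        return "{" + inner + "}"
    return json.dumps(value)


# Registered on the Jinja environment as: env.filters["toyaml"] = toyaml
print(toyaml("no"))                     # "no" stays a quoted string
print(toyaml({"host": "db", "port": 5432}))
```

JSON is chosen here because every JSON document is valid YAML, so `json.dumps` gives safe scalar quoting without pulling in a YAML library.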


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 91826f1 and bc9ff66.

📒 Files selected for processing (3)
  • .github/workflows/test-warehouse.yml
  • integration_tests/profiles/generate_profiles.py
  • integration_tests/profiles/profiles.yml.j2

devin-ai-integration bot and others added 4 commits February 26, 2026 01:52
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
….Error

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
- dremio: add datalake, database, object_storage_source, object_storage_path, root_path, dremio_space, dremio_space_folder
- bigquery: add location (US) and priority (interactive)

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
@devin-ai-integration devin-ai-integration bot changed the title refactor: replace opaque CI_PROFILES_YML secret with committed template + envsubst refactor: replace opaque CI_PROFILES_YML secret with committed Jinja2 template + Python renderer Feb 26, 2026
devin-ai-integration bot and others added 4 commits February 26, 2026 09:39
…' error

dbt-dremio maps object_storage_source->datalake, object_storage_path->root_path,
dremio_space->database, dremio_space_folder->schema. Having both causes duplicate
key errors. Keep only canonical field names.

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
… matrix entry

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
devin-ai-integration bot and others added 5 commits February 26, 2026 22:18
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…biguous strings

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…ields

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
…n toyaml

Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
Co-Authored-By: Itamar Hartstein <haritamar@gmail.com>
@haritamar haritamar merged commit 03bc6dd into master Feb 27, 2026
19 of 21 checks passed
@haritamar haritamar deleted the devin/ele-5262-1772067478 branch February 27, 2026 00:01