Fix test result sample count differing from failure count#933
Fix test result sample count differing from failure count#933MikaKerman wants to merge 3 commits intomasterfrom
Conversation
…istic queries Truncate result_rows to match dbt's failure count when re-execution of non-deterministic queries (e.g. TABLESAMPLE) returns more rows than the original run. Also add a description note when sample_percent is used. Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
👋 @MikaKerman |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review infoConfiguration used: defaults Review profile: CHILL Plan: Pro Disabled knowledge base sources:
📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughThese changes implement result row truncation and sampling annotations for dbt tests. A new integration test validates that result rows don't exceed reported failure counts. The test.sql macro truncates result rows to match dbt failure counts, and handle_tests_results.sql annotates descriptions when sampling parameters are present. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration_tests/tests/test_result_rows_count.py`:
- Around line 1-88: This file fails Black formatting; run Black (or your
project's pre-commit formatter) on the test module containing functions
test_result_rows_do_not_exceed_failures,
test_sample_percent_adds_description_note, and
test_no_sample_percent_no_description_note to apply the canonical formatting and
update the commit so CI's pre-commit checks pass.
In `@macros/edr/tests/on_run_end/handle_tests_results.sql`:
- Around line 66-71: The concatenation that appends the sample note to
elementary_test_results_row['test_results_description'] can introduce a leading
space when the original description is empty; change the update logic in the
block that checks test_params and sample_percent so you build the new
description without unconditional leading whitespace (e.g., compute existing =
elementary_test_results_row.get('test_results_description') or '' and then
append either (" Note: ..." if existing != '' else "Note: ...") or join
non-empty parts), and update elementary_test_results_row with that properly
trimmed string.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
integration_tests/dbt_project/macros/test_not_null_sampled.sqlintegration_tests/tests/test_result_rows_count.pymacros/edr/materializations/test/test.sqlmacros/edr/tests/on_run_end/handle_tests_results.sql
| import json | ||
|
|
||
| import pytest | ||
| from dbt_project import DbtProject | ||
|
|
||
| COLUMN_NAME = "some_column" | ||
|
|
||
| SAMPLES_QUERY = """ | ||
| with latest_elementary_test_result as ( | ||
| select id | ||
| from {{{{ ref("elementary_test_results") }}}} | ||
| where lower(table_name) = lower('{test_id}') | ||
| order by created_at desc | ||
| limit 1 | ||
| ) | ||
|
|
||
| select result_row | ||
| from {{{{ ref("test_result_rows") }}}} | ||
| where elementary_test_results_id in (select * from latest_elementary_test_result) | ||
| """ | ||
|
|
||
|
|
||
| @pytest.mark.skip_targets(["clickhouse"]) | ||
| def test_result_rows_do_not_exceed_failures(test_id: str, dbt_project: DbtProject): | ||
| """Result rows count should never exceed the dbt failure count.""" | ||
| null_count = 20 | ||
| data = [{COLUMN_NAME: None} for _ in range(null_count)] | ||
| test_result = dbt_project.test( | ||
| test_id, | ||
| "not_null", | ||
| dict(column_name=COLUMN_NAME), | ||
| data=data, | ||
| test_vars={ | ||
| "enable_elementary_test_materialization": True, | ||
| "test_sample_row_count": 1000, | ||
| }, | ||
| ) | ||
| assert test_result["status"] == "fail" | ||
|
|
||
| failures = int(test_result["failures"]) | ||
| assert failures == null_count | ||
|
|
||
| samples = [ | ||
| json.loads(row["result_row"]) | ||
| for row in dbt_project.run_query(SAMPLES_QUERY.format(test_id=test_id)) | ||
| ] | ||
| assert len(samples) <= failures | ||
|
|
||
|
|
||
| @pytest.mark.skip_targets(["clickhouse"]) | ||
| def test_sample_percent_adds_description_note(test_id: str, dbt_project: DbtProject): | ||
| """When sample_percent is in test_params, a note should be appended to test_results_description.""" | ||
| null_count = 20 | ||
| data = [{COLUMN_NAME: None} for _ in range(null_count)] | ||
| test_result = dbt_project.test( | ||
| test_id, | ||
| "not_null_sampled", | ||
| dict(column_name=COLUMN_NAME, sample_percent=10), | ||
| data=data, | ||
| test_vars={ | ||
| "enable_elementary_test_materialization": True, | ||
| "test_sample_row_count": 5, | ||
| }, | ||
| ) | ||
| assert test_result["status"] == "fail" | ||
| assert "sample_percent" in test_result["test_results_description"] | ||
|
|
||
|
|
||
| @pytest.mark.skip_targets(["clickhouse"]) | ||
| def test_no_sample_percent_no_description_note( | ||
| test_id: str, dbt_project: DbtProject | ||
| ): | ||
| """When sample_percent is NOT in test_params, no sampling note should appear.""" | ||
| null_count = 20 | ||
| data = [{COLUMN_NAME: None} for _ in range(null_count)] | ||
| test_result = dbt_project.test( | ||
| test_id, | ||
| "not_null", | ||
| dict(column_name=COLUMN_NAME), | ||
| data=data, | ||
| test_vars={ | ||
| "enable_elementary_test_materialization": True, | ||
| "test_sample_row_count": 5, | ||
| }, | ||
| ) | ||
| assert test_result["status"] == "fail" | ||
| description = test_result.get("test_results_description") or "" | ||
| assert "sample_percent" not in description |
There was a problem hiding this comment.
Run Black to satisfy the pre-commit hook.
CI indicates Black reformats this file; please run Black to fix formatting before merge.
🧰 Tools
🪛 GitHub Actions: Run pre-commit hooks
[error] 1-1: Black formatting changed the file. Run 'black' to format the code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@integration_tests/tests/test_result_rows_count.py` around lines 1 - 88, This
file fails Black formatting; run Black (or your project's pre-commit formatter)
on the test module containing functions test_result_rows_do_not_exceed_failures,
test_sample_percent_adds_description_note, and
test_no_sample_percent_no_description_note to apply the canonical formatting and
update the commit so CI's pre-commit checks pass.
…e_percent value Remove the custom generic test macro and associated tests that were hard to exercise through the integration test framework. Also tighten the sample_percent check to validate it's a number between 0 and 100. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Avoid prepending a space when test_results_description is empty/None. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Summary
result_rowsto match dbt's failure count when re-execution of non-deterministic queries (e.g.TABLESAMPLE SYSTEM,RAND(),LIMITwithoutORDER BY) returns more rows than the original run, fixing confusing UI display like "(5/4)"sample_percentis present in test params, informing users that result samples may not exactly match the failure countTest plan
test_result_rows_do_not_exceed_failures— verifies result_rows count never exceeds dbt failure counttest_sample_percent_adds_description_note— verifies sampling note is appended whensample_percentis in test_paramstest_no_sample_percent_no_description_note— verifies no sampling note for regular teststest_sampling.py,test_override_samples_config.py) continue to pass since truncation only applies whenresult_rows > failures🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes