Add model/table-level macro for group_mean_continuity_check #4116

Closed

@krivard krivard commented Mar 6, 2025

Alternate implementation of #4092. Would contribute to completion of #4095.

Where #4092 implements this check at the column level, this PR implements it at the model/table level. That significantly reduces the amount of duplicated information in the schema.yml files that use it, at the possible expense of having to build a pretty complicated query.

Overview

Three tables require a group_mean_continuity_check, which does the following:

  1. Group by a specified column
  2. Compute the mean of each group
  3. Compute the percent change in the mean between successive groups
  4. Check each change against a per-column threshold

This PR contains a draft for a possible implementation.

What did you change?

  • New macro, group_mean_continuity_check(ordered_group_column, thresholds, n_outliers_allowed)
  • Demo of macro in use for _core_eia923__cooling_system_information, using columns and thresholds from transform/eia923.py
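
The four steps in the overview, combined with the macro's three parameters, can be sketched in plain Python. This is only an illustration of the logic, not the macro itself, which would be written as a dbt SQL/Jinja query. Several details are assumptions: thresholds are treated as fractions (0.5 means 50%), the change is taken as an absolute value, `n_outliers_allowed` is read as the number of over-threshold changes a column may have before it fails, and the column names `report_year` and `flow_rate` are hypothetical.

```python
from collections import defaultdict


def group_mean_continuity_check(rows, ordered_group_column, thresholds, n_outliers_allowed=0):
    """Return {column: passed} for each column named in `thresholds`.

    rows: list of dicts, one per table row.
    thresholds: {column: largest allowed fractional change between
        successive group means} (assumed to be fractions, not percents).
    """
    # Step 1: group rows by the ordering column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[ordered_group_column]].append(row)
    ordered_keys = sorted(groups)

    results = {}
    for column, threshold in thresholds.items():
        # Step 2: mean of each group, skipping missing values.
        means = []
        for key in ordered_keys:
            values = [r[column] for r in groups[key] if r.get(column) is not None]
            if values:
                means.append(sum(values) / len(values))
        # Step 3: fractional change between successive group means.
        # Assumption: pairs with a zero previous mean are skipped.
        changes = [
            abs(curr - prev) / abs(prev)
            for prev, curr in zip(means, means[1:])
            if prev != 0
        ]
        # Step 4: count changes over the threshold, tolerating a few outliers.
        n_outliers = sum(change > threshold for change in changes)
        results[column] = n_outliers <= n_outliers_allowed
    return results


# Hypothetical data: yearly means are 10, 11, and 22.
sample_rows = [
    {"report_year": 2019, "flow_rate": 8.0},
    {"report_year": 2019, "flow_rate": 12.0},
    {"report_year": 2020, "flow_rate": 11.0},
    {"report_year": 2021, "flow_rate": 22.0},
]
strict = group_mean_continuity_check(sample_rows, "report_year", {"flow_rate": 0.5})
lenient = group_mean_continuity_check(
    sample_rows, "report_year", {"flow_rate": 0.5}, n_outliers_allowed=1
)
```

With this data the 2020 to 2021 jump doubles the mean, so the column fails when no outliers are allowed and passes when one is. Taking all thresholds in one dictionary mirrors the table-level design: the group column is named once rather than repeated for every column.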

Includes:
- Test macro for group_mean_continuity_test
- Example usage in _core_eia923__cooling_system_information
- Documentation in macros/schema.yml
@@ -0,0 +1,28 @@
version: 2
This is apparently the recommended way to document macros?

https://docs.getdbt.com/faqs/Docs/documenting-macros

@krivard krivard added the labels dbt (Issues related to the data build tool aka dbt) and data-validation (Issues related to checking whether data meets our quality expectations) Mar 10, 2025
@krivard krivard self-assigned this Mar 10, 2025
@aesharpe
Member

See my comment here

@krivard
Contributor Author

krivard commented Mar 19, 2025

Dropping in favor of #4092!

@krivard krivard closed this Mar 19, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Catalyst Megaproject Mar 19, 2025
@krivard krivard deleted the krivard/dbt-migrations_group_mean_continuity_modeltest branch March 19, 2025 16:53