feat: lint sources #78

otosky · 2024-10-10T04:02:45Z

Overview

Extends rules so that they can be used against dbt sources in addition to models.

Usage

A rule declares which resource type it acts on through the type annotation of the function it wraps, or of the evaluate method for class-based rules:

from dbt_score import Model, Source, rule, Rule, RuleViolation

# decorator-based
# for a Model
@rule
def model_has_description(model: Model) -> RuleViolation | None:
    """A model should have a description."""
    if not model.description:
        return RuleViolation(message="Model lacks a description.")

# for a Source
@rule
def source_has_loader(source: Source) -> RuleViolation | None:
    """A source should have a loader defined."""
    if not source.loader:
        return RuleViolation(message="Source lacks a loader.")

# class-based
class ExampleSource(Rule):
    """Example class-based rule."""

    description = "A source should have a loader defined."

    def evaluate(self, source: Source) -> RuleViolation | None:
        """Evaluate source."""
        if not source.loader:
            return RuleViolation(message="Source lacks a loader.")

The Evaluation handler is then responsible for applying source-rules to Source objects and model-rules to Model objects.
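As a rough illustration of the dispatch described above, here is a minimal, self-contained sketch. It does not use dbt-score itself: Model and Source are reduced stand-ins, rules return a plain string instead of a RuleViolation, and resource_type_of and evaluate are hypothetical helper names chosen for this example. The real implementation in the PR may differ.

```python
import inspect
import typing
from dataclasses import dataclass


@dataclass
class Model:
    """Stand-in for a dbt model (hypothetical, reduced to two fields)."""

    name: str
    description: str = ""


@dataclass
class Source:
    """Stand-in for a dbt source (hypothetical, reduced to two fields)."""

    name: str
    loader: str = ""


def resource_type_of(func: typing.Callable) -> type:
    """Infer the resource type from the annotation of the first parameter."""
    first_param = next(iter(inspect.signature(func).parameters), None)
    annotation = typing.get_type_hints(func).get(first_param)
    if annotation not in (Model, Source):
        raise TypeError(f"{func.__name__}: first parameter must be Model or Source")
    return annotation


def model_has_description(model: Model) -> typing.Optional[str]:
    """Return a violation message if the model has no description."""
    return None if model.description else "Model lacks a description."


def source_has_loader(source: Source) -> typing.Optional[str]:
    """Return a violation message if the source has no loader."""
    return None if source.loader else "Source lacks a loader."


def evaluate(rules, evaluables):
    """Apply each rule only to objects of the type it was annotated for."""
    results = {}
    for evaluable in evaluables:
        applicable = [r for r in rules if resource_type_of(r) is type(evaluable)]
        results[evaluable.name] = [r(evaluable) for r in applicable]
    return results


rules = [model_has_description, source_has_loader]
results = evaluate(rules, [Model("m1"), Source("s1", loader="airbyte")])
```

In this sketch the model only sees the model rule and the source only sees the source rule, so a rule never receives an object of the wrong type.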

closes #76

otosky · 2024-10-11T00:48:00Z

woops thought that I opened this PR against my own fork - will move out of draft when I've added the rest of the feature 🙂

matthieucan · 2024-10-11T07:50:37Z

No problem, sounds good!

src/dbt_score/rule.py

otosky · 2024-10-15T02:46:52Z

@matthieucan - let me know if you think this is heading in the right direction

jochemvandooren

I think this looks very good already! 🚀 Just a couple of things we need to think about:

Filters cannot be applied to sources now
Code has model references almost everywhere (oops 😅, that was me probably), we should change that. E.g. scorer.py even has methods like score_model that will now be applied to sources as well.
Can we distinguish source and model in the formatters?
Needs some documentation in /docs!

Let me know what you think about it, I am happy to help if needed!

src/dbt_score/evaluation.py

src/dbt_score/__init__.py

otosky · 2024-10-17T14:07:53Z

Can we distinguish source and model in the formatters?

@jochemvandooren

For the human-readable formatter, my thought was to prefix models with M: and sources with S:, but let me know if you have a different idea.

Example:

🥇 M:model1 (score: 10.0)
    OK   tests.conftest.rule_severity_low
    ERR  tests.conftest.rule_severity_medium: Oh noes
    WARN (critical) tests.conftest.rule_severity_critical: Error

🥇 S:source1.table1 (score: 10.0) 
    OK   tests.conftest.rule_severity_low
    ERR  tests.conftest.rule_severity_medium: Oh noes
    WARN (critical) tests.conftest.rule_severity_critical: Error
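The prefixing proposed above could look roughly like the following sketch. Model, Source and display_name are hypothetical stand-ins written for this example, and the assumption that a source is shown as source_name.table_name is taken only from the sample output above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Model:
    """Stand-in for a dbt model (hypothetical)."""

    name: str


@dataclass
class Source:
    """Stand-in for a dbt source table (hypothetical)."""

    source_name: str
    name: str


def display_name(evaluable: Union[Model, Source]) -> str:
    """Prefix models with 'M:' and sources with 'S:' for the readable output."""
    if isinstance(evaluable, Model):
        return f"M:{evaluable.name}"
    if isinstance(evaluable, Source):
        return f"S:{evaluable.source_name}.{evaluable.name}"
    raise TypeError(f"Unsupported resource type: {type(evaluable).__name__}")


labels = [display_name(Model("model1")), display_name(Source("source1", "table1"))]
```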

jochemvandooren · 2024-10-21T12:54:14Z

I like the idea, keeps it concise! 👍

jochemvandooren

This has become a huge PR with all the renaming and stuff, great effort 🙌 I have played around with the code locally and everything works perfectly! Just left some comments about getting rid of all mentions of model in the code.

Also, all of the documentation needs to be updated as well, I am happy to assist with this one, please let me know!

pyproject.toml

src/dbt_score/evaluation.py

src/dbt_score/rule.py

src/dbt_score/rule_filter.py

src/dbt_score/scoring.py

src/dbt_score/cli.py

src/dbt_score/models.py

src/dbt_score/rule_filter.py

src/dbt_score/cli.py

otosky · 2024-10-22T03:31:33Z

Thanks for a thorough review @jochemvandooren ! Will address remaining feedback, update the formatters, and take a first stab at some of the docs shortly! I will definitely take up your offer for support on the docs.

matthieucan

Incredible work @otosky !
I played a bit with it and it works fine! Very much looking forward to having this 💪

src/dbt_score/models.py

docs/create_rules.md

jochemvandooren · 2024-10-24T14:33:07Z

Next week I will help you on the docs and further review the PR, thanks for all the amazing work already 🙌

jochemvandooren · 2024-10-31T10:37:55Z

@otosky The code looks perfect! As promised I did a final check, tried to find all occurrences of model, and replaced them with something more appropriate 🔍 d595ea2

I also added a CHANGELOG.md entry already, feel free to improve it of course. The final thing remaining is a few linting errors on lines exceeding the line length. Once those are fixed I am ready to approve 🚀

otosky · 2024-10-31T14:48:00Z

Thank you @jochemvandooren!

Fixed the line lengths in a2c0924 and made some tweaks to the changelog in f88aa7c.

One final Q: I made use of two functions from more-itertools - first and first_true. more-itertools came as a sub-dependency of dbt, which I see has now been made a dev-dep. Would you like me to rewrite/vendor the 2 functions I used above? Or add more-itertools as a proper dependency?

jochemvandooren · 2024-10-31T14:55:41Z

Ah, good point! If it can be prevented easily, I'd like to keep the number of dependencies low. But I can imagine having to rewrite the function is a hassle, I will leave it up to you!

otosky · 2024-10-31T15:59:07Z

I went the vendoring route, since it's really just the function first_true that is nicer syntax sugar and used in more than one place. 👍
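For readers unfamiliar with the helper, a vendored first_true can be as small as the well-known itertools recipe below. This is a sketch of that recipe, not necessarily the exact code committed in this PR.

```python
from typing import Callable, Iterable, Optional, TypeVar, Union

T = TypeVar("T")
D = TypeVar("D")


def first_true(
    iterable: Iterable[T],
    default: Optional[D] = None,
    pred: Optional[Callable[[T], object]] = None,
) -> Union[T, D, None]:
    """Return the first item for which pred(item) is truthy.

    With pred=None the first truthy item is returned. If nothing matches,
    default is returned instead.
    """
    # filter(None, iterable) keeps only truthy items, so both cases are covered.
    return next(filter(pred, iterable), default)


first_big = first_true(range(10), pred=lambda x: x > 5)
no_even = first_true([1, 3, 5], default="none", pred=lambda x: x % 2 == 0)
```

Vendoring a one-liner like this keeps the dependency list short at the cost of owning a few lines of code.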

matthieucan · 2024-10-31T17:52:49Z

I believe more_itertools.py was not committed?

otosky · 2024-10-31T20:16:01Z

🤦 totally right - just pushed it up!

jochemvandooren

Amazing 🤩 , just needs a rebase on master!

matthieucan

Minor nitpicks.

Incredible work @otosky, I'm really keen to start using this feature. Thanks a lot! 💪

CHANGELOG.md

src/dbt_score/rule_filter.py

Co-authored-by: Matthieu Caneill <[email protected]>

otosky · 2024-11-01T15:38:32Z

src/dbt_score/formatters/manifest_formatter.py

-            manifest["nodes"][model_id]["meta"]["score"] = model_score.value
-            manifest["nodes"][model_id]["meta"]["badge"] = model_score.badge
+        for evaluable_id, evaluable_score in self._evaluable_scores.items():
+            manifest["nodes"][evaluable_id]["meta"]["score"] = evaluable_score.value


Noticing after taking another pass at the diff that sources need to be pushed into manifest["sources"] instead of manifest["nodes"]. Let me fix that quickly.

Oh good one, we should have a test for that

updated in cc6b707
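The fix discussed in this thread can be sketched as follows. The function name write_scores and the sample ids are made up for this example, and routing on the "source." id prefix is an assumption made to keep the sketch short; the actual formatter may distinguish sources differently, for example by object type.

```python
import copy


def write_scores(manifest: dict, scores: dict[str, float]) -> dict:
    """Return a copy of the manifest with each score stored under meta.

    Assumption for this sketch: source ids start with "source." and live
    under manifest["sources"]; everything else lives under manifest["nodes"].
    """
    result = copy.deepcopy(manifest)
    for unique_id, score in scores.items():
        section = "sources" if unique_id.startswith("source.") else "nodes"
        result[section][unique_id].setdefault("meta", {})["score"] = score
    return result


manifest = {
    "nodes": {"model.pkg.model1": {"meta": {}}},
    "sources": {"source.pkg.source1.table1": {"meta": {}}},
}
scored = write_scores(
    manifest,
    {"model.pkg.model1": 10.0, "source.pkg.source1.table1": 7.5},
)
```

A test along these lines, asserting that the source score lands under "sources" and not under "nodes", is the kind of check requested in the reply above.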

src/dbt_score/formatters/manifest_formatter.py

jochemvandooren · 2024-11-07T07:19:39Z

It seems mypy was upgraded when rebasing, and it introduced some new linting errors 😅 I can help you fix them if you would like! Let me know

otosky · 2024-11-08T02:12:40Z

@jochemvandooren I'll take an initial stab at it!

otosky · 2024-11-12T05:49:49Z

There are still a bunch of Liskov violations being raised from mypy that I'm not entirely sure how to fix without having to introduce Generics into the API.

tests/test_rule.py:53: error: Argument 1 of "evaluate" is incompatible with supertype "Rule"; supertype defines the argument type as "Model | Source"  [override]
tests/test_rule.py:53: note: This violates the Liskov substitution principle
tests/test_rule.py:53: note: See https://mypy.readthedocs.io/en/stable/common_issues.html#incompatible-overrides

Is there any other workaround besides adding an ignore here? This essentially means that mypy will fail for any downstream users if they use the class-based workflow while developing their rules.

jochemvandooren · 2024-11-12T08:32:42Z

I also spent some time looking into this, and there's no easy solution indeed 😞 Considering this will only affect the class-based rules, I suggest we add an ignore for now. I am aware it will introduce the warnings downstream, which isn't ideal. To solve this in a nice way, we might need to restructure some things if we want to keep the API as is, so we might consider this in a follow-up PR. What do you think? I think it's the most pragmatic option!

otosky · 2024-11-12T15:37:54Z

makes sense to me @jochemvandooren! - updated in 0732e15
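To make the mypy complaint concrete, here is a reduced sketch of the pattern. Model, Source and Rule are simplified stand-ins rather than the real dbt-score classes, and the rule returns a plain string. The base class accepts Model | Source, while the subclass narrows the argument to Source, which mypy reports as an incompatible override unless it is silenced.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Model:
    """Stand-in for a dbt model (hypothetical)."""

    name: str
    description: str = ""


@dataclass
class Source:
    """Stand-in for a dbt source (hypothetical)."""

    name: str
    loader: str = ""


class Rule:
    """Simplified base class: evaluate accepts either resource type."""

    def evaluate(self, evaluable: Union[Model, Source]) -> Optional[str]:
        raise NotImplementedError


class SourceHasLoader(Rule):
    """Class-based rule that only makes sense for sources."""

    # Narrowing the parameter type breaks the Liskov substitution principle
    # from mypy's point of view, hence the targeted ignore.
    def evaluate(self, source: Source) -> Optional[str]:  # type: ignore[override]
        if not source.loader:
            return "Source lacks a loader."
        return None


violation = SourceHasLoader().evaluate(Source("s1"))
```

The code runs fine; the ignore only affects static checking. Anyone writing class-based rules downstream and running mypy would need the same comment until the API is restructured.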

jochemvandooren · 2024-11-12T16:17:42Z

Well done @otosky! Great contribution 🙌

jochemvandooren · 2024-11-13T07:44:02Z

Also it's available in version 0.8.0 now

otosky · 2024-11-20T00:09:32Z

@jochemvandooren @matthieucan thanks again for all your assistance!

otosky marked this pull request as ready for review October 15, 2024 02:45

jochemvandooren approved these changes Nov 1, 2024

matthieucan approved these changes Nov 1, 2024

otosky added 8 commits November 1, 2024 11:21

add sources to manifest.json fixture

769dc07

add minimal test assertion for parsing sources

40d3e2e

fmt json

1ee0efc

add Source model and parse into ManifestLoader

e33d8b8

add fixtures for sources

480b41e

infer resource_type from Rule.evaluate type annotation

2fa3d18

evaluate rules for both sources and models

141aa3c

allow sources to be filtered

17ef4be

otosky and others added 12 commits November 1, 2024 11:21

move check for resource_type match to should_evaluate method

e095ffc

update docs

836bb68

rename test_model_filter -> test_rule_filter

fea456b

add newline to pyproject.toml

ea7691f

validate that filters match the resource type of the rules they attach to

1122273

Final renaming of models to include sources

eca0b1e

fix line lengths

cc40c46

update changelog

a5b2ac1

remove hard dep on more-itertools by vendoring first_true

8ca822f

actually commit more_itertools replacement

30900c4

add newline

6204bc4

Co-authored-by: Matthieu Caneill <[email protected]>

remove breaking notice

7ecc1b2

otosky force-pushed the lint-sources branch from d5247ef to 7ecc1b2 Compare November 1, 2024 15:26

fix manifest_formatter for source scores

cc6b707

otosky added 2 commits November 1, 2024 22:49

run prettier on changelog

ba16e64

fix import

5ef7cae

address mypy errors

e19badc

mypy ignore Liskov violations on class-based rule/filter

0732e15

jochemvandooren merged commit 8aa8aad into PicnicSupermarket:master Nov 12, 2024
3 checks passed

matthieucan mentioned this pull request Feb 13, 2025

No SNAPSHOTS??? #95

Closed
