
Only output failing models and violated rules by default in HumanReadableFormatter #77

Open
wants to merge 8 commits into master

Conversation


@thomend thomend commented Oct 2, 2024

Fixes #71

It changes the default behavior of the HumanReadableFormatter:
When running dbt-score lint, it will now only show failing models and/or rules that were violated.
To have all output printed (exactly the way it was before this PR), the user now has to pass the --show_all flag to dbt-score lint.
It also considers the fail_any_model_under option.

Furthermore, I had to adjust some of the tests since the default behavior of the HumanReadableFormatter has changed - let me know if I need to expand these.

The docs also reflect this change by slightly tweaking the first example in index.md

In its current form, both rule violations and failing models are filtered through the same argument. This is based on @matthieucan's comment on #71 (comment).
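
For context, here is a rough sketch of the intended default behavior in the formatter. It is not the exact diff: the condition mirrors the snippet discussed in the review below, while the method signature and surrounding code are assumptions based on the formatter hooks used in the tests:

def model_evaluated(self, model, results, score):
    """Print a model's results, skipping passing models unless --show_all is set."""
    show = (
        score.value < self._config.fail_any_model_under
        or any(isinstance(result, RuleViolation) for result in results.values())
        or self._config.show_all
    )
    if not show:
        return
    ...  # print the model header and its rule results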

src/dbt_score/cli.py (review comment outdated, resolved)
@thomend marked this pull request as ready for review October 4, 2024 09:14
@jochemvandooren (Contributor) left a comment

Thank you for your contribution! 🙌 Looks in good shape already, just some minor comments!

docs/index.md (review comment outdated, resolved)
src/dbt_score/cli.py (review comment outdated, resolved)
Comment on lines +37 to +39
score.value < self._config.fail_any_model_under
or any(isinstance(result, RuleViolation) for result in results.values())
or self._config.show_all
Contributor:

I think this is not 100% correct 🤔 If I understand correctly we want to print the following output if:

score.value < self._config.fail_any_model_under, and then only show the failing rules. As it stands, it will also show the failing rules of models that did not fail.

@thomend (Author) commented Oct 7, 2024:

That's a good point! My line of thought was: if the project as a whole fails and only very few model scores are too low, as a user I would probably also be interested in seeing the failing rules of all models. Imagine you have 100 models, only ~5 fail but ~20 have lowish scores while 75 are perfect. In that case it could be of interest to also see the failing rules of all models.

I also referred to this in the issue discussion: #71 (comment). But I guess in that case one could just increase fail_any_model_under. This is probably a bit too implicit, and I could remove it (i.e. only test for score.value < self._config.fail_any_model_under).

Just let me know which way you prefer and I will adjust it accordingly.

Contributor:

Ok I got it! 👍 I think for the scenario you describe it is indeed useful to be able to show all the failing rules and we can definitely leave that as the default. I do think that we should also have the option to show only failing models, with their failing rules.

Maybe we should have two flags: --show-all-rules and --show-all-models so the user is able to further specify the output. @matthieucan curious to hear your opinion as well!

So then the user is able to (a rough sketch of the two flags follows the list):

  1. show failing models, with all rules
  2. show failing models, with failing rules
  3. show all models, with all rules
  4. show all models, with failing rules
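
To make the two-flag idea concrete, here is a rough illustration of how the options could be declared in cli.py. This is not code from this PR; the flag names come from the comment above and everything else is an assumption:

import click

@click.command()
@click.option(
    "--show-all-models",
    is_flag=True,
    default=False,
    help="Also print models that did not fail.",
)
@click.option(
    "--show-all-rules",
    is_flag=True,
    default=False,
    help="Also print rules that were not violated.",
)
def lint(show_all_models: bool, show_all_rules: bool) -> None:
    ...  # both values would be passed to the formatter, which filters models/rules accordingly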

Contributor:

Good to align on those expectations indeed.
Maybe the --show parameter could take an argument, e.g.

  • --show all - show all models, all rules
  • --show failing-models - show failing rules of failing models
  • --show failing-rules - show failing rules of all models (the default?)

I'm not sure if the first scenario mentioned (show failing models, with all rules) is useful, considering the option to use --select in combination. For example --select my_model --show all might be more actionable. But let me know what you think :)

Contributor:

You are right that option 1 (failing models with all rules) is probably not very useful. I think the direction of --show something would be a nice one. It's indeed simpler than providing two flags. And agreed that --show failing-rules should be the default!

@thomend (Author):

Sounds good and happy to adjust it in the next few days - thanks for all the input.

@thomend (Author):

In terms of wording: the codebase often refers to "violated" rules (e.g. RuleViolation) rather than failing rules. What do you think of:
--show all
--show failing-models
--show violated-rules
?

Contributor:

I think that makes sense! 👍

Contributor:

I think it's simpler to remember for users if the same term is used for the CLI options, as it's an abstraction over the code base

@@ -44,7 +75,7 @@ def test_human_readable_formatter_project(capsys, default_config, manifest_loade
)
formatter.project_evaluated(Score(10.0, "🥇"))
stdout = capsys.readouterr().out
assert stdout == "Project score: \x1B[1m10.0\x1B[0m 🥇\n"
assert stdout == "Project score: \x1b[1m10.0\x1b[0m 🥇\n"
Contributor:

Why did all the B's turn into b? 😁

Contributor:

Hexadecimal is not case sensitive, but indeed strange to see those changed 🤔

@thomend (Author) commented Oct 7, 2024:

Sorry for that! I am using the ruff VS Code extension, and autoformat on save did that ("Hex codes and Unicode sequences"). I don't know why the linter in the pre-commit hook wouldn't pick it up and revert it, though. The ruff version of the VS Code extension is 0.6.6, I believe, which is newer than the one running in the pre-commit hook.
Let me know if you want it reverted.

thomend and others added 2 commits October 7, 2024 21:04
Co-authored-by: Jochem van Dooren <[email protected]>
Co-authored-by: Jochem van Dooren <[email protected]>
Comment on lines +96 to +104
@click.option(
"--show-all",
help="If set to True, show all models and all rules in output "
"when using `plain` as `--format`. "
"Default behavior is to only show failing models and violated rules.",
type=bool,
is_flag=True,
default=False,
)
Contributor:

Let's not make it a flag, but rather a parameter with a value? i.e. --show all, which can later be expanded into different options such as --show failing-models and the like
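
A minimal sketch of the parameterized option suggested here (an illustration, not the final diff), reusing the value names floated earlier in the thread; the exact choices and default were still under discussion at this point:

import click

@click.command()
@click.option(
    "--show",
    type=click.Choice(["all", "failing-models", "violated-rules"]),
    default="violated-rules",
    help="Control which models and rules are included in `plain` output.",
)
def lint(show: str) -> None:
    ...  # forwarded to the formatter configuration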

Successfully merging this pull request may close these issues.

Add lint option to only output models with score under some threshold