-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add basic dbt objects and rule definitions #6
Conversation
c243c5b
to
794e79b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bunch of comments, but this looks very promising! 😍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really looks great! 💯 I also left a couple of comments/ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, congrats on the @rule
implementation 😍
Thoughts on unit tests already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure you've thought a lot about this setup and I'd also like to learn more so here goes:
What is the reason @rule
returns a subclass of Rule, rather than a Rule object? That is, instead of:
Callable[..., Type[Rule]]
Why not:
Callable[..., Rule]
Generating subclasses, of which there will only be one object per subclass, feels similar to generating objects, i.e. specific instances of Rule
, but the latter is semantically closer to what you'd want (I think).
rule
would then look something like this:
def rule(
description: str | None = None,
severity: Severity = Severity.MEDIUM,
) -> Callable[[Callable[[Model], RuleViolation | None]], Rule]:
"""Rule decorator.
The rule decorator creates a Rule and returns it.
Args:
description: The description of the rule.
severity: The severity of the rule.
"""
def decorator_rule(
func: Callable[[Model], RuleViolation | None],
) -> Rule:
"""Decorator function."""
...
def wrapped_func(self: Rule, *args: Any, **kwargs: Any) -> RuleViolation | None:
"""Wrap func to add `self`."""
return func(*args, **kwargs)
rule_obj = Rule(
description=...,
severity=...,
evaluate=wrapped_func,
)
return rule_obj
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the reasons is to allow users to create more advanced rules by subclassing Rule
(simpler rules are encouraged to use the diet @rule
decorator). Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is indeed very similar, but this way it will be easier for users to create their own complex rules with a class. So this gives full flexibility for users that need it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But can't they still do that even in this setup?
@rule
def some_simple_rule():
...
class ComplexRule(Rule):
"""Do the thing, as long as it implements `evaluate`"""
...
Then in the rule registry / runner:
# In the real registry this discovery happens differently of course
rules: Rule = [
some_simple_rule(), # Instantiate a Rule object
ComplexRule(), # Instantiate a ComplexRule object, which is a subclass of Rule
...
]
for rule in rules:
rule.evaluate(model)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The snippet above is how it currently works. But with your suggestion of creating instances instead of classes in the decorator, some_simple_rule
would already be an instance, therefore some_simple_rule()
would not work, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I missed the last return
🙈
def rule(..) -> Callable[[Callable[[Model], RuleViolation | None]], Rule]:
def decorator_rule(
func: Callable[[Model], RuleViolation | None],
) -> Rule:
def wrapped_func(self: Rule, *args: Any, **kwargs: Any) -> RuleViolation | None:
return func(*args, **kwargs)
rule_obj = Rule()
return rule_obj
return decorator_rule
Which should result in:
@rule
def some_simple_rule():
...
some_simple_rule # Function object that returns a Rule object
some_simple_rule() # Instance of Rule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. But I think it's more straightforward to go with the approach: "every subclass of Rule
is a rule". Such subclass can be created either by subclassing, or by using @rule
as syntactic sugar. This puts all rules "at the same level", if that makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I see it, both approaches are the same in the sense that "every Rule
is a rule".
I understand that, by always subclassing Rule
, all rules are on the same hierarchy. But from another perspective, they're all Rule
s and in aggregate we will work with lists of Rule
s. It doesn't matter if you subclass a Rule
or use a subclass of a subclass of Rule
or use Rule
itself.
From a user perspective, nothing will change. You either use @rule
as syntactic sugar or subclass Rule
.
Where they differ is that I think, semantically, a list of objects is more straightforward than a list of generated subclasses. And I also think it simplifies implementation a bit.
Btw, because nothing changes from a user's perspective, I wouldn't make this point blocking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have given it another look and decided to go with the current implementation. Going with the instance implementation also has some challenges, e.g. the example
@rule
def some_simple_rule():
...
some_simple_rule # Function object that returns a Rule object
some_simple_rule() # Instance of Rule
will not exactly work. To call some_simple_rule()
we will need to grab the func
from somewhere again as it's an input variable. I think the implementation does not necessarily becomes easier when using instances instead of classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comments - near perfect!
Co-authored-by: Matthieu Caneill <[email protected]>
Co-authored-by: Matthieu Caneill <[email protected]>
Co-authored-by: Matthieu Caneill <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While the return None
question is pending, and the discussion about classes/instances is ongoing, happy to approve this design. Great work!
This PR takes care of the following:
Add basic rule structures
A rule can be added by definining a method with the
@rule
decorator or by creating a class that inherits fromRule
.Add a way to load models from the manifest
ManifestLoader takes as input the manifest.json, loaded as a dictionary. It will created a set of
Model
. AModel
is the input ofRule.evaluate
.Below is a minimal example to load a manifest.json file and apply rules to the models in the manifest.
To do