perf: AST compiler optimizations #7740
Conversation
This is really testing OPA rather than Regal, but since it's a good test case for improving OPA for Regal, it belongs here. Used in open-policy-agent/opa#7740

Signed-off-by: Anders Eknert <[email protected]>
Just some initial comments. Will wrap up the complete review later.
Force-pushed from 299b3ef to be93165
Thanks!
My only concern is the change in evalNot().
	return errs
}

reversed := make(map[Var]Var, len(dvs.vs))
Looks like this is only used if we need to report an error, and isn't used in a successful compilation. Could it be done lazily?
The code above should ensure it's only created when there are actual unused vars, which is the real improvement from before:
if len(unused) == 0 {
	return errs
}

Over-allocating is almost always better than allocating multiple times, like when a map has to grow. It's true that this could be made even more efficient, though, if we did some of the logic below here first to count, and then allocate.
unused := dbv.Diff(used)

reversed := make(map[Var]Var, len(dvs.vs))
Before this change, we built the reverse map as we added entries. This is cheaper?
Indeed. Previously we would pre-allocate the map on the dvs struct; now we don't allocate anything in most cases.
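To make the allocation pattern discussed in this thread concrete, here is a minimal, self-contained sketch. It is not the actual OPA compiler code: Var and VarSet are hypothetical stand-ins for ast.Var and ast.VarSet, and the error message is illustrative. The happy path allocates nothing, and the reverse map is only built, and sized, for the unused vars that actually need reporting (the "count first, then allocate" refinement mentioned above).

```go
package main

import "fmt"

// Hypothetical stand-ins for ast.Var and ast.VarSet.
type Var string
type VarSet map[Var]struct{}

// unusedErrors sketches the pattern described above: nothing is pre-allocated
// on any struct, the reverse map (rewritten var -> original var) is built only
// when there are unused vars to report, and it is sized by len(unused) rather
// than by the number of declared vars, so it never has to grow.
func unusedErrors(declared map[Var]Var, unused VarSet) []string {
	if len(unused) == 0 {
		return nil // common case: no allocations at all
	}

	reversed := make(map[Var]Var, len(unused))
	for original, rewritten := range declared {
		if _, ok := unused[rewritten]; ok {
			reversed[rewritten] = original
		}
	}

	errs := make([]string, 0, len(unused))
	for rewritten := range unused {
		errs = append(errs, fmt.Sprintf("declared var %v unused", reversed[rewritten]))
	}
	return errs
}

func main() {
	declared := map[Var]Var{"x": "__local0__", "y": "__local1__"}
	unused := VarSet{"__local1__": {}}
	fmt.Println(unusedErrors(declared, unused)) // [declared var y unused]
}
```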
v1/topdown/eval.go (Outdated)
child.traceExit(negation)
child.traceRedo(negation)
if err := child.eval(func(c *eval) error {
	if c.traceEnabled {
Are we sure c is the same instance as child here? E.g. what if evaluation spawns additional children?
Fair enough. Reverted.
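For readers without the full diff handy, here is a generic, hypothetical illustration (not OPA's actual evaluator code) of why the question matters: if eval can spawn additional child evaluators and invoke the callback with one of them, the *eval passed to the closure need not be the same instance as child, so checking c.traceEnabled and checking child.traceEnabled are not interchangeable in general.

```go
package main

import "fmt"

// Hypothetical, heavily simplified evaluator type; not OPA's actual *eval.
type eval struct {
	traceEnabled bool
}

// eval spawns a child evaluator with different settings and invokes the
// callback with that child, which is the scenario the review question raises.
func (e *eval) eval(iter func(c *eval) error) error {
	spawned := &eval{traceEnabled: !e.traceEnabled}
	return iter(spawned)
}

func main() {
	child := &eval{traceEnabled: false}

	_ = child.eval(func(c *eval) error {
		// c is not the same instance as child here, so the two checks below
		// can disagree.
		fmt.Println("c.traceEnabled:", c.traceEnabled, "child.traceEnabled:", child.traceEnabled)
		return nil
	})
}
```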
Funnily, this started out as an attempt to look into issues reported with compiling large policy sets... before I realized that it isn't likely *this* compiler that has perf issues, but the one that "compiles" bundles as part of activation. So while these fixes likely do little to address that, there are still some rather nice improvements here, where the big ones as usual are mostly just wins from avoiding work where it's possible.

For benchmarking I've used Regal's embedded bundle, which isn't great to use over time, as it's a moving target. But since it's a pretty extensive bundle and one that covers most features of OPA, it's at least good for 1:1 comparisons when testing perf improvements.

```
// 66555594 ns/op 50239492 B/op 1083664 allocs/op - main
// 62569440 ns/op 38723015 B/op 944277 allocs/op - compiler-optimizations pr
```

The B/op / alloc_space improvement is particularly nice here. What's noteworthy is how relatively little impact that has on ns/op in this case. That may be surprising, but it aligns pretty well with my previous experience of Go code where a lot of time is spent in recursive walks — that simply takes time, no matter how much you optimize. Oh well, less memory allocated for this is more memory to spend elsewhere.

(I'm adding the benchmark used below to Regal in a parallel PR)

Signed-off-by: Anders Eknert <[email protected]>
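Not part of the PR itself, but for context: a minimal sketch of what a compile benchmark along these lines can look like using OPA's public ast API. The import path assumes OPA 1.x (github.com/open-policy-agent/opa/v1/ast), and a single trivial module stands in for Regal's embedded bundle, which is what the actual benchmark compiles.

```go
package compile_test

import (
	"testing"

	"github.com/open-policy-agent/opa/v1/ast"
)

func BenchmarkCompileModules(b *testing.B) {
	// Parse once, outside the timed loop; the real benchmark uses the many
	// modules of Regal's embedded bundle instead of this placeholder.
	modules := map[string]*ast.Module{
		"example.rego": ast.MustParseModule(`package example

allow if input.user == "admin"
`),
	}

	b.ReportAllocs()
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		compiler := ast.NewCompiler()
		compiler.Compile(modules)
		if compiler.Failed() {
			b.Fatal(compiler.Errors)
		}
	}
}
```

Running something like this with go test -bench=CompileModules -benchmem is what produces the ns/op, B/op, and allocs/op columns quoted above.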
Force-pushed from be93165 to a4f34ae
👍