-
Notifications
You must be signed in to change notification settings - Fork 388
test: add compute cluster cost func for perf testing #2602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: jigisha620 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
| // them against pricing data from env.InstanceTypes. Returns 0.0 if no nodes or pricing data exists. | ||
| // | ||
| //nolint:gocyclo | ||
| func (env *Environment) GetClusterCost() float64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if a metric scraping approach is more extensible. Karpenter should emit cost metrics soon, and if we want to set other performance SLAs we might want to do it via metrics scraping. What are your thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could move to metrics once we have them, but until then I think this is the most straightforward way to get the cluster cost.
d395a25 to
bda31a5
Compare
Pull Request Test Coverage Report for Build 18954237020Details
💛 - Coveralls |
jigisha620
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter perf
jigisha620
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter perf
8d8477b to
bda31a5
Compare
bda31a5 to
c1f8ed6
Compare
| for _, path := range paths { | ||
| if path != "" { | ||
| if data, err := os.ReadFile(path); err != nil { | ||
| return nil, fmt.Errorf("could not read instance types file %s: %w", path, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| return nil, fmt.Errorf("could not read instance types file %s: %w", path, err) | |
| return nil, serrors.Wrap(fmt.Errorf("could not read instance types file, %w", err), "file", path) |
|
|
||
| instanceTypes, err := kwok.ConstructInstanceTypes(ctx) | ||
| if err != nil { | ||
| log.Default().Printf("failed constructing instance types: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this also be a fatal error?
| log.Default().Printf("failed constructing instance types: %v", err) | |
| log.Default().Printf("failed constructing instance types, %w", err) |
| // Get the output dir if it's set | ||
| outputDir, _ := os.LookupEnv("OUTPUT_DIR") | ||
|
|
||
| instanceTypes, err := kwok.ConstructInstanceTypes(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the idea of us taking a direct dependency on the kwok provider implementation inside our common testing package. If we were to have this type of util, I would constrain it to a dedicated kwok testing package so that kwok details don't leak into our common packages. This problem goes away though if we just wait to rely on the upstream metric.
|
|
||
| By("Calculating cluster cost") | ||
| nodes := &corev1.NodeList{} | ||
| Expect(env.Client.List(env, nodes, client.HasLabels{test.DiscoveryLabel})).To(Succeed()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we go with this approach, we should scrape NodeClaims rather than nodes since they exist for the instance's entire lifetime wheras nodes only exist for a subset of it.
| OS: node.Labels["kubernetes.io/os"], | ||
| Arch: node.Labels["kubernetes.io/arch"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we care about OS and arch?
Fixes #N/A
Description
This PR adds a function to calculate cluster cost. This can be used to measure performance. For example - when Karpenter performs consolidation, this function can be used to compare cost before and after consolidation.
How was this change tested?
https://github.com/jigisha620/karpenter/actions/runs/18952021119/job/54118913345
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.