Connect: set meaningful (fallback) defaults for job resources #630
Conversation
This would apply the default values globally, which I think is too heavy-handed. This setting would apply to all environment restores, content execution, and document renders. In the meantime I think a better approach would be to modify the defaults in helm/charts/rstudio-connect/values.yaml (lines 290 to 311 at 2cdb401).

We'll also need to make sure that we always use the launcher Job's resource requests/limits if they are set, so we'll have to merge them in helm/charts/rstudio-connect/files/job.tpl (lines 247 to 284 at 2cdb401).

cc @bschwedler - is this something that the workbench chart would benefit from as well if we modify the values file?
I was aware of this when proposing the change. Right now there are no defaults set, which causes global OpenShift defaults to jump in (global resource defaults are of course also possible on vanilla k8s). These will be too low in many scenarios, causing jobs to fail.

While a user-configurable default is surely welcome, I still think that sensible defaults should be defined in the chart to prevent these cases from happening. Admins can then still override these values in the chart if desired (and once exposed). The limits proposed in this PR are not particularly optimized and lower values might also suffice; however, I had no motivation to benchmark the lowest value that would still work in most cases.
As you've already mentioned, a namespace default using a `LimitRange` could work around this, though it would apply to all jobs in the namespace.

Setting the default cpu/memory requests/limits aside for now, I still think your PR needs to merge the launcher Job's resource requests/limits with these defaults whenever they are set.
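(For anyone following along: a namespace-wide default of this kind is usually expressed as a Kubernetes `LimitRange`. A minimal sketch, with purely illustrative values:)

```yaml
# Sketch of a namespace-level default; the values here are illustrative only.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
spec:
  limits:
    - type: Container
      defaultRequest:   # applied when a container omits resources.requests
        cpu: 100m
        memory: 1Gi
      default:          # applied when a container omits resources.limits
        cpu: 500m
        memory: 1Gi
```

As noted above, such an object applies to every container created in the namespace, not just Connect's jobs.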
The workaround I mentioned was done by patching a local copy of the chart and changing the value in the job template. This shouldn't apply to all jobs in the namespace, should it? Besides, I think the vast majority of users are not running Connect in a mixed namespace. But sure, there might be some that don't care about namespacing.
Agreed, good point. I also noticed that the implementation needs to be more granular for CPU and memory: e.g. if users set only one of the two in the app resources, the else condition is skipped and the namespace default kicks in for the missing one, leading to potentially the same issue as when omitting both.
(Force-pushed from 34d35ef to 300efaf.)
@dbkegley I've made the conditional checks more granular. This should now also work if only one of CPU or memory is set. Untested though, needs a proper review.
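For illustration, a more granular check might take roughly this shape — the `.Job.resourceLimits` field names and the fallback values here are assumptions for the sketch, not the chart's actual code:

```yaml
{{- /* Sketch only: per-field fallbacks, so that setting just one of
       cpu/memory still picks up the default for the other.
       Field names on .Job.resourceLimits are assumed. */}}
requests:
  {{- if .Job.resourceLimits.cpu }}
  cpu: {{ .Job.resourceLimits.cpu }}
  {{- else }}
  cpu: 100m
  {{- end }}
  {{- if .Job.resourceLimits.memory }}
  memory: {{ .Job.resourceLimits.memory }}
  {{- else }}
  memory: 1Gi
  {{- end }}
```

Sprig's `default` filter (e.g. `{{ .Job.resourceLimits.cpu | default "100m" }}`) would express the same per-field fallback more compactly.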
For sure, that's right. I just wanted to give another option for anyone following along who wants to make this change without waiting for this PR to merge or modifying the chart locally.
Thanks for making those changes! I'll take another look this afternoon.
Review comment on the proposed defaults in job.tpl:

```yaml
    memory: 1Gi
  limits:
    cpu: 500m
    memory: 1Gi
```
I think my preference would be for these defaults to be defined in the values.yaml rather than here in the template. We'll want to add a new config section under `launcher.templateValues.pod.resources` so that the configured values are available inside of `$templateData.pod.resources`; then we can merge those values with the values provided on the `.Job.resourceLimits`. This would be more consistent with how the other job overrides work.
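For concreteness, such a section might look roughly like this in values.yaml — the nesting follows the `launcher.templateValues.pod.resources` path named above, but the exact keys and values are illustrative:

```yaml
launcher:
  templateValues:
    pod:
      # Illustrative fallback resources for launcher jobs; admins override as needed.
      resources:
        requests:
          cpu: 100m
          memory: 1Gi
        limits:
          cpu: 500m
          memory: 1Gi
```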
Take `volumeMounts` for example:
```yaml
{{- if or (ne (len .Job.volumes) 0) (ne (len $templateData.pod.volumeMounts) 0) }}
volumeMounts:
{{- range .Job.volumeMounts }}
- {{ nindent 14 (toYaml .) | trim -}}
{{- end }}
{{- range $templateData.pod.volumeMounts }}
- {{ nindent 14 (toYaml .) | trim -}}
{{- end }}
{{- end }}
```
Except in this case we'll want to prefer values from the `.Job` and fall back to values in `$templateData`, rather than appending.
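One hedged sketch of that prefer-then-fall-back behavior, using Sprig's `mergeOverwrite` (later arguments win); note that treating `.Job.resourceLimits` as a requests/limits map is an assumption about the launcher's job object, not confirmed by this thread:

```yaml
{{- /* Sketch only: Job-provided values win; chart defaults fill the gaps.
       Assumes .Job.resourceLimits is a requests/limits map. */}}
{{- $defaults := $templateData.pod.resources | default dict }}
{{- $fromJob := .Job.resourceLimits | default dict }}
{{- $merged := mergeOverwrite (deepCopy $defaults) $fromJob }}
{{- with $merged }}
resources:
  {{- toYaml . | nindent 2 }}
{{- end }}
```

Because `mergeOverwrite` merges nested maps recursively, this would also cover the partial case discussed earlier, where only one of cpu/memory is set on the Job.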
@dbkegley Feel free to make changes as needed and restructure it however you consider optimal. My current workaround does the job for the instances at hand; happy to migrate to a proper upstream fix in an upcoming release.
Closing this in favor of #634.
Without these, `packrat-restore` jobs will fail on clusters that inject their own global `resources:` settings instead. This is often the case on OpenShift. Defining these fallback values safeguards `packrat-restore` jobs for new apps which don't have any resource definitions set and hence run into the condition described above.