-
Notifications
You must be signed in to change notification settings - Fork 342
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding resource limits that can be selected from the UI #3202
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@paloma-rebuelta - thank you for acting on this so quickly - its much appreciated! Overall, this looks great!
I had a general comment regarding the relationship between limits and requests and how to enforce conditions we'd like to impose.
bbbd018
to
693b25e
Compare
@romeokienzler @lresende maybe we should merge in Airflow 2.x related changes to processor airflow and airflow jinja2 template first before the changes here. @paloma-rebuelta in Airflow 2, a lot of changes happened and resource requests and limits are now specified a little differently, see my Airflow 2.x PR 3167 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this introduces a migration issue that either needs to be handled (or avoided).
Also, can fractional values for CPUs be entered, despite the schema type being integer
?
@@ -156,9 +156,9 @@ | |||
"mounted_volumes": [], | |||
"filename": "producer.ipynb", | |||
"runtime_image": "tensorflow/tensorflow:2.8.0", | |||
"cpu": 1, | |||
"cpu_request": 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See "migration comment".
Regarding the screen layout (thank you for including that), I find it more intuitive to have CPU request and limit on the same line, rather than CPU and Memory request, then CPU and Memory limit. I think keeping the "subject" on its own line would be better, and is also consistent relative to the CPU line (as its fields are on the same line as well). Thanks. |
693b25e
to
f0b8a37
Compare
|
@shalberd - what is the ETA of the Airflow 2 PR - its been open for some time now. That PR speaks of really wanting limit settings and the changes to add CPU/Memory requests are trivial. In addition, this PR still supports Airflow 1x while I can't determine if your PR replaces that support or not (although I suspect it does). As a result, I think this PR should take priority (assuming we don't require a migration), set the limit handling in place against existing user installations (preferrably with the release), then release support for Airflow 2x. If 2x support is a replacement, that's another issue worth discussing (if it hasn't been discussed already). |
8b32b00
to
9fa145a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this great addition @paloma-rebuelta!
We can address millicores (and values < 1 core) in another PR, which will likely require a pipeline migration, at which time we can add proper adjustments to the request-related resources if we choose.
@lresende, since this change adds a helpful feature (and is low risk IMO), I wonder if we should include it in 3.x and use Airflow 2 support as the trigger for 4.x. Are there other merged PRs that are only slated for 4.x? If we do decide to include this in 3.x we should probably adjust the changelog updates. |
@kevin-bates @lresende Happy New Year. I agree that this PR here should take precedence and is low risk, as adding optional limits is something that is extremely useful no matter what target runtime and is backward compatible. |
@kevin-bates / @lresende Happy new year! I am happy to edit the milestone, is there a specific version of 3.x I should put? 3.15.1 or 3.16.0 maybe? |
@paloma-rebuelta we will make this part of release 4.0 (confirmed by @lresende) , looks good. I'll have a look at the code one more time, too, and test it. Having @kevin-bates approved it and your extensive discussions already make this seem mergeable. |
9fa145a
to
0bd13c4
Compare
Adding Limit/request tests Signed-off-by: Author Name [email protected] Signed-off-by: Paloma Rebuelta <[email protected]> change rows for UI Signed-off-by: Paloma Rebuelta <[email protected]> restoring original name for requests
0bd13c4
to
60a27d4
Compare
alos wondering about the env var minus GUI approach vs. env var and GUI approach discussed here before with @kevin-bates #3168 (comment) |
@shalberd - regarding:
I'd still prefer we include this in a 3.x release since its independent of AF 2 support. I guess I'd prefer to see @lresende comment on my comment above regarding where this should be merged before concluding its only in 4.0.
I'm a big fan of envs as default values, but also feel that (for this application in particular) having a GUI presence is preferred. I believe the previous discussion assumed that the GUI work was not able to be done, so we'd start with envs, then, when a GUI is in place, use the previous env-based effort to represent defaults/preferences. Since @paloma-rebuelta will be receiving "extra credit" on this by actually introducing the GUI changes (thank you!), the env portion of things is a bit muted IMHO. |
Agreed, just wanted to hear your perspective on this. We had customers on Github issues here that mentioned they have a requirement for limits to be set. So the only added perspective of an env in this new GUI approach would be: set a limit by factor * request if, and only if, no value is set in a node for a limit, across all nodes. So, I'd say it's muted, but not entirely gone. However, we could cover that with a separate enhancement ticket and pull request, as it is kind of an extension for a special customer case. |
I was suggesting we merge to the main branch and that I would push for the next release to be 4.0 with support for JupyterLab... if others are willing to maintain a 3.x branch I am open for it. At the moment I won't be able to manage 3.x and 4.x releases in parallel. Feel free to merge this to main once you are happy with it @kevin-bates |
@kevin-bates let's hold off merging for a bit until we clear up the questions related to node editor limits field placement and cascading style sheets, as well as the question as to why for a single file "Submit as pipeline" does not show the new fields. all good, I forgot to do the the only cosmetic thing I do not like so much is that the content on the Submit File to pipeline dialog is now pretty crowded regardless of how wide or resized that popup / overlay window is basically, there is no vertical space between the field columns anymore because it has to fit in the fixed size width of the whole canvas. Compare to before @paloma-rebuelta @kevin-bates /lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great job, working nicely, minor cosmetic GUI comment in my last comment under conversations, no big deal. Thank you so very much, this is so very valuable. Love using it already.
@kevin-bates please disregard my last comment, PR is good to merge
No can do either. So this will first appear in 4.0. Thanks for the response @lresende |
Thanks @kevin-bates @shalberd and @lresende for getting this across 🥳 |
you're welcome; this has been a long-standing wish of many people in the community, so your effort on this is extremely appreciated by not just us, I am sure. |
Signed-off-by: Author Name [email protected] Signed-off-by: Paloma Rebuelta <[email protected]>
I have changed the backend to replace cpu and memory for cpu_request and memory_request, as well as adding cpu_limit and memory_limit. This has also been added to the UI.
These changes should fix the issue where the pipeline breaks when there are pre-defined argo resource limits that are lower than the requests, fixes #3168
milestone 4.0
I have tested this by adding two unit tests and building the code to run in a kubeflow server. I tested a pipeline with all the resource requests/limits combinations (including leaving it blank). I cannot run airflow pipelines in my setup but have also edited the code for these.