Skip to content

add more arguments to Python JobspecV1 factory methods #6858

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Jun 3, 2025

This PR adds some more arguments to the JobspecV1 factory methods as described in #6833 so that more fully formed jobspec can be instantiated with a single method call.

This PR doesn't go as far as adding all the options offered by the submission CLI commands. However, job parameters that are an annoyance to set after the fact like the environment and cwd are there, as well as output, error, and input paths, etc.

One thing I didn't do was move support for some of the more user-friendly aspects of the CLI into the Python methods. For instance, the environment and rlimits arguments just take straight mappings, instead the "rule" based expressions supported by the CLI. We could change that, but I wasn't sure if it was necessary in the Python API.

Also, some other missing parameters include --dependency, --requires, and --begin-time -- all of which actually are mostly implemented in the CLI, and I wasn't sure exactly if those should go in the factory methods. They could be added later I guess.

Fixes #6833

Copy link
Member

@cmoussa1 cmoussa1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

if queue is not None:
self.setattr("queue", queue)
if bank is not None:
self.setattr("bank", bank)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're adding some flux-accounting attributes, you could consider adding project (the flux-accounting implementation of "WCKeys") as another option here too, but not a huge deal either way; I'm not sure how much use projects get these days. I can't remember if it has to be an official jobspec v1 property to be listed here, though, so if so, feel free to ignore me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we were just trying to add options that are supported in the submission CLI commands 🤷. Since the submission commands support --bank and --queue, I've added those here. However, I did leave other stuff out somewhat arbitrarily so I'm open to other ideas.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I'm sorry about that! In that case, you can ignore this suggestion 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NP! It was a good point to bring up.

@grondo grondo requested a review from trws June 3, 2025 23:22
"none",
)
self.assertEqual(jobspec.getattr("system.queue"), "testq")
self.assertEqual(jobspec.getattr("system.bank"), "testbank")

def test_23_from_nest_command(self):
"""Test that `from_batch_command` produces a valid jobspec"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next door I realize but I assume this should be from_nest_command?

Suggested change
"""Test that `from_batch_command` produces a valid jobspec"""
"""Test that `from_nest_command` produces a valid jobspec"""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I'll add a fix for that.

Comment on lines 787 to 792
if unbuffered:
# For output, set the buffer.type to none and reduce the configured
# event batch-timeout to something very small.
self.setattr_shell_option("output.stdout.buffer.type", "none")
self.setattr_shell_option("output.stderr.buffer.type", "none")
self.setattr_shell_option("output.batch-timeout", 0.05)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one seems to be the odd one out, in that it can be set with a keyword argument but not with a property is that right? I think all of the others are both at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we never did add a setter/getter for that one. Do you feel we should add that?

Oh, the other thing I did change what that the factory function arguments for file output are stdout, stderr, stdin, whereas the command line arguments are output, error, and input. I wonder if I should tweak them to match.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was waffling on it when I asked, but having given it some thought I think yes. That way both modes work, and we could handle all they keyword args somewhat uniformly with a version of if <name> is not None: self.name = name since the logic can live in the property. It's not a strong preference, but feels more regular to me somehow.

On the args, that's a good point. The difference isn't ideal, I'm thinking maybe we add aliases that are the command line names, then if we want to squash the std* ones we could mark them deprecated for a while at some point.

@trws
Copy link
Member

trws commented Jun 5, 2025

I really like this, had one tiny docstring tweak and a question but it looks really good @grondo.

@wihobbs
Copy link
Member

wihobbs commented Jun 10, 2025

Hey @grondo, thanks for this improvement!

The original request from Ramesh was for explicit documentation on what could be set in the JobspecV1 factory methods vs. what had to be set by attribute modification of an initialized Jobspec. Given that, @trws requested that we be able to set everything in the factory method itself to eliminate the need for such documentation. As a stop-gap, I added this table (with your help) to chapter 2 of our upcoming tutorial:

Screenshot 2025-06-10 at 12 38 41 AM

It looks like I need to make the following modifications:

  • flux jobs is a typo
  • Everything in the table except arbitrary attributes and shell options can now be set in the factory method.

Is that correct?

grondo added 2 commits June 10, 2025 06:23
Problem: The Jobspec stdio property getters and setters have
names that do not match the CLI options, which may be not be ideal.

Rename the Jobspec stdio properties from `stdin`, `stdout`, `stderr`,
to `input`, `output`, and `error` to match the respective cli options.
Keep the existing properties as aliases for the new names.
Problem: The Jobspec stdio attributes are tested using the `stdout`,
`stderr`, and `stdin` properties, but the preferred names are now
`output`, `error` and `input` to match the cli option names.

Add unit tests for the new property names.
@grondo
Copy link
Contributor Author

grondo commented Jun 10, 2025

flux jobs is a typo

Yeah, this should be jobspec.stdout = <path> or jobspec.stderr = <path> (also --input is jobspec.stdin = <path>.

I'm about to open a PR to make jobspec.output, jobspec.error and jobspec.input aliases for these.

Everything in the table except arbitrary attributes and shell options can now be set in the factory method.
Is that correct?

Yes, though none of this work has been merged into a flux-core release, so you might just mention "More parameters will be added to the jobspec initializers in a future Flux release.

@grondo
Copy link
Contributor Author

grondo commented Jun 10, 2025

Once #6874 #6875 and #6877 are merged, I'll rebase this to pick up those changes (use unbuffered property, and rename stdout, stderr, and stdin initializer args to output, error, and input.

grondo added 12 commits June 10, 2025 08:39
Problem: The Jobspec getattr() and setattr() methods to get and set
jobspec attributes do not behave similarly. While both methods default
to an `attributes.` prefix if not provided, setattr() further defaults
to an `attributes.system.` prefix if the prefix is not `user.` or
`system.`, while getattr() does not.

Unify the behavior of these two function so they both default to
`attributes.system.`. Abstract the prepend of the default key path
to a helper function to reduce duplicate code.
Problem: The jobspec getattr tests in t0010-job.py make a note that
setattr and getattr default behavior differ, but this is no longer
true.

Remove the stale comment. Update the tests to ensure that

  jobspec.getattr("key")

works in place of

  jobspec.getattr("attributes.system.key")
Problem: There is no Python helper function to delete a dotted key
in a dictionary similar to how `set_treedict()` sets a dotted key.

Add `flux.util.del_treedict`, which works like `set_treedict()`, but
deletes the dotted key, e.g.

 del_treedict(target, "a.b.c")
Problem: There are no unit tests for `del_treedict`.

Add some tests to t/python/t0024-util.py.
Problem: There is no unbuffered getter and setter for the `Jobspec`
Python class.

Add it.
Problem: The cli commands set various attributes in Jobspec manually
when the --unbuffered option is used, but there's a property setter
for that now.

Use the jobspec.unbuffered property setter instead.
Problem: There are no tests of the Jobspec `unbuffered` property
getter and setter.

Add tests.
Problem: Common jobspec attributes and options are not available
in the JobspecV1 factory methods, `per_resource`, `from_command`,
`from_batch_command` and `from_nest_command`. This unnecessarily
requires the user to set some critical attributes of the resulting
jobspec after it is returned, for example the working directory and
environment, which makes the interface awkward and error prone.

Add a set of the likely most commonly used extra parameters to
`JobspecV1.from_command()`. Pass these args along in `**kwargs` from
the other factory methods so the extra parameters are supported there
as well.

Update `from_command()` docstring to the Google style and document
the new args. Document `**kwargs` in other functions and reference
the docs in `from_command()`.
Problem: There is code in the Python cli submission cli classes that is
duplicated by the new arguments available in JobspecV1 factory methods.

Where possible, pass arguments to JobspecV1 factory methods in the
cli classes when constructing jobspec instead of setting attributes
after the fact. Drop the duplicate code.

Where the cli commands offer special functionality not directly
supported in the JobspecV1 factory methods (such as for the jobspec
environment), keep the existing code that sets attributes after
jobspec is created.
Problem: The comment for `test_23_from_nest_command` in t0010-job.py
incorrectly mentions `from_batch_command`.

Fix the comment.
Problem: The new arguments for JobspecV1 factory functions do not
have any tests.

Add a few tests to python/t0010-job.py.
Problem: CodeQL warns that empty except clauses should include a
comment, but several except: pass clauses in flux.cli.base do
not do so.

Add comments to empty except clauses in cli/base.py.
Copy link

codecov bot commented Jun 10, 2025

Codecov Report

Attention: Patch coverage is 97.22222% with 3 lines in your changes missing coverage. Please review.

Project coverage is 83.85%. Comparing base (e699349) to head (ed4368e).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
src/bindings/python/flux/job/Jobspec.py 97.46% 2 Missing ⚠️
src/bindings/python/flux/cli/base.py 94.73% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6858      +/-   ##
==========================================
- Coverage   83.85%   83.85%   -0.01%     
==========================================
  Files         538      538              
  Lines       89912    89966      +54     
==========================================
+ Hits        75399    75438      +39     
- Misses      14513    14528      +15     
Files with missing lines Coverage Δ
src/bindings/python/flux/cli/alloc.py 83.95% <ø> (ø)
src/bindings/python/flux/cli/batch.py 98.38% <100.00%> (ø)
src/bindings/python/flux/util.py 96.63% <100.00%> (+0.03%) ⬆️
src/bindings/python/flux/cli/base.py 95.46% <94.73%> (-0.21%) ⬇️
src/bindings/python/flux/job/Jobspec.py 91.21% <97.46%> (+1.03%) ⬆️

... and 9 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

python: add more parameters to Jobspec constructor/factory functions
4 participants