PBS job manager #151
Conversation
adam-azarchs left a comment:
Thank you! This looks pretty good other than a few minor nit-picks.
jobmanagers/pbs_queue.py (outdated)

```python
import json

# PBS Pro "job states" to be regarded as "alive"
ALIVE = {'Q', 'H', 'W', 'S', 'R', 'E'}
```
This could be simply:

```python
ALIVE = "QHWSRE"
```
`in` will still work on it the same way and, for a set that small, will probably be faster than a hash lookup.
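A quick illustrative sketch (not part of the patch) of why the membership test is unaffected for single-character state codes:

```python
# Membership tests give the same answer either way for single-character codes.
ALIVE_SET = {'Q', 'H', 'W', 'S', 'R', 'E'}
ALIVE_STR = "QHWSRE"

for state in ("Q", "R", "F", "X"):
    assert (state in ALIVE_SET) == (state in ALIVE_STR)

# Caveat: on a string, `in` is a substring test, so "QH" in ALIVE_STR is True
# while "QH" in ALIVE_SET is False. PBS job states are single characters, so
# this difference does not matter here.
```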
jobmanagers/pbs_queue.py (outdated)

```python
"""Gets the command line for qstat."""
if not ids:
    sys.exit(0)
return ['qstat', '-x', '-F', 'json', '-f'] + ids
```
Please format this script with black for consistency with the other scripts.
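For reference, a sketch of roughly what black's output would look like for this snippet; it mainly normalizes the string quoting. The enclosing function signature shown here is assumed for illustration, not taken from the actual file:

```python
import sys


def qstat_cmd(ids):
    """Gets the command line for qstat."""
    if not ids:
        sys.exit(0)
    return ["qstat", "-x", "-F", "json", "-f"] + ids
```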
@johnyaku thank you for the patch!

Thanks for the review @macklin-10x

@adam-azarchs can you please take a look over the changes and see if this is ready to merge?
adam-azarchs left a comment:
My remaining comments are minor nitpicks. If you have a chance to address them soon, great; otherwise I'll merge it either way before we cut another release.
```
# is sufficient, but this can often be reduced to 4 hours or less if
# '--maxjobs' is at least 10.
```
While it's probably fine to set 4 hours regardless, this comment is misleading. The wall time being set in this template is for each individual job, not for the overall pipeline run. --maxjobs controls the number of jobs that can be queued simultaneously; increasing it may improve the overall pipeline wall time, but it would not be expected to improve the runtime of each individual job. More likely, in fact, it would increase I/O contention, which would increase the runtime of those jobs.
I haven't seen any jobs run for more than an hour in cluster mode, but I'm happy to be educated here. My naive assumption was that --maxjobs affected the number (and size) of the chunks when input data is split. If I'm wrong, then you can revert the second half of this change.
My reason for reducing this is that on our HPC, and perhaps on PBS Pro systems more generally, account quota is "reserved" based on resources × walltime. We often run dozens of cellranger or spaceranger jobs in parallel, so requesting too much walltime can result in all of our quota getting reserved, even though we only get billed for actual usage. Hogging quota like this can impact the rest of the team.
Perhaps: "24 hours (24:00:00) is sufficient, but this can be adjusted by comparing against actual usage."
In our system, PBS Pro dumps actual usage information to the _stdout files.
--maxjobs controls a semaphore limiting the number of jobs which are queued to the cluster at a time. It does not affect the size or number of jobs in total. Together with --jobinterval, the intent is to prevent a single pipeline job from slamming a cluster hard enough to make sysadmins angry at our users.
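A minimal, conceptual sketch of that throttling behaviour (this is not martian's actual implementation; the numbers and the qsub placeholder are purely illustrative):

```python
import threading
import time

MAXJOBS = 10       # analogous to --maxjobs: at most this many jobs in flight at once
JOBINTERVAL = 0.1  # analogous to --jobinterval: pause between submissions (seconds)

slots = threading.Semaphore(MAXJOBS)
submit_lock = threading.Lock()


def run_chunk(chunk_id):
    with slots:                      # block until one of the MAXJOBS slots is free
        with submit_lock:
            time.sleep(JOBINTERVAL)  # space out submissions to avoid hammering the scheduler
            print(f"qsub chunk {chunk_id}")  # placeholder for the real submission
        time.sleep(0.5)              # placeholder for the chunk's runtime on the cluster
    # the slot is released here, allowing the next chunk to be queued


threads = [threading.Thread(target=run_chunk, args=(i,)) for i in range(30)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note that the total number of chunks (30 in this sketch) is fixed by the pipeline's splits; --maxjobs only changes how many are queued at any one time.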
I have not previously encountered a cluster that charged based on reserved wall time as opposed to actually-used wall time; if you have that then more careful accounting is certainly in order! You can get the actual wall time (as measured by the job itself, so might be a slight underestimate as far as the cluster is concerned) from the _perf json file at the end of the run, though care must be taken when interpreting what that means in anything but the leaf nodes of the graph.
In our internal infrastructure, we usually run on spot instances in AWS, which are prone to involuntary preemption, so we aim to keep individual jobs under 1 hour (for most datasets anyway) to avoid losing too much work when one of them fails this way and needs to be restarted. I think in most cases very few jobs will even exceed 15 minutes, depending on the performance of your hardware (the slow stages tend to be mostly I/O limited).
Thanks for the clarification. We only get billed for actual usage, but potential usage gets "reserved" until the jobs finish. 1000 jobs (from multiple parallel runs) all claiming 24 hours walltime can result in all our quota getting reserved, which prevents new jobs from queuing, breaking the pipeline orchestrating the parallel runs.
This PR closes #150.

pbs_queue.py borrows heavily from slurm_queue.py and sge_queue.py, with adaptations for PBS Pro. These changes have been successfully tested on the Gadi HPC system at the National Computational Infrastructure in Australia by running cellranger 9.0.1 on 125 captures.
Checks
pylint: ✅
pyformat: ✅