resubmit to queue with new job resources #361

@msiron-entalpic

Description

One useful feature would be the ability to resubmit a job to the queue with new resources. Sometimes on SLURM clusters, we realize after submitting that some partitions are full while others are available. Currently, we can update a job's resources with something like this:

jf job set resources -did 541 '{"cpus_per_task": 56, "partition": "new_partition"}'

However, this will not go through if the job is already submitted to SLURM and is just waiting for SLURM to schedule it. We also might not be able to edit the SLURM partition directly if the requested resources differ between partitions. Instead, jobflow would return an error similar to:

[11:15:29] ERROR    Error while setting for job 541
                    ValueError: Job in state UPLOADED. The action cannot be performed

or:

[11:18:55] ERROR    Error while setting for job 541
                    ValueError: Job in state CHECKED_OUT. The action cannot be performed

It would be nice to have a feature that resubmits a SLURM job with a new resource allocation if it's already waiting to be scheduled/submitted. However, this might involve cancelling the corresponding SLURM job as well.
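
To make the request concrete, here is a rough sketch of the cancel-then-update sequence I have in mind, written as a small Python helper that only builds the commands and does not run them. The helper names (`build_resubmit_plan`, `render`) and the SLURM job id `123456` are made up for illustration. `scancel` is the standard SLURM cancel command, and the `jf job set resources` call is the one shown above. The step that brings the job back to a state where the resources can be changed is not included, since that is the missing piece this issue asks for.

```python
import json
import shlex


def build_resubmit_plan(db_id, slurm_job_id, new_resources):
    """Build the commands for a manual cancel-then-update workaround.

    Hypothetical helper: nothing is executed, the caller decides what to run.
    As it stands, the second command still fails with the ValueError shown
    above unless the job is first returned to a state that accepts the change.
    """
    plan = []
    if slurm_job_id is not None:
        # Remove the pending job from the SLURM queue.
        plan.append(["scancel", str(slurm_job_id)])
    # Same command as in the description, with the JSON payload generated
    # from a dict instead of being typed by hand.
    plan.append(
        ["jf", "job", "set", "resources", "-did", str(db_id),
         json.dumps(new_resources)]
    )
    return plan


def render(plan):
    """Turn each command into a shell-quoted string for copy/paste."""
    return [shlex.join(cmd) for cmd in plan]


if __name__ == "__main__":
    resources = {"cpus_per_task": 56, "partition": "new_partition"}
    # 123456 is a placeholder SLURM job id, not taken from a real queue.
    for line in render(build_resubmit_plan(541, 123456, resources)):
        print(line)
```

Ideally a single `jf` command would do all of this in one go, including the cancellation on the SLURM side.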
