
Conversation

chrisroberts (Member) commented Oct 18, 2025

Description

Allocations of batch jobs have two specific documented behaviors:

First, on node drain, the allocation is allowed to complete unless
the deadline is reached, at which point the allocation is killed. The
allocation is not replaced.

Second, when using the alloc stop command, the allocation is
stopped and then rescheduled according to its reschedule policy.

This update removes the change introduced in dfa07e1 (#26025)
that forced batch job allocations into a failed state when
migrating. The behavior described in the issue it was attempting
to resolve was itself incorrect. The reconciler has been adjusted
to properly handle batch job allocations as documented.

An important addition to note: a new eval trigger reason

  • EvalTriggerAllocReschedule

This is added to provide better information to the user. It is
shown and explained in the last examples below.

Testing & Reproduction steps

batch jobspec
job "sleep-job" {
  type = "batch"

  group "sleeper" {
    count = 5

    reschedule {
      attempts       = 3
      interval       = "15m"
      delay          = "4m"
      delay_function = "constant"
      max_delay      = "5m"
      unlimited      = false
    }

    ephemeral_disk {
      size = 10
    }

    task "do_sleep" {
      driver = "raw_exec"

      logs {
        disabled      = true
        max_files     = 1
        max_file_size = 1
      }

      config {
        command = "sleep"
        args    = ["1d"]
      }

      resources {
        memory = 10
        cpu    = 5
      }
    }

    task "extra_sleep" {
      driver = "raw_exec"

      logs {
        disabled      = true
        max_files     = 1
        max_file_size = 1
      }

      config {
        command = "sleep"
        args    = ["2d"]
      }

      resources {
        memory = 10
        cpu    = 5
      }
    }
  }
}

Behavior on main

alloc stop command

This shows the behavior of the alloc stop command on a batch job allocation. The job is started and then a single allocation is stopped:

➜ nomad run sleep.hcl

==> View this job in the Web UI: http://10.86.244.24:4646/ui/jobs/sleep-job@default

==> 2025-10-17T17:51:06-07:00: Monitoring evaluation "40250ff8"
    2025-10-17T17:51:06-07:00: Evaluation triggered by job "sleep-job"
    2025-10-17T17:51:07-07:00: Allocation "71d6882e" created: node "0e569f27", group "sleeper"
    2025-10-17T17:51:07-07:00: Allocation "8e671f60" created: node "0e569f27", group "sleeper"
    2025-10-17T17:51:07-07:00: Allocation "c72be233" created: node "b0dccea3", group "sleeper"
    2025-10-17T17:51:07-07:00: Allocation "ca3f8856" created: node "6c4fcb70", group "sleeper"
    2025-10-17T17:51:07-07:00: Allocation "421b7a60" created: node "b0dccea3", group "sleeper"
    2025-10-17T17:51:07-07:00: Evaluation status changed: "pending" -> "complete"
==> 2025-10-17T17:51:07-07:00: Evaluation "40250ff8" finished with status "complete"

➜ nomad status sleep-job
ID            = sleep-job
Name          = sleep-job
Submit Date   = 2025-10-17T17:51:06-07:00
Type          = batch
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         5        0       0         0     0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
421b7a60  b0dccea3  sleeper     0        run      running  3s ago   2s ago
71d6882e  0e569f27  sleeper     0        run      running  3s ago   2s ago
8e671f60  0e569f27  sleeper     0        run      running  3s ago   2s ago
c72be233  b0dccea3  sleeper     0        run      running  3s ago   2s ago
ca3f8856  6c4fcb70  sleeper     0        run      running  3s ago   2s ago

➜ nomad alloc stop 42
==> 2025-10-17T17:51:31-07:00: Monitoring evaluation "855d8b1a"
    2025-10-17T17:51:31-07:00: Evaluation triggered by job "sleep-job"
    2025-10-17T17:51:32-07:00: Allocation "8b1af122" created: node "6c4fcb70", group "sleeper"
    2025-10-17T17:51:32-07:00: Evaluation status changed: "pending" -> "complete"
==> 2025-10-17T17:51:32-07:00: Evaluation "855d8b1a" finished with status "complete"

➜ nomad status sleep-job
ID            = sleep-job
Name          = sleep-job
Submit Date   = 2025-10-17T17:51:06-07:00
Type          = batch
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         5        1       0         0     0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
8b1af122  6c4fcb70  sleeper     0        run      running  3s ago   2s ago
421b7a60  b0dccea3  sleeper     0        stop     failed   29s ago  3s ago
71d6882e  0e569f27  sleeper     0        run      running  29s ago  28s ago
8e671f60  0e569f27  sleeper     0        run      running  29s ago  28s ago
c72be233  b0dccea3  sleeper     0        run      running  29s ago  28s ago
ca3f8856  6c4fcb70  sleeper     0        run      running  29s ago  28s ago

Here we can see that the alloc stop command leaves the allocation stopped in a failed state, and the allocation is immediately replaced. The desired behavior is that the allocation is stopped with a complete status and rescheduled according to its reschedule policy.
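For context, the server records this kind of operator intent on the allocation itself via DesiredTransition, which is what the reconciler change later in this PR inspects. A simplified sketch of the relevant structure (field and method names follow nomad/structs, trimmed down here; this is not the full definition):

package structs // simplified sketch, not the full Nomad definition

// DesiredTransition captures the operator's intent for an allocation.
type DesiredTransition struct {
	Migrate    *bool // set when a node drain wants the alloc moved
	Reschedule *bool // set when a reschedule of the alloc is requested
}

// ShouldReschedule reports whether a reschedule was requested.
func (d DesiredTransition) ShouldReschedule() bool {
	return d.Reschedule != nil && *d.Reschedule
}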

drain behavior

This shows the behavior of a node drain on batch job allocations. The job is started and then a single node is drained with a one second deadline:

➜ nomad run sleep.hcl

==> View this job in the Web UI: http://10.86.244.24:4646/ui/jobs/sleep-job@default

==> 2025-10-17T17:58:19-07:00: Monitoring evaluation "28b04ae3"
    2025-10-17T17:58:19-07:00: Evaluation triggered by job "sleep-job"
    2025-10-17T17:58:20-07:00: Allocation "8841e305" created: node "6c4fcb70", group "sleeper"
    2025-10-17T17:58:20-07:00: Allocation "de029dc7" created: node "6c4fcb70", group "sleeper"
    2025-10-17T17:58:20-07:00: Allocation "f33973b8" created: node "0e569f27", group "sleeper"
    2025-10-17T17:58:20-07:00: Allocation "2d9fb037" created: node "b0dccea3", group "sleeper"
    2025-10-17T17:58:20-07:00: Allocation "733eb34d" created: node "b0dccea3", group "sleeper"
    2025-10-17T17:58:20-07:00: Evaluation status changed: "pending" -> "complete"
==> 2025-10-17T17:58:20-07:00: Evaluation "28b04ae3" finished with status "complete"

➜ nomad status sleep-job
ID            = sleep-job
Name          = sleep-job
Submit Date   = 2025-10-17T17:58:19-07:00
Type          = batch
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         5        0       0         0     0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
2d9fb037  b0dccea3  sleeper     0        run      running  4s ago   3s ago
733eb34d  b0dccea3  sleeper     0        run      running  4s ago   3s ago
8841e305  6c4fcb70  sleeper     0        run      running  4s ago   3s ago
de029dc7  6c4fcb70  sleeper     0        run      running  4s ago   3s ago
f33973b8  0e569f27  sleeper     0        run      running  4s ago   3s ago


➜ nomad node drain -enable -yes -deadline 1s b0
2025-10-17T17:58:36-07:00: Ctrl-C to stop monitoring: will not cancel the node drain
2025-10-17T17:58:36-07:00: Node "b0dccea3-ab06-6141-474b-05f5892f72b8" drain strategy set
2025-10-17T17:58:38-07:00: Alloc "2d9fb037-5c72-786b-21c2-5e0938463f53" marked for migration
2025-10-17T17:58:38-07:00: Alloc "733eb34d-a409-6469-1245-8607a8c57804" marked for migration
2025-10-17T17:58:38-07:00: Drain complete for node b0dccea3-ab06-6141-474b-05f5892f72b8
2025-10-17T17:58:38-07:00: Alloc "2d9fb037-5c72-786b-21c2-5e0938463f53" draining
2025-10-17T17:58:38-07:00: Alloc "733eb34d-a409-6469-1245-8607a8c57804" draining
2025-10-17T17:58:39-07:00: Alloc "2d9fb037-5c72-786b-21c2-5e0938463f53" status running -> failed
2025-10-17T17:58:39-07:00: Alloc "733eb34d-a409-6469-1245-8607a8c57804" status running -> failed
2025-10-17T17:58:39-07:00: All allocations on node "b0dccea3-ab06-6141-474b-05f5892f72b8" have stopped

➜ nomad status sleep-job
ID            = sleep-job
Name          = sleep-job
Submit Date   = 2025-10-17T17:58:19-07:00
Type          = batch
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         5        2       0         0     0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
10065b8b  0e569f27  sleeper     0        run      running  5s ago   4s ago
9d99b920  0e569f27  sleeper     0        run      running  5s ago   4s ago
2d9fb037  b0dccea3  sleeper     0        stop     failed   25s ago  5s ago
733eb34d  b0dccea3  sleeper     0        stop     failed   25s ago  5s ago
8841e305  6c4fcb70  sleeper     0        run      running  25s ago  24s ago
de029dc7  6c4fcb70  sleeper     0        run      running  25s ago  24s ago
f33973b8  0e569f27  sleeper     0        run      running  25s ago  24s ago

The drain stops the two allocations on the node in a failed state and immediately places two new allocations. For drains, the allocations should instead be stopped with a complete status and not be replaced.

Behavior with this changeset

alloc stop command
➜ nomad run sleep.hcl

==> 2025-10-20T08:10:34-07:00: Monitoring evaluation "d89ce708"
    2025-10-20T08:10:34-07:00: Evaluation triggered by job "sleep-job"
    2025-10-20T08:10:35-07:00: Allocation "05ad7436" created: node "6c4fcb70", group "sleeper"
    2025-10-20T08:10:35-07:00: Allocation "7a1b5420" created: node "0e569f27", group "sleeper"
    2025-10-20T08:10:35-07:00: Allocation "995f5e33" created: node "b0dccea3", group "sleeper"
    2025-10-20T08:10:35-07:00: Allocation "a5fd7420" created: node "0e569f27", group "sleeper"
    2025-10-20T08:10:35-07:00: Allocation "c5c12c43" created: node "6c4fcb70", group "sleeper"
    2025-10-20T08:10:35-07:00: Evaluation status changed: "pending" -> "complete"
==> 2025-10-20T08:10:35-07:00: Evaluation "d89ce708" finished with status "complete"

➜ nomad status sleep-job
ID            = sleep-job
Name          = sleep-job
Submit Date   = 2025-10-20T08:10:34-07:00
Type          = batch
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         5        0       0         0     0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
05ad7436  6c4fcb70  sleeper     0        run      running  3s ago   2s ago
7a1b5420  0e569f27  sleeper     0        run      running  3s ago   2s ago
995f5e33  b0dccea3  sleeper     0        run      running  3s ago   2s ago
a5fd7420  0e569f27  sleeper     0        run      running  3s ago   2s ago
c5c12c43  6c4fcb70  sleeper     0        run      running  3s ago   2s ago

➜ nomad alloc stop 05
==> 2025-10-20T08:10:43-07:00: Monitoring evaluation "abb43bda"
    2025-10-20T08:10:43-07:00: Evaluation triggered by job "sleep-job"
    2025-10-20T08:10:44-07:00: Evaluation status changed: "pending" -> "complete"
==> 2025-10-20T08:10:44-07:00: Evaluation "abb43bda" finished with status "complete"

➜ nomad status sleep-job
ID            = sleep-job
Name          = sleep-job
Submit Date   = 2025-10-20T08:10:34-07:00
Type          = batch
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         4        0       1         0     0

Future Rescheduling Attempts
Task Group  Eval ID   Eval Time
sleeper     63d25748  3m47s from now

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created  Modified
05ad7436  6c4fcb70  sleeper     0        stop     complete  14s ago  4s ago
7a1b5420  0e569f27  sleeper     0        run      running   14s ago  13s ago
995f5e33  b0dccea3  sleeper     0        run      running   14s ago  13s ago
a5fd7420  0e569f27  sleeper     0        run      running   14s ago  13s ago
c5c12c43  6c4fcb70  sleeper     0        run      running   14s ago  13s ago

➜ nomad status sleep-job
ID            = sleep-job
Name          = sleep-job
Submit Date   = 2025-10-20T08:10:34-07:00
Type          = batch
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         5        0       1         0     0

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created    Modified
0befef56  b0dccea3  sleeper     0        run      running   3m56s ago  3m55s ago
05ad7436  6c4fcb70  sleeper     0        stop     complete  7m57s ago  7m47s ago
7a1b5420  0e569f27  sleeper     0        run      running   7m57s ago  7m56s ago
995f5e33  b0dccea3  sleeper     0        run      running   7m57s ago  7m56s ago
a5fd7420  0e569f27  sleeper     0        run      running   7m57s ago  7m56s ago
c5c12c43  6c4fcb70  sleeper     0        run      running   7m57s ago  7m56s ago

Now the allocation is stopped in a complete state, and a new allocation has not immediately replaced it. Instead, the allocation is rescheduled according to its reschedule policy, matching the documented behavior. Once the delayed evaluation executes, the replacement allocation is placed.
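Mechanically, instead of an immediate placement the reconciler emits a delayed follow-up evaluation whose WaitUntil is pushed out by the policy delay. A rough sketch of what such an eval looks like (field names are from nomad/structs; the exact construction inside the reconciler is simplified and partly assumed here):

package main // illustrative sketch only

import (
	"time"

	"github.com/hashicorp/nomad/helper/uuid"
	"github.com/hashicorp/nomad/nomad/structs"
)

// followupEval builds a delayed rescheduling eval for a stopped batch alloc.
func followupEval(job *structs.Job, now time.Time, delay time.Duration) *structs.Evaluation {
	return &structs.Evaluation{
		ID:                uuid.Generate(),
		Namespace:         job.Namespace,
		Priority:          job.Priority,
		Type:              job.Type, // "batch"
		TriggeredBy:       structs.EvalTriggerAllocReschedule, // new trigger reason in this PR
		JobID:             job.ID,
		Status:            structs.EvalStatusPending,
		StatusDescription: "created for delayed rescheduling",
		WaitUntil:         now.Add(delay), // e.g. the 4m constant delay in the jobspec above
	}
}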

drain behavior

This shows the behavior of a node drain on batch job allocations. The job is started and then a single node is drained with a one second deadline:

➜ nomad run sleep.hcl

==> 2025-10-20T08:21:36-07:00: Monitoring evaluation "ad5b6d81"
    2025-10-20T08:21:36-07:00: Evaluation triggered by job "sleep-job"
    2025-10-20T08:21:37-07:00: Allocation "f7af18cc" created: node "0e569f27", group "sleeper"
    2025-10-20T08:21:37-07:00: Allocation "7386d7b1" created: node "b0dccea3", group "sleeper"
    2025-10-20T08:21:37-07:00: Allocation "8392ca41" created: node "6c4fcb70", group "sleeper"
    2025-10-20T08:21:37-07:00: Allocation "8765c6ba" created: node "6c4fcb70", group "sleeper"
    2025-10-20T08:21:37-07:00: Allocation "d647f127" created: node "b0dccea3", group "sleeper"
    2025-10-20T08:21:37-07:00: Evaluation status changed: "pending" -> "complete"
==> 2025-10-20T08:21:37-07:00: Evaluation "ad5b6d81" finished with status "complete"

➜ nomad status sleep-job
ID            = sleep-job
Name          = sleep-job
Submit Date   = 2025-10-20T08:21:36-07:00
Type          = batch
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         5        0       0         0     0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
7386d7b1  b0dccea3  sleeper     0        run      running  4s ago   3s ago
8392ca41  6c4fcb70  sleeper     0        run      running  4s ago   3s ago
8765c6ba  6c4fcb70  sleeper     0        run      running  4s ago   3s ago
d647f127  b0dccea3  sleeper     0        run      running  4s ago   3s ago
f7af18cc  0e569f27  sleeper     0        run      running  4s ago   4s ago

➜ nomad node drain -enable -yes -deadline 1s b0
2025-10-20T08:22:11-07:00: Ctrl-C to stop monitoring: will not cancel the node drain
2025-10-20T08:22:11-07:00: Node "b0dccea3-ab06-6141-474b-05f5892f72b8" drain strategy set
2025-10-20T08:22:13-07:00: Alloc "7386d7b1-fe02-a718-58a5-54dcd196937c" marked for migration
2025-10-20T08:22:13-07:00: Alloc "d647f127-203f-9536-56ea-5f6ee595c493" marked for migration
2025-10-20T08:22:13-07:00: Drain complete for node b0dccea3-ab06-6141-474b-05f5892f72b8
2025-10-20T08:22:14-07:00: Alloc "7386d7b1-fe02-a718-58a5-54dcd196937c" draining
2025-10-20T08:22:14-07:00: Alloc "d647f127-203f-9536-56ea-5f6ee595c493" draining
2025-10-20T08:22:14-07:00: Alloc "7386d7b1-fe02-a718-58a5-54dcd196937c" status running -> complete
2025-10-20T08:22:14-07:00: Alloc "d647f127-203f-9536-56ea-5f6ee595c493" status running -> complete
2025-10-20T08:22:14-07:00: All allocations on node "b0dccea3-ab06-6141-474b-05f5892f72b8" have stopped

➜ nomad status sleep-job
ID            = sleep-job
Name          = sleep-job
Submit Date   = 2025-10-20T08:21:36-07:00
Type          = batch
Priority      = 50
Datacenters   = *
Namespace     = default
Node Pool     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         3        0       2         0     0

Allocations
ID        Node ID   Task Group  Version  Desired  Status    Created  Modified
7386d7b1  b0dccea3  sleeper     0        stop     complete  41s ago  4s ago
8392ca41  6c4fcb70  sleeper     0        run      running   41s ago  40s ago
8765c6ba  6c4fcb70  sleeper     0        run      running   41s ago  40s ago
d647f127  b0dccea3  sleeper     0        stop     complete  41s ago  4s ago
f7af18cc  0e569f27  sleeper     0        run      running   41s ago  41s ago

The drain stops the two allocations on the node in a complete state, and the allocations are not replaced. This matches the documented behavior.

New evaluation trigger reason

Nomad's current behavior when rescheduling an allocation is to assume the allocation being replaced has failed. When stopping an allocation, this results in an eval status like the following:

➜ nomad eval status 8dd
ID                 = 8dde8bd1
Create Time        = 24s ago
Modify Time        = 24s ago
Status             = pending
Status Description = created for delayed rescheduling
Type               = batch
TriggeredBy        = alloc-failure
Job ID             = sleep-job
Namespace          = default
...

The TriggeredBy value implies that the eval was triggered by an allocation failure, when it was actually triggered by the allocation being rescheduled due to the alloc stop command. To describe the reason more accurately, the EvalTriggerAllocReschedule constant was introduced and is used in this situation, giving the value alloc-reschedule as shown below:

➜ nomad eval status 440
ID                 = 44058981
Create Time        = 10s ago
Modify Time        = 10s ago
Status             = pending
Status Description = created for delayed rescheduling
Type               = batch
TriggeredBy        = alloc-reschedule
Job ID             = sleep-job
Namespace          = default
...
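For completeness, trigger reasons are plain string constants in nomad/structs; the new one sits alongside the existing reasons, roughly like this (the neighboring constant names exist in the current source, but their exact placement here is an assumption):

const (
	EvalTriggerAllocStop        = "alloc-stop"
	EvalTriggerRetryFailedAlloc = "alloc-failure"    // the reason shown in the first eval above
	EvalTriggerAllocReschedule  = "alloc-reschedule" // added in this changeset
)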

Links

Fixes #26929

Contributor Checklist

  • Changelog Entry If this PR changes user-facing behavior, please generate and add a
    changelog entry using the make cl command.
  • Testing Please add tests to cover any new functionality or to demonstrate bug fixes and
    ensure regressions will be caught.
  • Documentation If the change impacts user-facing functionality such as the CLI, API, UI,
    and job configuration, please update the Nomad website documentation to reflect this. Refer to
    the website README for docs guidelines. Please also consider whether the
    change requires notes within the upgrade guide.

Reviewer Checklist

  • Backport Labels Please add the correct backport labels as described by the internal
    backporting document.
  • Commit Type Ensure the correct merge method is selected which should be "squash and merge"
    in the majority of situations. The main exceptions are long-lived feature branches or merges where
    history should be preserved.
  • Enterprise PRs If this is an enterprise only PR, please add any required changelog entry
    within the public repository.
  • If a change needs to be reverted, we will roll out an update to the code within 7 days.

Changes to Security Controls

Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.

Review comments

  remaining = make(allocSet)
  for id, alloc := range set {
-   if !alloc.ServerTerminalStatus() {
+   if (alloc.Job.Type == structs.JobTypeBatch && !alloc.DesiredTransition.ShouldReschedule()) || !alloc.ServerTerminalStatus() {
Member:

We're keeping batch allocs if they're server-terminal and don't have desired-transition reschedule. Is this because of nomad alloc stop? I don't think those allocs are actually server-terminal until after they've already been through the scheduler once.

In any case, this weird conditional could definitely use a "why" comment.

Member Author:

Yes, that is correct that this is because of alloc stop. Without this addition to the conditional, when the future eval is run, no allocation will be placed because any existing complete allocations will be counted for the total. Filtering out those that are marked for being rescheduled allows them to actually be placed when the eval is run.
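Expanded for readability, the conditional above keeps an alloc in the remaining set (and therefore counted toward the group total) roughly like this (an illustrative fragment of the loop shown above, same logic as the one-liner):

// Keep server-terminal batch allocs so completed batch work is not
// replaced, but drop the ones explicitly marked for reschedule via
// alloc stop so the delayed eval can place their replacements.
keep := (alloc.Job.Type == structs.JobTypeBatch &&
	!alloc.DesiredTransition.ShouldReschedule()) ||
	!alloc.ServerTerminalStatus()
if keep {
	remaining[id] = alloc
}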


  if (a.DesiredStatus == AllocDesiredStatusStop && !a.LastRescheduleFailed()) ||
-   (a.ClientStatus != AllocClientStatusFailed && a.ClientStatus != AllocClientStatusLost) ||
+   (!isBatch && a.ClientStatus != AllocClientStatusFailed && a.ClientStatus != AllocClientStatusLost) ||
Member:

If I have a batch alloc that's complete, but not yet stopped on the server, this change will mean NextRescheduleTime potentially returns true for the eval where we process that update.

Member Author:

Adjusted this to check for rescheduled batch.

  as = as.filterByTerminal()
  desiredChanges := new(structs.DesiredUpdates)
  desiredChanges.Stop, allocsToStop = as.filterAndStopAll(a.clusterState)
+ // TODO(spox): what is with allocsToStop here? not appended, only last set returned?
Member:

Yikes, that seems wrong

Member Author:

Yeah, this is just a note for me to investigate a bit and spin out a separate PR.


Development

Successfully merging this pull request may close these issues.

scheduler: incorrect scheduling of batch job allocations on drain