scheduler: incorrect scheduling of batch job allocations on drain #26929

@chrisroberts

Description

Nomad's scheduling of batch job allocations during a node drain is currently inconsistent with the documented behavior. Per the documentation, batch allocations on a draining node should be stopped with a status of complete once the drain deadline is reached, and should not be rescheduled onto other nodes.

The current drain behavior does not match this.

To document the current behavior, a cluster with 3 agents will be used along with the simple jobspec below defining a batch job:

batch jobspec

```hcl
job "sleep-job" {
  type = "batch"

  group "sleeper" {
    count = 5

    ephemeral_disk {
      size = 10
    }

    task "do_sleep" {
      driver = "raw_exec"

      logs {
        disabled      = true
        max_files     = 1
        max_file_size = 1
      }

      config {
        command = "sleep"
        args    = ["1d"]
      }

      resources {
        memory = 10
        cpu    = 5
      }
    }

    task "extra_sleep" {
      driver = "raw_exec"

      logs {
        disabled      = true
        max_files     = 1
        max_file_size = 1
      }

      config {
        command = "sleep"
        args    = ["1d"]
      }

      resources {
        memory = 10
        cpu    = 5
      }
    }
  }
}
```
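For reference, the reproduction can be driven with the standard Nomad CLI (the jobspec filename is an assumption):

```shell
# Register the batch job (filename assumed to be sleep-job.nomad.hcl)
nomad job run sleep-job.nomad.hcl

# Show the job summary and allocation list used below
nomad job status sleep-job
```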

drain behavior

After running the job, the initial status is:

```
Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         5        0       0         0     0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
25f432b7  490b97bb  sleeper     0        run      running  3s ago   2s ago
a8eb00d4  717d40fd  sleeper     0        run      running  3s ago   2s ago
d05c5866  52a010ff  sleeper     0        run      running  3s ago   2s ago
dffa4043  490b97bb  sleeper     0        run      running  3s ago   2s ago
ec349a28  52a010ff  sleeper     0        run      running  3s ago   2s ago
```
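The drain in the next step can be triggered with the Nomad CLI, using the node ID prefix from the allocation list above:

```shell
# Enable draining on node 490b97bb with a 2 second deadline;
# once the deadline passes, remaining allocations are stopped
nomad node drain -enable -deadline 2s 490b97bb
```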

Now, draining node 490b97bb with a deadline of 2s results in:

```
Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost  Unknown
sleeper     0       0         5        2       0         0     0

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created   Modified
7ab9320e  52a010ff  sleeper     0        run      running  3s ago    2s ago
ccb7284f  717d40fd  sleeper     0        run      running  3s ago    2s ago
25f432b7  490b97bb  sleeper     0        stop     failed   2m6s ago  3s ago
a8eb00d4  717d40fd  sleeper     0        run      running  2m6s ago  2m5s ago
d05c5866  52a010ff  sleeper     0        run      running  2m6s ago  2m5s ago
dffa4043  490b97bb  sleeper     0        stop     failed   2m6s ago  2s ago
ec349a28  52a010ff  sleeper     0        run      running  2m6s ago  2m5s ago
```

The two allocations that were running on node 490b97bb now have a status of failed and were rescheduled. The expected behavior is for those two allocations to have a status of complete and not be rescheduled.
