Skip to content

Allow min worker queue length of 0 #9116

@super-ibby

Description

@super-ibby

My use case for Dask is primarily for orchestrating the execution of long running functions. By long running, I mean minutes, and sometimes hours. Tasks can be submitted to a cluster in waves, eventually saturating all workers, and causing some shorter running tasks to get queued behind very long-running tasks. Worker stealing is moot given all workers are saturated.

I would like to prevent eager assignment of ready tasks to worker queues, allowing tasks to build up on the scheduler. Currently, the minimum worker queue achievable is 1 (i.e, via a worker-saturation setting <= 1.0)). This appears to be controlled via distributed.scheduler._task_slots_available():

def _task_slots_available(ws: WorkerState, saturation_factor: float) -> int:
    """Number of tasks that can be sent to this worker without oversaturating it"""
    assert not math.isinf(saturation_factor)
    return max(math.ceil(saturation_factor * ws.nthreads), 1) - (
        len(ws.processing) - len(ws.long_running)
    )

All I ask is to expose a setting to allow a floor of 0 here. Thanks!

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementImprove existing functionality or make things work better

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions