job-runner: timeout task does not cancel if other tasks hang #7917

@jelly

Description

When podman run --init gets stuck, the timeout task does not "kill the job runner", because other tasks can get stuck in gather_and_cancel, waiting for the cancellation requested by task.cancel() to complete.

For example:

await task <Task cancelling name='Task-4' coro=<run_container() running at /home/jelle/projects/cockpit-bots/lib/aio/job.py:125> wait_for=<Future cancelled>>
DEBUG:lib.aio.spawn:run(['podman', 'rm', '--force', '--time=0', '--cidfile=/tmp/tmpw4b91s99/cidfile'])
DEBUG:lib.aio.spawn:run: waiting for pid 64905
Error: reading CIDFile: open /tmp/tmpw4b91s99/cidfile: no such file or directory
DEBUG:lib.aio.spawn:run: pid 64905 exited, 125
DEBUG:lib.aio.spawn:spawn: waiting for pid 64890

Waiting for pid 64890 may take a long time, and the cancelled task does not finish until that wait returns.
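
One possible mitigation is to bound how long the runner waits for cancelled tasks. The sketch below is not the cockpit-bots code: `stuck_cleanup_task` and `cancel_with_deadline` are hypothetical names, and the slow cleanup is simulated with `asyncio.sleep` rather than a real podman process. It only illustrates that `asyncio.wait(..., timeout=...)` returns after the deadline even when a cancelled task is still busy cleaning up.

```python
import asyncio


async def stuck_cleanup_task(cleanup_seconds: float) -> None:
    """Hypothetical stand-in for a task whose cleanup on cancel is slow."""
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        # Simulates waiting for a child process that refuses to exit.
        await asyncio.sleep(cleanup_seconds)
        raise


async def cancel_with_deadline(tasks, deadline: float) -> set:
    """Cancel all tasks, but wait at most `deadline` seconds for them.

    Returns the tasks that were still pending when the deadline expired.
    """
    tasks = list(tasks)
    if not tasks:
        return set()  # asyncio.wait() rejects an empty collection
    for task in tasks:
        task.cancel()
    _done, pending = await asyncio.wait(tasks, timeout=deadline)
    return pending


async def demo() -> int:
    task = asyncio.create_task(stuck_cleanup_task(cleanup_seconds=5.0))
    await asyncio.sleep(0)  # let the task reach its first await
    pending = await cancel_with_deadline([task], deadline=0.1)
    still_stuck = len(pending)
    # A second cancel interrupts the simulated cleanup so the demo can exit.
    for leftover in pending:
        leftover.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return still_stuck


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a real runner, the tasks returned as still pending would presumably need a harder fallback, such as killing the child process directly, since a second cancel does not help if the task is blocked on a process that never exits.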

Labels: bug