Supervisor fails to delete a network and local mode push hangs #2370

Open
@majorz


On a CI/CD system where local mode is used a few times a week, balena push hangs because of the following problem with the supervisor/balenaEngine:

Device state apply error Error: Failed to apply state transition steps. (HTTP code 403) unexpected - error while removing network: network <NAME> id <ID> has active endpoints  Steps:["removeNetwork","removeNetwork"]

This is an instance of moby/moby#42119

It is a problem in Docker's libnetwork, where the internal state gets out of sync, possibly due to a race condition or an unclean exit. As a result, Docker refuses to delete the affected network.

The only workaround that worked was restarting the Docker daemon. I tried several less intrusive operations, but none of them helped: docker network prune --force, docker system prune --force, and adding a minimal container, attaching the network to it, and detaching it again to see whether the reference count would be cleared, among others.
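For a CI/CD setup, the workaround above could be automated roughly as follows. This is only a sketch under assumptions: the helper names `find_stuck_network` and `restart_engine_if_stuck` are hypothetical, the regular expression is modelled on the error text quoted in this issue, and the default restart command (`systemctl restart balena`) is an assumption about how the engine service is named on the device and may need adjusting.

```python
import re
import subprocess

# Pattern modelled on the error message quoted in this issue.
ACTIVE_ENDPOINTS_RE = re.compile(
    r"error while removing network: network (?P<name>\S+) "
    r"id (?P<id>\S+) has active endpoints"
)


def find_stuck_network(log_line):
    """Return (name, id) if the line reports the stale-endpoint error, else None."""
    match = ACTIVE_ENDPOINTS_RE.search(log_line)
    if match is None:
        return None
    return match.group("name"), match.group("id")


def restart_engine_if_stuck(
    log_line,
    restart_cmd=("systemctl", "restart", "balena"),  # assumed service name
    run=subprocess.run,
):
    """Restart the engine only when the known error is seen.

    Returns True if a restart was triggered, False otherwise.
    `run` is injectable so the logic can be exercised without touching a device.
    """
    if find_stuck_network(log_line) is None:
        return False
    run(list(restart_cmd), check=True)
    return True
```

Restarting the engine stops all running containers, so a wrapper like this is best limited to devices dedicated to local mode testing.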

I searched extensively for other solutions or workarounds but found none. The real fix needs to be in libnetwork, but the moby issue is stale.
