Skip to content

[BUG]: Docker@2 self-hosted parallel build hangs on DONE #21134

@mushrowan

Description

@mushrowan

New issue checklist

Task name

Docker@2

Task version

2.256.1

Issue Description

I have a pipeline stage which builds two images in parallel. After a docker daemon restart, this parallel build will work a semi-random number of times (usually about 3 times) before starting to invariably hang after the build is DONE, with the build step never being marked as complete in ADO. Temporarily fixed by restarting the docker daemon on the build agent container host, but starts to hang again after roughly 3 pipeline runs.

These builds run on the same server, on build agents in Docker containers. Docker compose file is as follows:

name: agents
services:
  agent:
    image: [redacted]/agent:latest
    restart: always
    dns: 1.1.1.1
    environment:
      - AZP_URL=[redacted]
      - AZP_TOKEN=[redacted]
      - AZP_POOL=pool
      - AZURE_DEVOPS_EXT_PAT=[redacted]
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    scale: 5

Possibly relevant is the following dockerd journald entry which seems to coincide with the errors:
http2: server: error reading preface from client @: read unix /run/docker.sock->@

The pipeline jobs that run in parallel:

      - job: bot
        steps:
          - task: Docker@2
            displayName: Build bot
            inputs:
              command: build
              arguments: --no-cache --platform linux/amd64,linux/arm64
              buildContext: $(VSProjName)
              repository: $(DockerHubUser)/bot
              tags: $(tag)
      - job: tdk
        steps:
          - task: Docker@2
            displayName: Build tdk
            inputs:
              command: build
              arguments: --no-cache --platform linux/amd64,linux/arm64
              dockerfile: '**/DockerfileTDK'
              buildContext: $(VSProjName)
              repository: $(DockerHubUser)/tdk
              tags: $(tag)

This issue only affects this build, which is the only pipeline with parallel build steps. It was consistently working previously, but stopped working in the past week or so. i'm unsure what changed to cause this issue to start occurring.

Environment type (Please select at least one enviroment where you face this issue)

  • Self-Hosted
  • Microsoft Hosted
  • VMSS Pool
  • Container

Azure DevOps Server type

dev.azure.com (formerly visualstudio.com)

Azure DevOps Server Version (if applicable)

No response

Operation system

Ubuntu 22.04 containers running on 24.04 server (have tested running the containers on 24.04 - this does not fix the issue

Relevant log output

dockerd journald entry:
`http2: server: error reading preface from client @: read unix /run/docker.sock->@`

Pipeline logs (final block)

#19 exporting to image
#19 exporting layers
#19 exporting layers 4.7s done
#19 exporting manifest sha256:[redacted] done
#19 exporting config sha256:[redacted] done
#19 exporting attestation manifest sha256:[redacted] done
#19 exporting manifest sha256:[redacted] done
#19 exporting config sha256:[redacted] done
#19 exporting attestation manifest sha256:[redacted] done
#19 exporting manifest list sha256:[redacted] done
#19 naming to [redacted] done
#19 unpacking to [redacted]
#19 unpacking to [redacted] 0.7s done
#19 DONE 5.5s

Full task logs with system.debug enabled

No response

Repro steps

Unsure how to reproduce

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions