
Intermittent 'Error: Connection is closed' during shutdown (BullMQ + ioredis) #3546

@sergioaafreitas

Description

Upstream report: intermittent "Error: Connection is closed" during shutdown (BullMQ + ioredis)

Target: bullmq repository (https://github.com/taskforcesh/bullmq)

Short summary

We observe an intermittent unhandled rejection "Error: Connection is closed." during shutdown in our test runs. The stack points into ioredis internals, but the clients involved are created by BullMQ (Queue/Worker/QueueEvents) using a shared ioredis connection. Tests pass, but the test runner (Vitest) exits with a non-zero status because of the unhandled rejection.

Why we think BullMQ may be involved

  • Our repro creates Queue, Worker and QueueEvents (BullMQ 5.63.0 observed) with a shared ioredis connection, closes them, and then disconnects the connection. The unhandled rejection appears while those resources are being closed (a sketch of the setup follows this list).
  • Debug output shows multiple internal ioredis clients with names such as bull:<base> being created and used concurrently.
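For reference, this is roughly the shape of the repro, a minimal sketch rather than the exact script (the queue name, payload and job handler are placeholders):

```ts
import IORedis from 'ioredis';
import { Queue, QueueEvents, Worker } from 'bullmq';

// Shared connection; maxRetriesPerRequest: null is what BullMQ expects for its blocking clients.
const connection = new IORedis({ maxRetriesPerRequest: null });

const queue = new Queue('repro', { connection });
const queueEvents = new QueueEvents('repro', { connection });
const worker = new Worker('repro', async () => 'ok', { connection });

async function main() {
  await queueEvents.waitUntilReady();
  await queue.add('ping', { hello: 'world' });

  // Shutdown order we use in the app: worker, events, queue, then the raw connection.
  await worker.close();
  await queueEvents.close();
  await queue.close();
  connection.disconnect(); // preferred over quit() to avoid late QUIT writes
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

As far as we understand, BullMQ duplicates a shared connection for its blocking clients (Worker, QueueEvents), which would explain the multiple bull:<base> clients in the debug output.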

Repro and logs (attached in our repo)

  • Minimal repro script (in our repo): apps/api/scripts/repro-ioredis-shutdown.js
  • Aggregated repro runs (debug): apps/api/test-results/repro-ioredis-shutdown.log
  • Vitest debug run where the unhandled rejection reproduced: apps/api/test-results/vitest-debug.log

Observed environment (from debug traces)

  • BullMQ: 5.63.0
  • ioredis: 5.8.2
  • Node.js: v22.x
  • OS: Linux

What we observe

  • Intermittent unhandled rejection with stack trace inside ioredis event_handler.js when closing connections. Example stack snippet:
Unhandled Rejection
Error: Connection is closed.
 ❯ close .../node_modules/ioredis/built/redis/event_handler.js:214:25
 ❯ Socket.<anonymous> .../node_modules/ioredis/built/redis/event_handler.js:181:20

What we've tried in our codebase

  • Attach defensive client.on('error') handlers to the explicit connection and to discovered internal clients created by BullMQ.
  • Track and remove those handlers on shutdown.
  • Prefer connection.disconnect() during shutdown (avoid QUIT writes), and call it after closing worker/events/queue.
  • Add a test-only Vitest unhandledRejection swallow as a temporary mitigation while investigating (see the sketch after this list).
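A condensed sketch of that defensive wiring; the helper names are illustrative, the real code lives in apps/api/src/queue/queue.service.ts and our Vitest setup file:

```ts
import type { Redis } from 'ioredis';

// Swallow only the shutdown race we are chasing; surface everything else.
const onClientError = (err: Error) => {
  if (!/Connection is closed/i.test(err.message)) {
    console.error('redis client error during shutdown', err);
  }
};

// Attach a defensive handler and return a cleanup callback we invoke on shutdown.
export function attachDefensiveHandler(client: Redis): () => void {
  client.on('error', onClientError);
  return () => client.removeListener('error', onClientError);
}

// Test-only mitigation (registered from the Vitest setup file) while the root cause is investigated.
if (process.env.VITEST) {
  process.on('unhandledRejection', (reason) => {
    if (reason instanceof Error && /Connection is closed/i.test(reason.message)) {
      return; // ignore the known shutdown race only
    }
    throw reason;
  });
}
```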

Notes and hypotheses

  • The failure looks timing-sensitive: our minimal repro, run 20 times, did not reproduce the error, but a full Vitest run with DEBUG logs reproduced it once.
  • Possible root causes:
    • BullMQ may be issuing Redis commands (internal clients) after the shared connection is being closed.
    • ioredis may be emitting an error from a low-level handler that isn't being routed to the attached error listeners in time.

Request / suggested next steps for maintainers

  1. Review the attached repro script and logs and try running the repro under a test runner (Vitest/Jest) to exercise the same timing; the failure is intermittent and timing-sensitive.
  2. Look at internal client lifecycle in BullMQ: consider whether internal clients can do late writes while the shared connection is being disconnected, or whether a more explicit shutdown order is needed.
  3. If helpful, we can try a small patch in our repo to force explicit disconnect() calls on internal clients before connection.disconnect() and report back.

If maintainers prefer the issue to be opened on ioredis instead, we can move the repro there — please advise which repo is the right owner for this race.

-- repo: sergioaafreitas/octaanalysis
-- repro script: apps/api/scripts/repro-ioredis-shutdown.js
-- logs: apps/api/test-results/repro-ioredis-shutdown.log and apps/api/test-results/vitest-debug.log

Latest attempts (local mitigation added in repo)

  • We added an aggressive local mitigation in apps/api/src/queue/queue.service.ts (sketched below) that:
    • discovers the internal ioredis clients created by BullMQ and attaches error listeners,
    • and on shutdown explicitly calls disconnect() on those discovered internal clients and removes the listeners before disconnecting the shared connection.
  • After applying this mitigation we ran a single Vitest run with debug enabled and saved the output to apps/api/test-results/vitest-after-mitigation.log. The unhandled rejection still appeared in that run and Vitest exited with a non-zero status.
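A simplified sketch of that mitigation; the class shape and method names are illustrative, and reaching the internal clients through BullMQ's client getters is our assumption about the cleanest way to discover them:

```ts
import type { Redis } from 'ioredis';
import { Queue, QueueEvents, Worker } from 'bullmq';

export class QueueService {
  private cleanups: Array<() => void> = [];
  private internalClients: Redis[] = [];

  constructor(
    private readonly connection: Redis,
    private readonly queue: Queue,
    private readonly worker: Worker,
    private readonly queueEvents: QueueEvents,
  ) {}

  // Resolve the clients BullMQ actually uses so we can watch them and later close them ourselves.
  async trackInternalClients(): Promise<void> {
    const clients = (await Promise.all([
      this.queue.client,
      this.worker.client,
      this.queueEvents.client,
    ])) as Redis[];
    for (const client of clients) {
      const handler = (err: Error) => console.error('internal redis client error', err);
      client.on('error', handler);
      this.internalClients.push(client);
      this.cleanups.push(() => client.removeListener('error', handler));
    }
  }

  async shutdown(): Promise<void> {
    await this.worker.close();
    await this.queueEvents.close();
    await this.queue.close();

    // Force-disconnect any internal clients that are still live, drop our listeners,
    // and only then disconnect the shared connection.
    for (const client of this.internalClients) {
      if (client.status !== 'end') client.disconnect();
    }
    for (const cleanup of this.cleanups) cleanup();
    this.connection.disconnect();
  }
}
```

Even with this in place the rejection reproduced once (see vitest-after-mitigation.log), which is why we suspect a late write from a client or in-flight command that this approach does not reach.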

Attachments in the repo now include:

  • apps/api/test-results/vitest-after-mitigation.log (Vitest run after local mitigation)

The three logs (repro loop, vitest-debug, vitest-after-mitigation) are available in the repo to help triage.
