The job ID 0 infinite loop, deep troubleshooting and solution / suggestion #3518

@MikaelX

Job ID 0 Infinite Loop with Complex Redis Connection Configuration

Issue Summary

When using BullMQ with complex Redis connection configurations, workers can get stuck in an infinite loop trying to process a job with ID 0, producing a continuous stream of "Unknown code -1 error for 0. moveToFinished" errors.

Environment

  • BullMQ Version: 5.61.2 (also affects 4.x versions)
  • Node.js Version: 18.x
  • Redis Version: 7.x
  • Operating System: Linux (WSL2)

Problem Description

Symptoms

  • Worker starts successfully but immediately begins processing job ID 0
  • Continuous error messages: Unknown code -1 error for 0. moveToFinished
  • Worker never processes actual jobs
  • Infinite loop that cannot be stopped without killing the process

Root Cause

The issue occurs when the Redis connection configuration includes multiple conflicting settings that interfere with BullMQ's internal marker processing system. Specifically:

  1. Complex connection pooling settings that conflict with BullMQ's operations
  2. Multiple retry delay configurations (retryDelayOnFailover, retryDelayOnClusterDown) that confuse BullMQ's retry logic
  3. Connection optimization settings that interfere with BullMQ's marker handling
  4. maxRetriesPerRequest missing or set to anything other than null

Reproduction Steps

❌ Problematic Configuration (Causes Infinite Loop)

import { Worker } from 'bullmq'
import IORedis from 'ioredis'

// This configuration causes the job ID 0 infinite loop
const redis = new IORedis({
  host: 'localhost',
  port: 6379,
  db: 1,
  maxRetriesPerRequest: 3, // ❌ WRONG - should be null
  
  // These additional settings cause conflicts
  lazyConnect: true,
  keepAlive: 30000,
  connectTimeout: 10000,
  commandTimeout: 5000,
  retryDelayOnFailover: 100,
  enableReadyCheck: false,
  maxRedirections: 16,
  enableOfflineQueue: false,
  enableAutoPipelining: true,
  maxLoadingTimeout: 5000,
  family: 4,
  keyPrefix: '',
  retryDelayOnClusterDown: 300,
  stringNumbers: true,
  dropBufferSupport: false
})

const worker = new Worker('test-queue', async (job) => {
  console.log(`Processing job: ${job.name} (ID: ${job.id})`)
  return { success: true }
}, {
  connection: redis,
  prefix: 'myapp'
})

✅ Working Configuration (No Issues)

import { Worker } from 'bullmq'
import IORedis from 'ioredis'

// This configuration works perfectly
const redis = new IORedis({
  host: 'localhost',
  port: 6379,
  db: 1,
  maxRetriesPerRequest: null // ✅ CORRECT - required for BullMQ
})

const worker = new Worker('test-queue', async (job) => {
  console.log(`Processing job: ${job.name} (ID: ${job.id})`)
  return { success: true }
}, {
  connection: redis,
  prefix: 'myapp'
})

Investigation Details

What We Discovered

  1. BullMQ 3.15.0 works perfectly - No job ID 0 issues
  2. BullMQ 4.x+ has the issue - Occurs with complex Redis configurations
  3. Clean test projects work - Simple Redis config works fine
  4. The issue is configuration-specific - Not a fundamental BullMQ bug

Technical Analysis

  • The addBaseMarkerIfNeeded Lua script creates a marker with ID "0" in Redis
  • With complex Redis configurations, BullMQ incorrectly processes this marker as a job
  • The marker should only be used as a wake-up signal, not processed as actual job data
  • Simple Redis configurations allow BullMQ to handle markers correctly

Expected Behavior

  • Worker should start cleanly and wait for actual jobs
  • No job ID 0 processing
  • No infinite error loops
  • Worker should process real jobs normally

Actual Behavior

  • Worker immediately tries to process job ID 0
  • Continuous "Unknown code -1 error for 0. moveToFinished" errors
  • Infinite loop that prevents normal operation
  • Worker never processes actual jobs
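
Until the configuration is fixed, one way to keep the process from spinning forever is a small guard that counts identical consecutive error messages and trips after a threshold. This is only a sketch: RepeatedErrorBreaker is a hypothetical helper, not part of BullMQ, and it assumes the repeated message actually reaches application code (for example through the worker's 'error' event, which I have not verified for this specific loop).

```typescript
// Hypothetical helper, not part of BullMQ: trips once the same error
// message has been seen `threshold` times in a row.
class RepeatedErrorBreaker {
  private lastMessage: string | null = null;
  private count = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Record one error message; returns true when the breaker has tripped.
  record(message: string): boolean {
    if (message === this.lastMessage) {
      this.count += 1;
    } else {
      // A different message restarts the count.
      this.lastMessage = message;
      this.count = 1;
    }
    return this.count >= this.threshold;
  }
}

// Simulate the error stream described above.
const breaker = new RepeatedErrorBreaker(3);
const loopMessage = 'Unknown code -1 error for 0. moveToFinished';
let tripped = false;
for (let i = 0; i < 3; i++) {
  tripped = breaker.record(loopMessage);
}
console.log(`breaker tripped: ${tripped}`);

// Possible wiring (assumption, untested against the loop itself):
//   worker.on('error', (err) => {
//     if (breaker.record(err.message)) void worker.close()
//   })
```

The threshold is arbitrary here. Something that tolerates a few transient errors but still stops a tight loop quickly is probably what you want.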

Workaround

Use a minimal Redis connection configuration with only essential settings:

const redis = new IORedis({
  host: 'localhost',
  port: 6379,
  db: 1,
  maxRetriesPerRequest: null // Critical for BullMQ
})
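
If the existing options object is built elsewhere (shared config, environment parsing), the same workaround can be applied mechanically. The sketch below is a hypothetical helper, not a BullMQ or ioredis API: it keeps only host, port and db and forces maxRetriesPerRequest to null. The key list is an assumption based on the working configuration above, so extend it if your deployment needs more (credentials, TLS, and so on).

```typescript
// Hypothetical helper, not part of BullMQ or ioredis.
type RedisOptionsLike = Record<string, unknown>;

// Keys kept by the workaround; everything else is dropped.
const ESSENTIAL_KEYS = ['host', 'port', 'db'] as const;

function toMinimalRedisOptions(options: RedisOptionsLike): RedisOptionsLike {
  const minimal: RedisOptionsLike = {};
  for (const key of ESSENTIAL_KEYS) {
    if (key in options) {
      minimal[key] = options[key];
    }
  }
  // Always forced to null, whatever the input contained.
  minimal['maxRetriesPerRequest'] = null;
  return minimal;
}

// Example: reduce a subset of the problematic configuration.
const minimalOptions = toMinimalRedisOptions({
  host: 'localhost',
  port: 6379,
  db: 1,
  maxRetriesPerRequest: 3,
  enableOfflineQueue: false,
  commandTimeout: 5000,
});
console.log(minimalOptions);
```

The result can then be passed to new IORedis(...) in place of the original object.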

Suggested Solution

  1. Document the minimal Redis configuration requirements in BullMQ documentation
  2. Add validation warnings when complex Redis configurations are detected
  3. Improve error handling for job ID 0 scenarios to prevent infinite loops
  4. Add configuration examples showing what works vs. what doesn't
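
To make suggestion 2 concrete, here is a rough sketch of what such a validation pass could look like. findRiskyRedisOptions is a hypothetical function, not something BullMQ ships, and the "safe" key list is an assumption taken from the working configuration in this report rather than from any analysis of which individual option triggers the loop.

```typescript
// Hypothetical validator; BullMQ does not ship this function.
type RedisConfigLike = Record<string, unknown>;

// Assumption: only the options from the working configuration are treated as safe.
const KNOWN_SAFE_KEYS = new Set(['host', 'port', 'db', 'maxRetriesPerRequest']);

function findRiskyRedisOptions(options: RedisConfigLike): string[] {
  const warnings: string[] = [];
  // Covers both a missing value (undefined) and a wrong one (e.g. 3).
  if (options['maxRetriesPerRequest'] !== null) {
    warnings.push('maxRetriesPerRequest should be null for BullMQ workers');
  }
  for (const key of Object.keys(options)) {
    if (!KNOWN_SAFE_KEYS.has(key)) {
      warnings.push(`option "${key}" is outside the minimal configuration reported to work`);
    }
  }
  return warnings;
}

// Example: a subset of the problematic configuration.
const configWarnings = findRiskyRedisOptions({
  host: 'localhost',
  port: 6379,
  db: 1,
  maxRetriesPerRequest: 3,
  enableOfflineQueue: false,
});
for (const warning of configWarnings) {
  console.warn(warning);
}
```

This only warns. It deliberately does not reject the configuration, since some of the flagged options may be harmless on their own.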

Additional Information

  • This issue affects both BullMQ 4.x and 5.x versions
  • The problem is not in BullMQ's core functionality but in how it handles complex Redis configurations
  • Simple Redis configurations work perfectly across all BullMQ versions
  • The issue is particularly common when migrating from older Redis connection patterns

Related

  • This issue is related to Redis connection configuration best practices
  • Similar issues may occur with other Redis optimization libraries
  • The problem highlights the importance of letting BullMQ handle its own optimizations

Note: This issue was discovered during production troubleshooting and has been resolved by simplifying the Redis connection configuration. The fix has been tested and verified to work correctly.
