Skip to content

Conversation

marisn
Copy link
Contributor

@marisn marisn commented Sep 2, 2025

This commit is a major overhaul of the mapcache_seed utility, addressing critical bugs in signal handling, multiprocessing, and thread synchronization that could cause deadlocks or unresponsive behavior.

The key changes are:

  1. Robust Signal Handling and Shutdown:

The Ctrl+C (SIGINT) handling has been completely rewritten to ensure a prompt and clean shutdown in all modes, fixing numerous deadlocks and race conditions.

  • For Multiprocessing:

    • pop_queue() now checks a sig_int_received flag at the start to ensure child processes exit immediately after finishing their current task.
    • push_queue()'s EINTR retry loop now also checks the signal flag, preventing the parent process from deadlocking on a full message queue after a signal has been received.
  • For Multithreading:

    • On SIGINT, the feed_worker now calls apr_queue_interrupt_all() to wake all blocked threads. pop_queue() correctly handles the resulting APR_EOF to stop the worker.
    • A race condition where a thread could start a new job just as a signal was received has been fixed by re-checking the signal flag after a work item is successfully popped from the queue.
  • Graceful vs. Urgent Shutdown:

    • The final loop in feed_worker() that sends STOP commands is now wrapped in an if(!sig_int_received) block. This ensures it only runs during a normal, graceful shutdown and is skipped on Ctrl+C, preventing a major deadlock.
  1. Multiprocessing Enhancements:

The -p mode has been significantly improved for stability and correctness.

  • IPC-based Logging: A new, dedicated System V message queue (log_msqid) has been implemented for logging in multiprocessing mode. This decouples logging from the main work queue, improving performance and stability. log_thread_fn and seed_worker were updated to use this new mechanism.
  • -p 1 Bugfix: A long-standing bug that caused the seeder to stall when using -p 1 has been fixed by changing all relevant logic guards from nprocesses > 1 to nprocesses >= 1.
  • IPC Creation: Message queues are now created using IPC_PRIVATE instead of ftok, which is more robust.
  1. Code Quality and Minor Fixes:
  • child_init is correctly called at start of each worker process after fork().
  • sig_int_received is now correctly typed as volatile sig_atomic_t.
  • memset() is now used to initialize structs before use.
  • The help text for the -p option has been clarified.

Co-authored with gemini-2.5-pro

This commit is a major overhaul of the mapcache_seed utility, addressing critical bugs in signal handling,
multiprocessing, and thread synchronization that could cause deadlocks or unresponsive behavior.

The key changes are:

  1. Robust Signal Handling and Shutdown:

  The Ctrl+C (SIGINT) handling has been completely rewritten to ensure a prompt and clean shutdown in all modes, fixing
  numerous deadlocks and race conditions.

   - For Multiprocessing:
     - pop_queue() now checks a sig_int_received flag at the start to ensure child processes exit immediately after
       finishing their current task.
     - push_queue()'s EINTR retry loop now also checks the signal flag, preventing the parent process from deadlocking on
       a full message queue after a signal has been received.

   - For Multithreading:
     - On SIGINT, the feed_worker now calls apr_queue_interrupt_all() to wake all blocked threads. pop_queue() correctly
       handles the resulting APR_EOF to stop the worker.
     - A race condition where a thread could start a new job just as a signal was received has been fixed by re-checking
       the signal flag after a work item is successfully popped from the queue.

   - Graceful vs. Urgent Shutdown:
     - The final loop in feed_worker() that sends STOP commands is now wrapped in an if(!sig_int_received) block. This
       ensures it only runs during a normal, graceful shutdown and is skipped on Ctrl+C, preventing a major deadlock.

  2. Multiprocessing Enhancements:

  The -p mode has been significantly improved for stability and correctness.

   - IPC-based Logging: A new, dedicated System V message queue (log_msqid) has been implemented for logging in
     multiprocessing mode. This decouples logging from the main work queue, improving performance and stability.
     log_thread_fn and seed_worker were updated to use this new mechanism.
   - `-p 1` Bugfix: A long-standing bug that caused the seeder to stall when using -p 1 has been fixed by changing all
     relevant logic guards from nprocesses > 1 to nprocesses >= 1.
   - IPC Creation: Message queues are now created using IPC_PRIVATE instead of ftok, which is more robust.

  3. Code Quality and Minor Fixes:

   - child_init is correctly called at start of each worker process after fork().
   - sig_int_received is now correctly typed as volatile sig_atomic_t.
   - memset() is now used to initialize structs before use.
   - The help text for the -p option has been clarified.

Co-authored with gemini-2.5-pro
@marisn
Copy link
Contributor Author

marisn commented Sep 2, 2025

Sorry for large blob, but it was hard to understand causes and best fixes for all dead-locking or use of garbage values cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant