fix: batch recovery to avoid 16MB read limit #174
Recovery was sending all stale running jobs to a single `recover` mutation, which could exceed Convex's 16MB read limit when many jobs needed recovery at once (e.g. high maxParallelism + server restart). Batch into chunks of 50, matching the pattern used for cancellations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
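The chunking described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `chunk`, `scheduleRecovery`, and the `schedule` callback are hypothetical names standing in for however the component schedules each `recover` mutation, and the size of 50 is taken from the commit message.

```typescript
// Chunk size quoted in the commit message above.
const RECOVERY_CHUNK_SIZE = 50;

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical stand-in for scheduling one `recover` call per chunk,
// so that no single call has to read every stale job at once.
function scheduleRecovery(
  jobIds: readonly string[],
  schedule: (batch: string[]) => void,
): number {
  const batches = chunk(jobIds, RECOVERY_CHUNK_SIZE);
  for (const batch of batches) {
    schedule(batch);
  }
  return batches.length;
}
```

With 120 stale jobs, this would issue three calls carrying 50, 50, and 20 job ids.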
The initial batch fix only batched the scheduled `recover` call, but `handleRecovery` inside `main` was still reading work docs for every old running job unbounded. Now it processes at most RECOVERY_BATCH_SIZE candidates per iteration and signals `main` to re-run recovery immediately if more remain. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
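One way to picture the "process a bounded batch, then signal for a re-run" behavior is the sketch below. It is an assumption-laden illustration: `recoverBatch`, `RecoveryResult`, and `drain` are hypothetical names, the batch size of 10 comes from the PR summary, and the real `handleRecovery` reads work documents rather than slicing an in-memory array.

```typescript
// Batch size quoted in this PR's summary; treated here as a plain constant.
const RECOVERY_BATCH_SIZE = 10;

interface RecoveryResult<T> {
  processed: T[]; // candidates handled in this iteration
  hasMore: boolean; // true when the caller should run recovery again
}

// Hypothetical helper: handle at most `limit` candidates and report leftovers.
function recoverBatch<T>(
  candidates: readonly T[],
  limit: number = RECOVERY_BATCH_SIZE,
): RecoveryResult<T> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  return {
    processed: candidates.slice(0, limit),
    hasMore: candidates.length > limit,
  };
}

// Simulates the main loop re-running recovery until nothing remains.
// Returns the number of iterations that were needed.
function drain<T>(
  candidates: readonly T[],
  limit: number = RECOVERY_BATCH_SIZE,
): number {
  let remaining = [...candidates];
  let iterations = 0;
  while (true) {
    const { processed, hasMore } = recoverBatch(remaining, limit);
    remaining = remaining.slice(processed.length);
    iterations += 1;
    if (!hasMore) return iterations;
  }
}
```

Each iteration touches a bounded number of candidates, so the per-mutation read volume no longer grows with the total backlog.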
This reverts commit 96380fe.
Work documents can store arbitrarily large fnArgs, so use a conservative batch size to stay well under the 16MB read limit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
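The reasoning behind a conservative batch size can be made concrete with a worst-case estimate. The sketch below assumes, purely for illustration, that a single work document can be as large as 1 MiB; that cap is an assumption and is not stated in this PR. The 16MB read limit is the figure quoted above, interpreted here as 16 MiB.

```typescript
const MIB = 1024 * 1024;

// Read limit quoted in the commit message (interpreted as 16 MiB here).
const READ_LIMIT_BYTES = 16 * MIB;

// ASSUMPTION for illustration only: largest size a single work document may reach.
const ASSUMED_MAX_DOC_BYTES = 1 * MIB;

// Worst-case bytes read when every document in a batch is as large as allowed.
function worstCaseReadBytes(batchSize: number, maxDocBytes: number): number {
  return batchSize * maxDocBytes;
}

// True when a batch of `batchSize` worst-case documents stays within the limit.
function fitsReadLimit(
  batchSize: number,
  maxDocBytes: number = ASSUMED_MAX_DOC_BYTES,
  limitBytes: number = READ_LIMIT_BYTES,
): boolean {
  return worstCaseReadBytes(batchSize, maxDocBytes) <= limitBytes;
}
```

Under these assumptions a batch of 50 could read up to 50 MiB in the worst case, while a batch of 10 stays at 10 MiB, leaving headroom for the other reads in the same mutation.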
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
handleRecovery was reading work docs for every old running job unbounded inside the main loop mutation. Now it processes at most RECOVERY_BATCH_SIZE candidates per iteration and signals main to re-run recovery immediately if more remain. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Processing only a batch per cycle is sufficient — recovery is a health check, not an emergency. Remaining old jobs will be picked up in subsequent recovery cycles. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- Recovery was sending all stale running jobs to a single `recover` mutation, which could exceed Convex's 16MB read limit when many jobs needed recovery simultaneously (e.g. high `maxParallelism` + server restart)
- Batch with `RECOVERY_BATCH_SIZE = 10` in both the scheduled `recover` call and the `handleRecovery` reads inside `main`

Test plan
- Verify on a high-`maxParallelism` deployment that recovery no longer hits the 16MB limit

🤖 Generated with Claude Code