feat(forwarder): add SQS support as DLQ backend for failed events #1063
Merged
LorisFriedel merged 6 commits into master on Feb 17, 2026
Add AWS SQS as an alternative to S3 for storing failed events during retry. Users can set DD_SQS_QUEUE_URL to use a pre-existing SQS queue instead of S3.

The implementation uses a pluggable storage backend pattern with BaseStorage ABC, S3Storage (existing, renamed), and new SQSStorage. A factory function selects the backend based on config.

SQS design: single queue with MessageAttributes for prefix separation, data chunked to fit 256KB limit, bounded polling (10 iterations), idempotent deletion via ReceiptHandle, non-matching messages released immediately via ChangeMessageVisibility.

CloudFormation template updated with DdSqsQueueUrl parameter, scoped IAM permissions (ARN derived from queue URL), and HasStorageBackend condition for DD_STORE_FAILED_EVENTS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
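For readers following the SQS design, here is a minimal sketch of bounded polling with immediate release of non-matching messages. It is not the code from this PR: the function name `fetch_matching`, matching on the `retry_prefix` attribute alone, and the batch size are assumptions. The `sqs` argument is expected to behave like a boto3 SQS client (`receive_message`, `change_message_visibility`).

```python
from typing import Any, Dict

MAX_POLL_ITERATIONS = 10  # bound taken from the commit message above


def fetch_matching(sqs: Any, queue_url: str, prefix: str) -> Dict[str, str]:
    """Return {ReceiptHandle: Body} for messages whose retry_prefix matches."""
    matched: Dict[str, str] = {}
    for _ in range(MAX_POLL_ITERATIONS):
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            MessageAttributeNames=["All"],
            WaitTimeSeconds=0,
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # nothing visible right now: stop before the bound
        for message in messages:
            attrs = message.get("MessageAttributes", {})
            value = attrs.get("retry_prefix", {}).get("StringValue")
            if value == prefix:
                # ReceiptHandle is kept as the key so it can be used for deletion.
                matched[message["ReceiptHandle"]] = message["Body"]
            else:
                # Make the message visible again right away for other consumers.
                sqs.change_message_visibility(
                    QueueUrl=queue_url,
                    ReceiptHandle=message["ReceiptHandle"],
                    VisibilityTimeout=0,
                )
    return matched
```

The iteration bound keeps a single invocation from looping indefinitely on a busy queue, at the cost of possibly leaving matching messages for a later run.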
The create_storage() factory raised ValueError when neither DD_SQS_QUEUE_URL nor DD_S3_BUCKET_NAME was configured. This would crash existing deployments where retry is disabled (the default) but no storage backend is set. The old Storage() class tolerated this because S3 API errors only surfaced when storage was actually used.

Fix: always fall back to S3Storage when SQS is not configured, matching the original behavior.

Also add a logger.warning when a single item exceeds the SQS 240KB chunk limit, to aid debugging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
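The chunking and the oversize warning can be illustrated roughly as follows. This is a sketch under assumptions, not the PR's implementation: `chunk_items`, the JSON-array encoding, and the choice to still emit an oversize item as its own chunk are all hypothetical. Only the 240KB chunk limit and the warning itself come from the commit message.

```python
import json
import logging
from typing import Any, List

logger = logging.getLogger(__name__)

CHUNK_LIMIT_BYTES = 240 * 1024  # headroom under the 256KB SQS message limit


def chunk_items(items: List[Any], limit: int = CHUNK_LIMIT_BYTES) -> List[str]:
    """Group items into JSON-array strings of at most `limit` bytes each."""
    chunks: List[str] = []
    current: List[str] = []
    current_size = 2  # the enclosing "[" and "]"
    for item in items:
        encoded = json.dumps(item)
        size = len(encoded.encode("utf-8"))
        if size + 2 > limit:
            # Assumption: the item is still emitted alone; the warning aids debugging.
            logger.warning("Single item of %d bytes exceeds the %d byte chunk limit", size, limit)
        # The extra byte is the comma separating this item from the previous one.
        extra = size + (1 if current else 0)
        if current and current_size + extra > limit:
            chunks.append("[" + ",".join(current) + "]")
            current, current_size = [], 2
            extra = size
        current.append(encoded)
        current_size += extra
    if current:
        chunks.append("[" + ",".join(current) + "]")
    return chunks
```

Each returned string is valid JSON on its own, so a consumer can decode chunks independently and concatenate the results.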
Reduce nesting, extract helper, and fix f-string formatting for CI lint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed from 751deb8 to b5f37f1
…sage Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ViBiOh approved these changes on Feb 17, 2026
ge0Aja reviewed on Feb 17, 2026
…lias

The alias is no longer needed since all production code now uses the factory method and base class pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
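A rough sketch of the factory and base class pattern referred to here, for orientation only. The class names, `create_storage()`, `get_data()`, and the environment variable names come from this PR. The other method names, the `env` parameter, and the in-memory bodies are assumptions made so the example runs without AWS.

```python
import os
from abc import ABC, abstractmethod
from typing import Dict, Mapping, Optional


class BaseStorage(ABC):
    """Shared interface; method names other than get_data are assumed."""

    @abstractmethod
    def store_data(self, prefix: str, data: str) -> None: ...

    @abstractmethod
    def get_data(self, prefix: str) -> Dict[str, str]: ...

    @abstractmethod
    def delete_data(self, key: str) -> None: ...


class _InMemoryStorage(BaseStorage):
    """Stand-in body so the sketch runs locally; the real backends call AWS."""

    def __init__(self, target: str) -> None:
        self.target = target  # bucket name or queue URL
        self._items: Dict[str, str] = {}
        self._counter = 0

    def store_data(self, prefix: str, data: str) -> None:
        self._counter += 1
        self._items[f"{prefix}/{self._counter}"] = data

    def get_data(self, prefix: str) -> Dict[str, str]:
        return {k: v for k, v in self._items.items() if k.startswith(prefix + "/")}

    def delete_data(self, key: str) -> None:
        self._items.pop(key, None)  # deleting twice is a no-op


class S3Storage(_InMemoryStorage):
    """Placeholder for the S3-backed implementation."""


class SQSStorage(_InMemoryStorage):
    """Placeholder for the SQS-backed implementation."""


def create_storage(env: Optional[Mapping[str, str]] = None) -> BaseStorage:
    """Pick SQS when a queue URL is set, otherwise fall back to S3."""
    env = os.environ if env is None else env
    queue_url = env.get("DD_SQS_QUEUE_URL", "")
    if queue_url:
        return SQSStorage(queue_url)
    # No error when the bucket is unset either: deployments with retry
    # disabled never touch the storage object.
    return S3Storage(env.get("DD_S3_BUCKET_NAME", ""))
```

Callers depend only on `BaseStorage`, which is why a backward-compatibility alias for the old class name is no longer required once everything goes through the factory.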
Summary
- Set DD_SQS_QUEUE_URL to point to a pre-existing SQS queue; when unset, existing S3 behavior is unchanged
- Pluggable storage backend: BaseStorage ABC with S3Storage (renamed from Storage) and new SQSStorage
- Factory create_storage() selects backend based on configuration
- CloudFormation: DdSqsQueueUrl parameter, scoped IAM permissions, and HasStorageBackend condition

SQS design decisions
- Single queue with MessageAttributes for prefix separation (retry_prefix, function_prefix)
- get_data() polls up to 10 iterations, releases non-matching messages immediately
- ReceiptHandle used as key for deletion (idempotent)
- IAM resource ARN derived from the queue URL (!Split/!Select)

Operational note
When using SQS, the queue's VisibilityTimeout should be >= the Lambda function's Timeout (default 120s) to prevent duplicate processing during retries.
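One way to check this before deploying is to read the queue attribute and compare it with the function timeout. The helper below is a hypothetical sketch, not part of the forwarder. It assumes `sqs` behaves like a boto3 SQS client, whose `get_queue_attributes` returns attribute values as strings.

```python
from typing import Any


def visibility_timeout_ok(sqs: Any, queue_url: str, lambda_timeout_s: int = 120) -> bool:
    """Return True when the queue's VisibilityTimeout covers the Lambda timeout."""
    response = sqs.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["VisibilityTimeout"],
    )
    visibility_s = int(response["Attributes"]["VisibilityTimeout"])
    return visibility_s >= lambda_timeout_s
```

With a real client this would be called as `visibility_timeout_ok(boto3.client("sqs"), queue_url)`, passing the function's configured timeout if it differs from the 120s default.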
OBSPLTF-947