allow fifo sqs listeners to keep processing messages in case of an error on a single message #1415

@davide-imbriaco

Type: Feature

Is your feature request related to a problem? Please describe.
We have a simple @SqsListener-annotated method that processes messages from a FIFO SQS queue. The configuration is all default, so it processes batches of 10 messages at a time. For this kind of message we have a visibility timeout of 15 minutes configured on the FIFO queue on AWS.
When the queue is working at full capacity (for example with 100+ messages visible in the queue), if an error occurs while processing one message, this mechanism kicks in:

  1. the failed message is acknowledged and deleted on AWS
  2. the next messages in the same batch are discarded (NOT acknowledged and NOT deleted on AWS)
  3. due to how the visibility timeout works on SQS FIFO queues, processing halts and has to wait for the discarded messages to time out on AWS before they become visible again and can be processed (15 minutes in our use case).

The consequence is that when a sequence contains many invalid messages (ones that cause processing errors), processing is far slower than it should be: each failure that leaves messages behind in its batch can stall the queue for a full visibility timeout. This could be a regression affecting the latest spring aws sqs releases.
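
To put a rough number on the slowdown, here is a minimal simulation of the three steps above. It is only a sketch under simplifying assumptions: a single message group, negligible processing time, and exactly the behavior described in this issue. The class and method names (FifoBatchSimulation, idleTime, withFailuresAt) are made up for illustration and are not part of any library.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.List;

final class FifoBatchSimulation {

    // Builds a list of the given size where `true` marks a message whose processing fails.
    static List<Boolean> withFailuresAt(int size, int... failingIndexes) {
        Boolean[] fails = new Boolean[size];
        Arrays.fill(fails, Boolean.FALSE);
        for (int index : failingIndexes) {
            fails[index] = Boolean.TRUE;
        }
        return Arrays.asList(fails);
    }

    // Total time spent waiting for discarded messages to become visible again.
    // Assumption: one message group, so a blocked batch blocks everything behind it.
    static Duration idleTime(List<Boolean> fails, int batchSize, Duration visibilityTimeout) {
        Duration idle = Duration.ZERO;
        int next = 0;
        while (next < fails.size()) {
            int end = Math.min(next + batchSize, fails.size());
            int i = next;
            while (i < end && !fails.get(i)) {
                i++; // processed successfully and acknowledged
            }
            if (i < end) {
                // Step 1: the failed message at index i is acknowledged and deleted.
                // Steps 2 and 3: any messages after it in the batch stay in flight,
                // so the queue stalls for one visibility timeout.
                if (i + 1 < end) {
                    idle = idle.plus(visibilityTimeout);
                }
                next = i + 1;
            } else {
                next = end;
            }
        }
        return idle;
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofMinutes(15);
        List<Boolean> messages = withFailuresAt(30, 0, 5, 12);
        System.out.println("batch size 10: idle " + idleTime(messages, 10, timeout));
        System.out.println("batch size 1:  idle " + idleTime(messages, 1, timeout));
    }
}
```

In this model every failure that is not the last message of its batch costs one full visibility timeout, while a batch size of 1 never stalls.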

Describe the solution you'd like
The current behavior, as per the docs, is this:

If processing fails for a message, the following messages from the same message group are discarded so they will be served again after their message visibility expires.

This is bad for our use case. We would like the remaining messages to be either processed right away or returned to the queue for immediate processing (I think this can be done by setting their visibility timeout to zero with a ChangeMessageVisibility call). This could be an optional configuration on @SqsListener, something like @SqsListener( onErrorBatchProcessingStrategy = DISCARD_REMAINING_MESSAGES/PROCESS_REMAINING_MESSAGES ), defaulting to discard to keep backwards compatibility.
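
To make the request more concrete, here is a rough, library-independent sketch of how such a strategy could behave. Everything in it is hypothetical: QueueOps stands in for the real SQS calls (makeVisibleNow would map to something like ChangeMessageVisibility with a timeout of 0), RETURN_REMAINING_MESSAGES is an extra name I made up for the "return to the queue" variant, and none of this reflects how the library is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class RemainingMessagesSketch {

    // The first two names come from the proposal above; the third is hypothetical.
    enum Strategy { DISCARD_REMAINING_MESSAGES, PROCESS_REMAINING_MESSAGES, RETURN_REMAINING_MESSAGES }

    // Hypothetical stand-in for the queue operations, not a real SQS API.
    interface QueueOps {
        void delete(String message);         // acknowledge and delete
        void makeVisibleNow(String message); // e.g. visibility timeout set to 0
    }

    // Simple in-memory implementation that records what would be sent to the queue.
    static final class RecordingQueue implements QueueOps {
        final List<String> deleted = new ArrayList<>();
        final List<String> returned = new ArrayList<>();
        @Override public void delete(String message) { deleted.add(message); }
        @Override public void makeVisibleNow(String message) { returned.add(message); }
    }

    static void handleBatch(List<String> batch, Consumer<String> listener,
                            QueueOps queue, Strategy strategy) {
        for (int i = 0; i < batch.size(); i++) {
            String message = batch.get(i);
            boolean failed = false;
            try {
                listener.accept(message);
            } catch (RuntimeException e) {
                failed = true;
            }
            // As observed in this issue, the failed message is deleted as well.
            queue.delete(message);
            if (!failed || strategy == Strategy.PROCESS_REMAINING_MESSAGES) {
                continue; // keep going with the next message in the batch
            }
            if (strategy == Strategy.RETURN_REMAINING_MESSAGES) {
                // Hand the rest back so they can be received again immediately.
                for (String remaining : batch.subList(i + 1, batch.size())) {
                    queue.makeVisibleNow(remaining);
                }
            }
            // DISCARD_REMAINING_MESSAGES: leave the rest untouched until their timeout expires.
            return;
        }
    }
}
```

With DISCARD_REMAINING_MESSAGES the messages after the failure are neither deleted nor returned, which is the stall described above; the other two modes avoid the wait either by continuing in place or by handing the messages back.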

Describe alternatives you've considered
When using a batch size of 1 the problem does not occur, but performance would be worse.
