OperationInterruptedException or TaskCanceledException at startup #1827

Open
mantasaudickas opened this issue Apr 30, 2025 · 5 comments
@mantasaudickas

Describe the bug

I have an application running under IIS. It's one of many in our microservices architecture, but this one seems to be having issues at startup when consumers are not created.
So far I was able to identify 2 exceptions which are happening somewhere inside RabbitMQ.Client (or in whatever place).
What my application does at startup:

  1. creates a connection to server
  2. gets a list of consumers it needs to create
  3. goes through the list (one by one)
  4. gets a channel from pool (always from the same connection)
  5. checks exchange, queue and bindings (using ExchangeDeclareAsync, QueueDeclareAsync, QueueBindAsync)
  6. starts consumer
  7. repeats from 4. with next consumer until all are configured

However, one service gets a TaskCanceledException at random places in the code: sometimes inside ExchangeDeclareAsync, sometimes inside QueueDeclareAsync, sometimes inside QosAsync, basically in any of the async methods that are part of RabbitMQ.Client.
I double checked - the cancellation token which I pass to these methods remains uncancelled.

Here is example stack trace:

System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at RabbitMQ.Client.Impl.Channel.QueueDeclareAsync(String queue, Boolean durable, Boolean exclusive, Boolean autoDelete, IDictionary`2 arguments, Boolean passive, Boolean noWait, CancellationToken cancellationToken)
   at RabbitMQ.Client.Impl.AutorecoveringChannel.QueueDeclareAsync(String queue, Boolean durable, Boolean exclusive, Boolean autoDelete, IDictionary`2 arguments, Boolean passive, Boolean noWait, CancellationToken cancellationToken)
   at ESCID.ESP.Messaging.RabbitMQ.Server.Hosting.RabbitMqServerManager.TryDeclareQueueAsync(RabbitMqChannelPool channels, ILogger logger, String queueName, Boolean autoDelete, Dictionary`2 arguments, CancellationToken cancellationToken)

In rarer situations I get a different exception:

RabbitMQ.Client.Exceptions.OperationInterruptedException: The AMQP operation was interrupted: AMQP close-reason, initiated by Library, code=541, text='An attempt was made to transition a task to a final state when it had already completed.', classId=0, methodId=0, exception=System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.
   at System.Threading.Tasks.TaskCompletionSource`1.SetResult(TResult result)
   at RabbitMQ.Client.Impl.SimpleAsyncRpcContinuation.DoHandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.AsyncRpcContinuation`1.HandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.Channel.HandleCommandAsync(IncomingCommand cmd, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ProcessFrameAsync(InboundFrame frame, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ReceiveLoopAsync(CancellationToken mainLoopCancellationToken)
   at RabbitMQ.Client.Framing.Connection.MainLoop()
 ---> System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.
   at System.Threading.Tasks.TaskCompletionSource`1.SetResult(TResult result)
   at RabbitMQ.Client.Impl.SimpleAsyncRpcContinuation.DoHandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.AsyncRpcContinuation`1.HandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.Channel.HandleCommandAsync(IncomingCommand cmd, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ProcessFrameAsync(InboundFrame frame, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ReceiveLoopAsync(CancellationToken mainLoopCancellationToken)
   at RabbitMQ.Client.Framing.Connection.MainLoop()
   --- End of inner exception stack trace ---
   at RabbitMQ.Client.Impl.Channel.OpenAsync(CreateChannelOptions createChannelOptions, CancellationToken cancellationToken)
   at RabbitMQ.Client.Impl.RecoveryAwareChannel.CreateAndOpenAsync(ISession session, CreateChannelOptions createChannelOptions, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.AutorecoveringConnection.CreateChannelAsync(CreateChannelOptions createChannelOptions, CancellationToken cancellationToken)
   at ESCID.ESP.Messaging.RabbitMQ.Connections.Channels.RabbitMqChannelPool.Get(String channelOwner, CancellationToken cancellationToken)

The last one looks like a multithreading issue. However, I double-checked: every call here is awaited, so the whole code path executes sequentially and each step completes before the next one starts.

The behavior can be replicated quite reliably during deployment to the server, but I was not able to reproduce it on my local machine or on other developers' machines. The machine we deploy to is also one of the slower ones, which might suggest a timing or thread-waiting issue.

However maybe that OperationInterruptedException can get you some clues - how this can happen and how it can be fixed?

Channels are not reused between threads in the app. There are around 32 consumers created (and also about 32 channels, one for each consumer). I also made sure that the application is not stopping just after start, so something else is doing the cancelling.

Reproduction steps

Not able to reproduce on any other machine except one server.

Expected behavior

Exception should not happen.

Additional context

No response

@lukebakken lukebakken removed the bug label Apr 30, 2025
@lukebakken lukebakken self-assigned this Apr 30, 2025
@lukebakken lukebakken added this to the 7.1.3 milestone Apr 30, 2025
@lukebakken
Collaborator

Some questions:

  • What version of this library are you using?
  • What is logged by RabbitMQ when these exceptions happen, if anything?
RabbitMQ.Client.Exceptions.OperationInterruptedException: The AMQP operation was interrupted: AMQP close-reason, initiated by Library, code=541, text='An attempt was made to transition a task to a final state when it had already completed.', classId=0, methodId=0, exception=System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.

The above exception could happen if this library receives the result of an RPC operation from RabbitMQ after the Task that is waiting for that result times out (continuation timeout). Though, I wouldn't expect to see this when opening a channel.
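The race described above can be reproduced with nothing but `TaskCompletionSource`. This is a minimal sketch, not the library's code: `LateReplyDemo` and the `"queue.declare-ok"` string are hypothetical stand-ins for a pending RPC and its reply. It only shows that `SetResult` throws `InvalidOperationException` once the task has already been completed by a timeout, while `TrySetResult` reports the same situation through its return value.

```csharp
using System;
using System.Threading.Tasks;

public static class LateReplyDemo
{
    // Simulates a pending RPC whose timeout fires before the reply arrives.
    // Returns true if the late SetResult call threw InvalidOperationException.
    public static bool LateSetResultThrows()
    {
        var pending = new TaskCompletionSource<string>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        // 1. The timeout path wins and completes the task as canceled.
        pending.TrySetCanceled();

        // 2. The reply arrives afterwards. SetResult requires a task
        //    that has not completed yet, so it throws here.
        try
        {
            pending.SetResult("queue.declare-ok");
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Same sequence, but TrySetResult signals the lost race by returning false.
    public static bool LateTrySetResultSucceeds()
    {
        var pending = new TaskCompletionSource<string>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        pending.TrySetCanceled();
        return pending.TrySetResult("queue.declare-ok");
    }
}
```

Whether the library should tolerate a late reply this way is a separate question; the sketch only illustrates where the exception text in the report comes from.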

The other TaskCanceledException instances are probably due to the Task timing out before receiving a response from RabbitMQ.

the cancellation token which I pass to these methods remains uncancelled

You should check the status of the CancellationToken property of the exception. My guess is that it will be cancelled.

When you call a method like QueueDeclareAsync, the passed-in cancellation token is combined with another token that is used to time out the method if a response from RabbitMQ is not received in time. My guess is that it is the latter token that is cancelling.
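A minimal sketch of that token combination, using only the base class library. `LinkedTokenDemo`, `WaitForReplyAsync`, and the 50 ms timeout are hypothetical; the library's internals may be structured differently. The point is that the exception carries the linked token, so it reports cancellation even though the caller's own token was never cancelled.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LinkedTokenDemo
{
    // Waits for a reply that never comes. Gives up after `timeout` or when
    // the caller's token is cancelled, whichever happens first.
    public static async Task WaitForReplyAsync(TimeSpan timeout, CancellationToken callerToken)
    {
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(callerToken);
        linked.CancelAfter(timeout); // the internal timeout
        await Task.Delay(Timeout.InfiniteTimeSpan, linked.Token);
    }

    // Lets the internal timeout fire, then reports the state of both tokens.
    public static async Task<(bool CallerCancelled, bool ExceptionTokenCancelled)> InspectAsync()
    {
        using var caller = new CancellationTokenSource(); // never cancelled
        try
        {
            await WaitForReplyAsync(TimeSpan.FromMilliseconds(50), caller.Token);
            throw new InvalidOperationException("unreachable: the wait cannot complete normally");
        }
        catch (OperationCanceledException ex)
        {
            // ex.CancellationToken is the linked token, not the caller's token.
            return (caller.IsCancellationRequested, ex.CancellationToken.IsCancellationRequested);
        }
    }
}
```

Awaiting `LinkedTokenDemo.InspectAsync()` yields `CallerCancelled == false` and `ExceptionTokenCancelled == true`, which matches a caller token that "remains uncancelled" alongside a `TaskCanceledException`.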

This does all seem to be the result of a very slow environment. The TaskCanceledException is not something I'm concerned about now. I will investigate the InvalidOperationException since that is probably a bug. I will re-add that label if it turns out to be.

@mantasaudickas
Author

mantasaudickas commented May 2, 2025

Sorry for the missing information:

  • RabbitMQ.Client - 7.1.2
  • logs: you mean the ones created by RabbitMQ.Client using Event Source? Sadly since it was made internal sealed - I do not log these anymore.

It does not look like a slow startup, though. It probably depends on the internal cancellation token timeout. How long is it?

You should check the status of the CancellationToken property of the exception. My guess is that it will be cancelled.

Yes, that one is cancelled, but it's not mine :D As you mentioned, it's probably the one created internally by RabbitMQ.Client.

@lukebakken
Collaborator

logs: you mean the ones created by RabbitMQ.Client using Event Source?

No, what does RabbitMQ log when these exceptions occur, if anything?

It does not look like a slow startup, though. It probably depends on the internal cancellation token timeout. How long is it?

It is set by ContinuationTimeout. The default value is 20 seconds, though, so if you have operations timing out you are probably overloading your RabbitMQ broker.
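For reference, a configuration fragment that raises this timeout on the connection factory. The host name and the 60-second value are placeholders chosen for illustration, not recommendations. A longer timeout only gives a slow broker more time to answer; it does not remove the underlying load.

```csharp
using System;
using RabbitMQ.Client;

var factory = new ConnectionFactory
{
    HostName = "localhost", // placeholder host
    // Example value only. The default mentioned above is 20 seconds.
    ContinuationTimeout = TimeSpan.FromSeconds(60),
};
```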

Sadly since it was made internal sealed - I do not log these anymore

The test suite uses this technique to output logs that originate via EventSource. Does this not work for you?
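A minimal sketch of the general `EventListener` approach, assuming only the base class library. `NamedEventSourceListener` is a hypothetical class, not the one from the test suite, and the name of the client's event source is deliberately left as a constructor argument: look it up in the library source rather than relying on a guess.

```csharp
using System;
using System.Diagnostics.Tracing;

// Forwards every event from one named EventSource to a callback.
public sealed class NamedEventSourceListener : EventListener
{
    private readonly string _sourceName;
    private readonly Action<string> _write;

    public NamedEventSourceListener(string sourceName, Action<string> write)
    {
        _sourceName = sourceName ?? throw new ArgumentNullException(nameof(sourceName));
        _write = write ?? throw new ArgumentNullException(nameof(write));

        // Sources that already exist were announced while the base constructor
        // ran, before the fields above were assigned, so enable them here.
        foreach (EventSource source in EventSource.GetSources())
        {
            EnableIfMatch(source);
        }
    }

    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        // Can be invoked from the base constructor, when _sourceName is still null.
        if (_sourceName is null) return;
        EnableIfMatch(eventSource);
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        string payload = eventData.Payload is null
            ? string.Empty
            : string.Join(", ", eventData.Payload);
        _write($"[{eventData.Level}] {eventData.EventName}: {payload}");
    }

    private void EnableIfMatch(EventSource source)
    {
        if (source.Name == _sourceName)
        {
            EnableEvents(source, EventLevel.Verbose, EventKeywords.All);
        }
    }
}
```

Creating one instance at startup, for example `new NamedEventSourceListener(sourceName, Console.WriteLine)`, and keeping it alive for the lifetime of the process is enough to start receiving events. No access to the `internal sealed` event source type is needed, because matching is done by name.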

@mantasaudickas
Author

It's a bit hard to find out what's on the RabbitMQ server, as we have a lot of clients.
Thanks for pointing out the ContinuationTimeout property and the event listener. I will try to use these to improve our code.

@lukebakken lukebakken modified the milestones: 7.1.3, 7.2.0 May 12, 2025
@stawr93

stawr93 commented Jun 6, 2025

@lukebakken, I think I got the same error:

RabbitMQ.Client.Exceptions.OperationInterruptedException: The AMQP operation was interrupted: AMQP close-reason, initiated by Library, code=541, text='An attempt was made to transition a task to a final state when it had already completed.', classId=0, methodId=0, exception=System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.
   at RabbitMQ.Client.Impl.SimpleAsyncRpcContinuation.DoHandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.AsyncRpcContinuation`1.HandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.Channel.HandleCommandAsync(IncomingCommand cmd, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ProcessFrameAsync(InboundFrame frame, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ReceiveLoopAsync(CancellationToken mainLoopCancellationToken)
   at RabbitMQ.Client.Framing.Connection.MainLoop()
 ---> System.InvalidOperationException: An attempt was made to transition a task to a final state when it had already completed.
   at RabbitMQ.Client.Impl.SimpleAsyncRpcContinuation.DoHandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.AsyncRpcContinuation`1.HandleCommandAsync(IncomingCommand cmd)
   at RabbitMQ.Client.Impl.Channel.HandleCommandAsync(IncomingCommand cmd, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ProcessFrameAsync(InboundFrame frame, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.Connection.ReceiveLoopAsync(CancellationToken mainLoopCancellationToken)
   at RabbitMQ.Client.Framing.Connection.MainLoop()
   --- End of inner exception stack trace ---
   at RabbitMQ.Client.Impl.Channel.OpenAsync(CreateChannelOptions createChannelOptions, CancellationToken cancellationToken)
   at RabbitMQ.Client.Impl.RecoveryAwareChannel.CreateAndOpenAsync(ISession session, CreateChannelOptions createChannelOptions, CancellationToken cancellationToken)
   at RabbitMQ.Client.Framing.AutorecoveringConnection.CreateChannelAsync(CreateChannelOptions createChannelOptions, CancellationToken cancellationToken)
<client application stacktrace goes here>

And here are the RMQ log messages from the time the exception occurred:

[Image: screenshot of RabbitMQ log messages, not preserved]

RabbitMQ.Client version is 7.1.2

If you need anything else, let me know and I'll try to provide as much information as you need. I hope this helps to fix the issue.


3 participants