Monitoring message queue performance with Sentry #69320
-
Exciting! Been tracing manually with |
-
Trying to understand the distinction between this and your mockups. Does this mean we can see published vs. processed in any time period, but not the total count in the queue at any one time? Is there a meaningful distinction between total and published - processed? Trying to think if that would matter to us; I don't think it does!
Re: infrastructure/libraries. We use Azure Service Bus. Library-wise, we use .NET and
We use both Queues and Topics and subscriptions, but the latter a lot more: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-queues-topics-subscriptions#topics-and-subscriptions
Would you say your area of focus is more about Queues, or is your definition of a Queue loose enough to encapsulate both?
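To make the "total vs. published - processed" question concrete, here is a minimal sketch in plain Python. It does not use any Sentry API; the `WindowCounts` type, the function names, and the sample numbers are all hypothetical. It illustrates that per-window published and processed counts give the net change in backlog, while the absolute queue depth additionally requires a known starting depth.

```python
from dataclasses import dataclass


@dataclass
class WindowCounts:
    """Published and processed message counts for one time window (hypothetical shape)."""
    published: int
    processed: int


def net_backlog_change(windows):
    """Net change in backlog across all windows: sum of (published - processed)."""
    return sum(w.published - w.processed for w in windows)


def estimated_depth(starting_depth, windows):
    """Queue depth after each window, assuming the starting depth is known."""
    depths = []
    depth = starting_depth
    for w in windows:
        depth += w.published - w.processed
        depths.append(depth)
    return depths


# Made-up sample data: three consecutive windows.
windows = [WindowCounts(120, 100), WindowCounts(80, 95), WindowCounts(60, 60)]

print(net_backlog_change(windows))    # 5
print(estimated_depth(0, windows))    # [20, 5, 5]
print(estimated_depth(500, windows))  # [520, 505, 505]
```

The same counts produce very different depths depending on what was already in the queue, so the two numbers only coincide when the queue started empty. The sketch also ignores messages that expire or are dead-lettered, and assumes the counts are not affected by trace sampling.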
-
Was pleasantly surprised to see this feature just as we started incorporating a tiny number of background tasks via queues into one of our systems. We're using the auto-instrumentation for Celery, but have a slight edge case that I'm trying to work through.
We've got a new Celery worker that will be the consumer, and it is set with a Sentry trace sampling rate of 1 (100%); but our producers (web servers) are sampling at 0.02 (2%), which I think is leading to a mismatch in Sentry's produced vs. consumed stats. We want to effectively trace all messages/queued tasks, but not trace all HTTP requests fully. Is there a way to do this currently? I suppose I was hoping for some kind of SENTRY_QUEUES_SAMPLE_RATE equivalent of the SENTRY_TRACES_SAMPLE_RATE, but I don't think that exists yet?
We're dealing with really low volumes, and it looks like Sentry is on the cusp of being able to give us "good enough" queue monitoring without having to do any hard work on bespoke monitoring. But that's maybe only true if we can track producing and consuming at the same level (ideally 100% for our volumes).
It would also be extremely valuable to us if we could configure alerts for when messages get stuck in the queue (or effectively, when production exceeds consumption over some time period). I don't think I can see anything in Sentry that would let us set up this kind of alert yet? Would love to be told I'm wrong though.
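One possible way to approximate this, sketched under assumptions rather than offered as a confirmed answer: the Sentry Python SDK accepts a `traces_sampler` callback in place of a fixed sample rate, so the web servers could sample at 100% only the requests known to enqueue tasks. The path list below is hypothetical, and the sketch assumes a WSGI app, where the sampling context carries a `wsgi_environ` entry.

```python
# Hypothetical: URL path prefixes whose handlers enqueue Celery tasks.
PRODUCER_PATHS = ("/orders/submit", "/reports/export")

# Rate for all other HTTP requests (the 2% mentioned above).
DEFAULT_RATE = 0.02


def traces_sampler(sampling_context):
    """Return the sample rate for a transaction that is about to start."""
    # If an upstream service already made a sampling decision, follow it
    # so that traces are not cut off partway through.
    parent_sampled = sampling_context.get("parent_sampled")
    if parent_sampled is not None:
        return float(parent_sampled)

    # Assumption: WSGI integration, so the request environ is available.
    environ = sampling_context.get("wsgi_environ") or {}
    path = environ.get("PATH_INFO", "")

    # Always trace requests that are known to publish to a queue.
    if path.startswith(PRODUCER_PATHS):
        return 1.0
    return DEFAULT_RATE


# Wiring it up on the web servers (left as a comment so the sketch runs
# without the SDK installed):
# sentry_sdk.init(dsn="...", traces_sampler=traces_sampler)
```

This only helps when the producing requests can be identified from the request itself, for example by path. Tasks enqueued from arbitrary code paths would still be sampled at the default rate.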
-
Messaging systems, such as Celery or SQS, are a useful abstraction. However, they can be challenging to reason about:
We’re working on a feature at Sentry that makes your application’s interactions with message queues easier to reason about…
Queue performance insights
These are some of the metrics we felt would be valuable for identifying queue performance problems:
Request for feedback
If you build applications that use messaging systems, we’d love your feedback on this feature.
To get the ball rolling 🏀 …
Mockups
Queue overview page
This is the first page that you’ll see when you click on Queue insights. In this view, you will see a high-level summary of queue performance across all the queues and topics used by your system (depending on the messaging system you’re using, each entry may be a topic or a queue name).
Destination summary
This page provides similar metrics to the Queue overview page, but broken down by Destination. The Queue overview page helps you see the behaviour of your queues from 30,000 feet; the Destination summary page allows you to better understand specific queue performance issues.
Transaction Overlay
Producer overlay
Consumer overlay
The Transaction Overlay, along with showing transaction-level metrics, displays a sample of spans representing messages that were recently either written to the queue or processed from the queue.
For both consumers and producers, you can dig deeper by clicking the Span ID and viewing the trace tied to the queue operation.
Looking forward to people’s feedback in this discussion,
— @bcoe