Skip to content

Commit

Permalink
docs: Add ADR for polling loop (#41)
Browse files Browse the repository at this point in the history
Remaining work on #31
  • Loading branch information
timmc-edx authored Sep 7, 2022
1 parent 2807fe6 commit 7dad050
Show file tree
Hide file tree
Showing 2 changed files with 34 additions and 4 deletions.
32 changes: 32 additions & 0 deletions docs/decisions/0007-producer-polling.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
7. Polling the producer
#######################

Status
******

Accepted

Context
*******

Polling the producer is required in order to trigger delivery callbacks that are configured by the application. These callbacks are not triggered until the event is sent, ack'd, and finally polled. Even if callbacks are not configured, it may be necessary to call ``poll`` in order to burn down the delivery report queue.

The `recommendation`_ in the librdkafka docs is to call ``poll(0)`` before or after any event production, which is a parasitic approach ensuring that we're triggering callbacks at least as often as we're producing events. The events getting callbacks will _not_ be the ones just produced, since they won't have been acknowledged yet. However, if events are infrequent, it doesn't ensure that callbacks happen in a timely fashion, and the last event emitted before shutdown would never get a delivery callback.

The documentation offers a second strategy of just running a polling loop in a separate thread. This could be in addition to or instead of the parasitic polling. We don't have any important delivery callbacks configured at the moment, and it's not clear whether we ever will, or whether they'll be delay-sensitive—so it wasn't clear whether we should bother setting up a polling loop.

.. _recommendation: https://github.com/edenhill/librdkafka/wiki/FAQ#when-and-how-should-i-call-rd_kafka_poll

Decision
********

Even though it's not clear whether we'll need it, we'll run a separate thread with a polling loop, just since it's not that much additional code and we'll _probably_ want it.

We'll continue using the parasitic polling as well since it has the advantage of automatically adjusting the polling rate to the production rate without having to fine-tune the looping interval.

This is implemented in version 0.6.1.

Consequences
************

We have some code that we're not sure is needed, but at least shouldn't hurt.
6 changes: 2 additions & 4 deletions edx_event_bus_kafka/internal/producer.py
Original file line number Diff line number Diff line change
Expand Up @@ -225,10 +225,8 @@ def poll_indefinitely(api_weakref: EventProducerKafka):
The thread stops automatically once the producer is garbage-collected.
This ensures that callbacks are triggered in a timely fashion, rather than waiting
for the poll() call that we make before or after each produce() call. This may be
important if events are produced infrequently, and it allows the last event the
server emits before shutdown to have its callback run (if it happens soon enough.)
See ADR for more information:
https://github.com/openedx/event-bus-kafka/blob/main/docs/decisions/0007-producer-polling.rst
"""
# The reason we hold a weakref to the whole EventProducerKafka and
# not directly to the Producer itself is that you just can't make
Expand Down

0 comments on commit 7dad050

Please sign in to comment.