@benoit-nexthop
Contributor
Thrift RPC Latency Bug

Summary

Thrift RPC calls experience multi-second latency (5-7 seconds) when they should complete in milliseconds. The root cause is that EventBase::runInLoop() adds callbacks to loopCallbacks_ without waking up the EventBase if it's blocked in eb_event_base_loop().

Environment

  • folly EventBase with libevent backend
  • fbthrift with Rocket protocol
  • WriteBatcher for batching writes

Reproducer

Here is a minimal test case, built in the FBOSS OSS tree, that demonstrates the issue:

/*
 * Minimal reproducer for Thrift RPC latency issue.
 * Expected: RPC call should complete in < 10ms
 * Observed: RPC call takes several seconds due to WriteBatcher callback delay
 */

#include <chrono>
#include <iostream>
#include <memory>
#include <thread>

#include <folly/init/Init.h>
#include <folly/io/async/AsyncSocket.h>
#include <folly/io/async/EventBase.h>
#include <thrift/lib/cpp2/async/RocketClientChannel.h>
#include <thrift/lib/cpp2/util/ScopedServerInterfaceThread.h>

#include "fboss/agent/if/gen-cpp2/FbossCtrl.h"

using namespace facebook::fboss;
using namespace apache::thrift;

// Simple handler that returns a fixed timestamp
class SimpleHandler : public FbossCtrlSvIf {
 public:
  void getConfigAppliedInfo(ConfigAppliedInfo& info) override {
    info.lastAppliedInMs() = 12345;
    info.lastColdbootAppliedInMs() = 0;
  }
};

int main(int argc, char** argv) {
  folly::init(&argc, &argv);

  std::cout << "Creating Thrift server..." << std::endl;

  // Use ScopedServerInterfaceThread which handles startup/shutdown properly
  auto handler = std::make_shared<SimpleHandler>();
  ScopedServerInterfaceThread server(handler, "::1", 0);

  auto port = server.getAddress().getPort();
  std::cout << "Server listening on port " << port << std::endl;

  // Create client using ScopedServerInterfaceThread's helper
  auto client = server.newClient<FbossCtrlAsyncClient>(
      nullptr, RocketClientChannel::newChannel);

  std::cout << "Making RPC call..." << std::endl;

  // Measure RPC latency
  auto start = std::chrono::steady_clock::now();

  ConfigAppliedInfo result;
  client->sync_getConfigAppliedInfo(result);

  auto end = std::chrono::steady_clock::now();
  auto duration =
      std::chrono::duration_cast<std::chrono::milliseconds>(end - start);

  std::cout << "RPC call completed in " << duration.count() << " ms"
            << std::endl;
  std::cout << "Result: lastAppliedInMs = " << *result.lastAppliedInMs()
            << std::endl;

  if (duration.count() > 100) {
    std::cout << "ERROR: RPC took too long! Expected < 100ms" << std::endl;
  } else {
    std::cout << "OK: RPC latency is acceptable" << std::endl;
  }

  return duration.count() > 100 ? 1 : 0;
}

Save this file to fboss/cli/fboss2/test/thrift_latency_test.cpp and add this at the end of cmake/CliFboss2Test.cmake:

add_executable(thrift_latency_test
  fboss/cli/fboss2/test/thrift_latency_test.cpp
)

target_link_libraries(thrift_latency_test
  ctrl_cpp2
  Folly::folly
  FBThrift::thriftcpp2
)

Before the fix, this test binary takes 5-7 seconds to execute; after the fix, it takes a few milliseconds.

Root Cause Analysis

The Problem

When a Thrift RPC response is ready to be sent, the WriteBatcher calls runInLoop(this, true) to schedule the write. The thisIteration=true parameter indicates the callback should run in the current loop iteration.

However, the callback is only added to runOnceCallbacks_ if we're currently inside runLoopCallbacks(). Otherwise, it's added to loopCallbacks_ for the next iteration.

The issue occurs when:

  1. The EventBase is inside eb_event_base_loop() (blocked in epoll_wait)
  2. A socket event triggers a callback that calls runInLoop()
  3. runOnceCallbacks_ is null (we're not in runLoopCallbacks())
  4. The callback is added to loopCallbacks_
  5. After the socket callback returns, eb_event_base_loop() continues blocking
  6. The callback sits in loopCallbacks_ until epoll_wait times out (~5-6 seconds typically)

Code Flow (Before Fix)

eb_event_base_loop(EVLOOP_ONCE)  <- blocking in epoll_wait
  |
  +-> socket event fires
  |     |
  |     +-> WriteBatcher::enqueueWrite()
  |           |
  |           +-> runInLoop(this, true)
  |                 |
  |                 +-> runOnceCallbacks_ == nullptr
  |                 +-> loopCallbacks_.push_back(callback)  <- callback queued
  |
  +-> epoll_wait continues blocking  <- NO WAKEUP!
  |
  ... 5-6 seconds pass ...
  |
  +-> epoll_wait times out
  +-> eb_event_base_loop returns
  |
runLoopCallbacks()  <- finally processes the callback

Instrumentation

I added extensive debug logging in folly and fbthrift to trace the issue:

EventBase.cpp

  • Log when entering eb_event_base_loop (blocking vs non-blocking)
  • Log when runInLoop adds to loopCallbacks_ vs runOnceCallbacks_
  • Log when runLoopCallbacks starts/ends

WriteBatcher-inl.h

  • Log when enqueueWrite calls runInLoop

Reproduction (Without Fix)

Debug Output

An excerpt of the output from the reproducer binary shared above, with the extra debug logging I added:

[WRITEBATCHER tid=X] enqueueWrite: calling runInLoop (immediate)
[EVENTBASE tid=X] runInLoop: adding to loopCallbacks_ (runOnceCallbacks_=(nil), thisIteration=1)
... 6 seconds of silence on this thread ...
[EVENTBASE tid=X] loopMain: eb_event_base_loop returned 0
[EVENTBASE tid=X] runLoopCallbacks: starting with 2 callbacks

(See below for captured strace output showing the 6-second epoll_wait)

The Fix

Add a call to eb_event_base_loopbreak() when adding to loopCallbacks_ while the loop is running:

void EventBase::runInLoop(LoopCallback* callback, bool thisIteration, ...) {
  // ... existing code ...
  if (runOnceCallbacks_ != nullptr && thisIteration) {
    runOnceCallbacks_->push_back(*callback);
  } else {
    loopCallbacks_.push_back(*callback);
    // NEW: Wake up the EventBase if it's blocked in eb_event_base_loop()
    if (isRunning()) {
      evb_->eb_event_base_loopbreak();
    }
  }
}

This causes eb_event_base_loop() to return immediately, allowing the loop to process the pending callbacks. The if/else sequence was duplicated in two places, so rather than applying the same fix twice, I factored it out into a private helper.

Results

Based on thrift_latency_test.cpp above, running on an AMD EPYC 9655P:

| Metric | Before Fix | After Fix |
| --- | --- | --- |
| RPC Latency | ~6500 ms | ~1 ms |
| Test Duration | ~6.5 seconds | ~6 ms |

Captured Traces

Before Fix (Issue Reproduction)

strace output

Key events for thread 1910218, the server IO thread:

# Thread 1910218 returns from a short 6ms epoll_wait with an event
[pid 1910218] 13:50:40.269315 epoll_wait(10, [...], 32, 6) = 1

# While processing that event, WriteBatcher adds callbacks to loopCallbacks_
[WRITEBATCHER tid=1910218] enqueueWrite: calling runInLoop (immediate)
[EVENTBASE tid=1910218] runInLoop: adding to loopCallbacks_ (runOnceCallbacks_=(nil), thisIteration=1)
[EVENTBASE tid=1910218] runInLoop: adding to loopCallbacks_ (runOnceCallbacks_=(nil), thisIteration=0)

# PROBLEM: libevent goes back into epoll_wait INSIDE eb_event_base_loop!
# It doesn't know about our loopCallbacks_ - only folly knows about those.
[pid 1910218] 13:50:40.270073 epoll_wait(10,  <unfinished ...>

# 5 seconds later, epoll_wait times out (timeout was 5042ms)
[pid 1910218] 13:50:45.317251 <... epoll_wait resumed>[], 32, 5042) = 0

# NOW eb_event_base_loop returns and we finally process the callbacks
[EVENTBASE tid=1910218] loopMain: eb_event_base_loop returned 0
[EVENTBASE tid=1910218] runLoopCallbacks: starting with 2 callbacks

Debug output

Making RPC call...
[WRITEBATCHER tid=1911408] enqueueWrite: calling runInLoop (immediate)
[EVENTBASE tid=1911408] runInLoop: adding to loopCallbacks_ (runOnceCallbacks_=(nil), thisIteration=1)
[WRITEBATCHER tid=1911408] enqueueWrite: calling runInLoop (immediate)
[EVENTBASE tid=1911408] runInLoop: adding to loopCallbacks_ (runOnceCallbacks_=(nil), thisIteration=1)
RPC call completed in 7062 ms
ERROR: RPC took too long! Expected < 100ms

Observations

  1. WriteBatcher calls runInLoop(this, true): The thisIteration=1 parameter indicates
    the callback should run immediately in the current loop iteration.

  2. Callback goes to loopCallbacks_ instead of runOnceCallbacks_: The log shows
    runOnceCallbacks_=(nil) because we're NOT inside runLoopCallbacks() at this point.
    We're inside a libevent callback (epoll event handler), so the callback is queued.

  3. libevent goes back into epoll_wait after processing the event: This is the bug.
    The flow is:

    • eb_event_base_loop(EVLOOP_ONCE) calls epoll_wait
    • An event fires, epoll_wait returns
    • libevent calls the event handler (which triggers WriteBatcher)
    • WriteBatcher adds callbacks to folly's loopCallbacks_
    • Event handler returns to libevent
    • libevent goes back into epoll_wait because it doesn't know about folly's loopCallbacks_!
    • The 5042ms timeout expires
    • eb_event_base_loop finally returns
    • folly's runLoopCallbacks() processes the 2 queued callbacks
  4. The ~5 second delay is a single epoll_wait timeout: The timeout of 5042ms is set by
    libevent based on the next timer event. The callbacks sit in loopCallbacks_ for the
    entire duration because libevent has no way to know they're waiting.


After Fix

strace output

# WriteBatcher enqueues callback - now with eb_event_base_loopbreak() called
[pid 1939305] 13:53:45.777230 epoll_wait(26, [WRITEBATCHER tid=1939302] enqueueWrite: calling runInLoop (immediate)
[pid 1939305] 13:53:45.780801 epoll_wait(26, [WRITEBATCHER tid=1939302] enqueueWrite: calling runInLoop (immediate)

# RPC completes in just 8ms!
RPC call completed in 8 ms
OK: RPC latency is acceptable

Debug output

Making RPC call...
[WRITEBATCHER tid=1939834] enqueueWrite: calling runInLoop (immediate)
[EVENTBASE tid=1939834] runInLoop: adding to loopCallbacks_ (runOnceCallbacks_=(nil), thisIteration=1)
[WRITEBATCHER tid=1939834] enqueueWrite: calling runInLoop (immediate)
[EVENTBASE tid=1939834] runInLoop: adding to loopCallbacks_ (runOnceCallbacks_=(nil), thisIteration=1)
RPC call completed in 1 ms
OK: RPC latency is acceptable

Observations

  1. Same callback path: The callback still goes to loopCallbacks_ (since runOnceCallbacks_
    is still null), but now we also call eb_event_base_loopbreak().

  2. Immediate wakeup: The eb_event_base_loopbreak() call causes epoll_wait to return
    immediately, allowing the EventBase to process the pending callback.

  3. RPC latency reduced from ~6 seconds to ~1-8ms: this helps all OSS FBOSS users keep their sanity.

  4. No timeout waiting: The epoll_wait no longer blocks for the full timeout period.
    The loop breaks immediately and processes callbacks.


Conclusion

The fix adds a single call to eb_event_base_loopbreak() when adding callbacks to
loopCallbacks_ while the loop is running. This ensures the EventBase wakes up promptly
to process the callback instead of waiting for the next event or timeout.

This is a fix for any code path that uses runInLoop() with thisIteration=true
while inside an epoll callback handler, as the callback will otherwise be delayed until
the next event loop timeout (typically 5-7 seconds).

It has been reported that this issue has not been observed inside Meta, and I cannot explain why.

When `runInLoop()` is called while the EventBase loop is running (but not
inside `runLoopCallbacks`), the callback is added to `loopCallbacks_` but
the loop may be blocked in `epoll_wait` with a long timeout. This causes
significant latency as the callback won't be processed until the timeout
expires or another event occurs.

Fix this by calling `eb_event_base_loopbreak()` when adding callbacks to
`loopCallbacks_` while the loop is running. This breaks out of `epoll_wait`
so the callback can be processed promptly in the next loop iteration.

This is particularly important for Thrift clients where responses arrive
on a different thread and are queued via `runInLoop()`. Without this fix,
each RPC call could be delayed by up to the epoll timeout (e.g., 30s).

In OSS FBOSS this fixes the 5-7s delay seen on every Thrift call. Those
can add up quickly: for example, FBOSS CLI end-to-end tests now take 1.4s
instead of ~155s without this fix.
meta-cla bot added the CLA Signed label Jan 16, 2026
@benoit-nexthop
Contributor Author

I previously attempted to work around this issue in FBOSS with facebook/fboss#628, which contained a commit facebook/fboss@b66e380 that used server.setIdleTimeout() to make the event loop not wait so long, but I felt like this wasn't the right fix, and now I know why.

@benoit-nexthop
Contributor Author

benoit-nexthop commented Jan 16, 2026

I amended my commit just above to change the condition from if (isRunning()) { to if (isRunning() && !inRunningEventBaseThread()) { because of failing unit tests. However, I now realize that in doing so the fix becomes ineffective, because the calls always come from an EventBase thread. Maybe the unit tests need to be adjusted instead.

A typical call stack looks like this:

Thread 11 (Thread 0x7f4e50ff9640 (LWP 2941198) "ThriftIO0"):
#0  folly::EventBase::addLoopCallback (this=this@entry=0x7f4e3c000f80, callback=warning: RTTI symbol not found for class 'folly::EventBaseAtomicNotificationQueue<std::variant<apache::thrift::QueueReplyInfo, apache::thrift::StreamReplyInfo, apache::thrift::SinkConsumerReplyInfo, apache::thrift::TilePromiseReplyInfo, apache::thrift::BiDiStreamReplyInfo>, apache::thrift::ReplyInfoConsumer>' ..., thisIteration=<optimized out>) at /var/FBOSS/tmp_bld_dir/repos/github.com-facebook-folly.git/folly/io/async/EventBase.cpp:867
#1  0x0000000000b61527 in folly::EventBase::runInLoop (this=0x7f4e3c000f80, callback=0x7f4e4803fb00, thisIteration=false, rctx=std::shared_ptr<folly::RequestContext> (empty) = {...}) at /var/FBOSS/tmp_bld_dir/repos/github.com-facebook-folly.git/folly/io/async/EventBase.cpp:884
#2  0x0000000000cd754b in folly::EventBaseAtomicNotificationQueue<std::variant<apache::thrift::QueueReplyInfo, apache::thrift::StreamReplyInfo, apache::thrift::SinkConsumerReplyInfo, apache::thrift::TilePromiseReplyInfo, apache::thrift::BiDiStreamReplyInfo>, apache::thrift::ReplyInfoConsumer>::execute() ()
#3  0x0000000000cd6e78 in non-virtual thunk to folly::EventBaseAtomicNotificationQueue<std::variant<apache::thrift::QueueReplyInfo, apache::thrift::StreamReplyInfo, apache::thrift::SinkConsumerReplyInfo, apache::thrift::TilePromiseReplyInfo, apache::thrift::BiDiStreamReplyInfo>, apache::thrift::ReplyInfoConsumer>::handlerReady(unsigned short) ()
#4  0x0000000000b68aca in folly::EventHandler::libeventCallback (fd=<optimized out>, events=2, arg=0x7f4e4803fb28) at /var/FBOSS/tmp_bld_dir/repos/github.com-facebook-folly.git/folly/io/async/EventHandler.cpp:160
#5  0x00007f4e629dd918 in event_persist_closure (ev=<optimized out>, base=0x7f4e3c001290) at /usr/src/debug/libevent-2.1.12-8.el9.x86_64/event.c:1623
#6  event_process_active_single_queue (base=base@entry=0x7f4e3c001290, activeq=0x7f4e3c0016e0, max_to_process=max_to_process@entry=2147483647, endtime=endtime@entry=0x0) at /usr/src/debug/libevent-2.1.12-8.el9.x86_64/event.c:1682
#7  0x00007f4e629df267 in event_process_active (base=0x7f4e3c001290) at /usr/src/debug/libevent-2.1.12-8.el9.x86_64/event.c:1783
#8  event_base_loop (base=0x7f4e3c001290, flags=<optimized out>) at /usr/src/debug/libevent-2.1.12-8.el9.x86_64/event.c:2006
#9  0x0000000000b60806 in folly::EventBase::loopMain (this=this@entry=0x7f4e3c000f80, flags=flags@entry=0, options=...) at /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/unique_ptr.h:191
#10 0x0000000000b6052d in folly::EventBase::loopBody (this=this@entry=0x7f4e3c000f80, flags=flags@entry=0, options=options@entry=...) at /var/FBOSS/tmp_bld_dir/repos/github.com-facebook-folly.git/folly/io/async/EventBase.cpp:540
#11 0x0000000000b604db in folly::EventBase::loop (this=0x7f4e3c000f80) at /var/FBOSS/tmp_bld_dir/repos/github.com-facebook-folly.git/folly/io/async/EventBase.cpp:501
#12 0x0000000000b6156f in folly::EventBase::loopForever (this=0x7f4e3c000f80) at /var/FBOSS/tmp_bld_dir/repos/github.com-facebook-folly.git/folly/io/async/EventBase.cpp:811
#13 0x0000000000cc772a in folly::IOThreadPoolExecutor::threadRun (this=0x34d43a40, thread=...) at /var/FBOSS/tmp_bld_dir/repos/github.com-facebook-folly.git/folly/executors/IOThreadPoolExecutor.cpp:252
#14 0x0000000000bac982 in std::__invoke_impl<void, void (folly::ThreadPoolExecutor::*&)(std::shared_ptr<folly::ThreadPoolExecutor::Thread>), folly::ThreadPoolExecutor*&, std::shared_ptr<folly::ThreadPoolExecutor::Thread>&> (__f=<optimized out>, __t=<optimized out>, __args=...) at /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:74
#15 0x00007f4e62090ae4 in std::execute_native_thread_routine (__p=0x7f4e4803e8d0) at ../../../../../libstdc++-v3/src/c++11/thread.cc:82
#16 0x00007f4e61d3f2fa in start_thread (arg=<optimized out>) at pthread_create.c:443
#17 0x00007f4e61dc3624 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

So I'm undoing this amendment for now.

meta-codesync bot pushed a commit to facebook/fboss that referenced this pull request Jan 22, 2026
Summary:
This diff adds the experimental thrift_latency_test.cpp from facebook/folly#2559

With the oss build, this simple thrift call can take 6418ms
```
[root@0eb3f90dd235 fboss]# ./thrift_latency_test
Creating Thrift server...
I0116 22:30:33.090906 42865 ThriftServer.cpp:2060] Using randomly generated TLS ticket keys.
I0116 22:30:33.091259 42865 ThriftServer.cpp:988] Using resource pools on address/port [::1]:0: thrift flag: true, enable gflag: false, disable gflag: false, runtime actions:
I0116 22:30:33.091567 42865 ThriftServer.cpp:1538] Resource pools configured: 2
I0116 22:30:33.092444 42865 ThriftServer.cpp:746] ThriftServer listening on address/port: [::1]:37999
Server listening on port 37999
Making RPC call...
RPC call completed in 6418 ms
Result: lastAppliedInMs = 12345
ERROR: RPC took too long! Expected < 100ms
```

While in our internal build, the same test without using any FBOSS internal logic takes only 16ms
```
❯❯❯ buck2 run fbcode//fboss/cli/fboss2/test:thrift_latency_test
Buck UI: https://www.internalfb.com/buck2/fc447728-e42e-45da-8e9b-a29ae80abeb5
Network: Up: 0B  Down: 0B
Command: run.
Time elapsed: 0.0s
BUILD SUCCEEDED - starting your binary
Creating Thrift server...
I0116 14:35:48.140404 328824 ThriftServer.cpp:2060] Using randomly generated TLS ticket keys.
I0116 14:35:48.142755 328824 ThriftServer.cpp:988] Using resource pools on address/port [::1]:0: thrift flag: true, enable gflag: false, disable gflag: false, runtime actions:
I0116 14:35:48.145723 328824 ThriftServer.cpp:1538] Resource pools configured: 2
I0116 14:35:48.281738 328824 ThriftServer.cpp:746] ThriftServer listening on address/port: [::1]:40581
Server listening on port 40581
Making RPC call...
RPC call completed in 16 ms
Result: lastAppliedInMs = 12345
OK: RPC latency is acceptable
```

This is a huge difference between the OSS build and the internal build: **6418ms vs 16ms**
As you can see in thrift_latency_test.cpp, we used `ScopedServerInterfaceThread` to spawn a simple fboss agent thrift server, then use `FbossCtrlAsyncClient` with `RocketClientChannel` to call a thrift function.
No FBOSS internal logic is used here.

I suspect there is a performance discrepancy between our internal build and the OSS build of fbthrift.

Based on the new comments from https://fb.workplace.com/groups/560979627394613/permalink/3541014286057784/
I asked Metamate to generate a similar test but use EpollBackend and IoUring to create a thrift server

Reviewed By: kevin645

Differential Revision: D90894498

fbshipit-source-id: c8d04d6434c7aca326bc33133d40f514ea8252bb
@benoit-nexthop
Contributor Author

Related FBOSS change for context: facebook/fboss@43100ad — switching away from libevent to the epoll based backend.
