Commit 8e07925

[#26979] docdb: Disable load balancing for PgSharedMemTest.LongRead
Summary:
Issue: `PgSharedMemTest.LongRead` might fail when new system tablets are added on the tserver. This can occur, for example, when:
* Enabling `ysql_yb_enable_advisory_locks`, which creates an advisory lock system table.
* Enabling `ysql_enable_auto_analyze_infra`, which creates a stateful service table.

The test performs a long-running read by sleeping for `FLAGS_TEST_transactional_read_delay_ms` (65 seconds) before executing the actual read. During this period, the tablet leader may move to a different node because the newly added system tablets trigger load balancing. After the sleep completes, the system detects the leader change and retries the read on the new leader. However, since each long read takes over 65 seconds and the read timeout is 120 seconds, two consecutive long reads exceed the timeout, resulting in test failure.

Fix: Disable load balancing during the test to prevent unexpected leader moves during long reads.

Jira: DB-16439

Test Plan: ./yb_build.sh --cxx-test pgwrapper_pg_shared_mem-test --gtest_filter PgSharedMemTest.LongRead

Reviewers: sergei, rthallam

Reviewed By: sergei, rthallam

Subscribers: rthallam, ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D43567
1 parent 3469f64 commit 8e07925

1 file changed: +8 -0 lines changed
src/yb/yql/pgwrapper/pg_shared_mem-test.cc

Lines changed: 8 additions & 0 deletions
@@ -28,6 +28,7 @@

 using namespace std::literals;

+DECLARE_bool(enable_load_balancing);
 DECLARE_bool(pg_client_use_shared_memory);
 DECLARE_bool(TEST_pg_client_crash_on_shared_memory_send);
 DECLARE_bool(TEST_skip_remove_tserver_shared_memory_object);
@@ -198,6 +199,13 @@ class PgSharedMemBigTimeoutTest : public PgSharedMemTest {
 };

 TEST_F_EX(PgSharedMemTest, LongRead, PgSharedMemBigTimeoutTest) {
+  // Disable load balancing: the load balancer might move the tablet leader during the
+  // long-running read and cause the test to fail. The failure sequence is:
+  // 1. The long read starts and sleeps for FLAGS_TEST_transactional_read_delay_ms (65 seconds).
+  // 2. During this time, the load balancer moves the tablet leader.
+  // 3. After the 65s sleep, the read detects the leader change and retries on the new leader.
+  // 4. The retried read also sleeps for 65 seconds, so the total exceeds the 120s read timeout.
+  ANNOTATE_UNPROTECTED_WRITE(FLAGS_enable_load_balancing) = false;
   auto conn = ASSERT_RESULT(Connect());

   ASSERT_OK(conn.Execute("CREATE TABLE t (key INT PRIMARY KEY) SPLIT INTO 1 TABLETS"));
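
The timing argument in the comment above can be checked with a small standalone sketch. It is not part of the commit or of the YugabyteDB code base: `TotalReadTime` and `ExceedsTimeout` are hypothetical helpers, and the 65 second delay and 120 second timeout are taken from the commit message. The model assumes each read attempt costs exactly the configured delay, which is a lower bound on the real time.

```cpp
#include <chrono>

using namespace std::chrono_literals;

// Hypothetical helper (illustration only): lower bound on the wall-clock time of a read
// when every attempt first sleeps for `delay_per_attempt`.
constexpr std::chrono::seconds TotalReadTime(std::chrono::seconds delay_per_attempt,
                                             int attempts) {
  return delay_per_attempt * attempts;
}

// Hypothetical helper (illustration only): true if `attempts` delayed reads cannot
// finish within `timeout`.
constexpr bool ExceedsTimeout(std::chrono::seconds delay_per_attempt, int attempts,
                              std::chrono::seconds timeout) {
  return TotalReadTime(delay_per_attempt, attempts) > timeout;
}

// Values taken from the commit message.
constexpr auto kReadDelay = 65s;     // FLAGS_TEST_transactional_read_delay_ms
constexpr auto kReadTimeout = 120s;  // read timeout

// No leader move: a single attempt (65s) fits within the 120s timeout.
static_assert(!ExceedsTimeout(kReadDelay, 1, kReadTimeout),
              "one delayed read should fit within the timeout");
// One leader move: the retry makes it two attempts (130s), which exceeds the timeout.
static_assert(ExceedsTimeout(kReadDelay, 2, kReadTimeout),
              "two delayed reads should exceed the timeout");
```

Because the checks are `static_assert`s, compiling the file is enough to confirm the arithmetic: a single leader move during the sleep already pushes the read past the timeout, which is why the fix prevents leader moves rather than raising the timeout.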
