feat: support round-robin load balancing for routing & add retry mechanism for TSF event reporting #699
Merged
SkyeBeFreeman merged 2 commits into polarismesh:main on Mar 18, 2026
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #699      +/-   ##
============================================
+ Coverage     20.43%   21.08%   +0.64%
- Complexity     1037     1119      +82
============================================
  Files           390      408      +18
  Lines         16189    16893     +704
  Branches       2088     2164      +76
============================================
+ Hits           3309     3562     +253
- Misses        12474    12918     +444
- Partials        406      413       +7
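Change overview
This PR contains two independent functional changes:
- Round-robin support for routing load balancing (WeightedRoundRobinBalance)
- A retry mechanism for TSF event reporting (TsfEventReporter)

Change details
I. Round-robin support for routing load balancing (WeightedRoundRobinBalance)
Description of changes:
- The route key was the fixed string `namespace.service`, so when the instance list changed after route-rule filtering, the round-robin state could not detect the change and load became unbalanced.
- The `generateRouteKey(List<Instance>)` method sorts the IDs (or `host:port`) of the current instance list and uses the hashCode of the sorted result as the route key.

II. Retry mechanism for TSF event reporting (TsfEventReporter)
Description of changes:
Two different failure scenarios are handled, each with its own retry strategy:
1. V1 business-failure retry (`retCode != 0`)
- When `retCode != 0`, the failure is treated as a business-level failure and is retried immediately and synchronously within the current HTTP request, up to 3 times (`V1_MAX_RETRY = 3`).
- Each retry builds a new `HttpPost` and `StringEntity`, so a retry does not fail by re-reading an entity that has already been consumed.
2. Pause/resume mechanism for network exceptions (`Exception`)
- `paused = true` is set, which pauses consumption of all queues (the V1 and Report queues share the same pause flag).
- The `retryExecutors` scheduler resumes consumption automatically after a 60-second delay (`paused = false`).
- `commonRetryCount` allows at most 120 retries (that is, up to about 2 hours of continuous retrying).
3. Report events (rate-limit events)
- When `errorInfo` is non-empty, the failure is treated as a non-retryable business error and the event is dropped without a retry.

Other improvements:
- `BlockingQueue` (`LinkedBlockingQueue`) is upgraded to `LinkedBlockingDeque`, which allows an event to be put back at the head of the queue so that event order is preserved on retry.
- `TsfEventDataPair` gains a `toString()` method to make log debugging easier.

Tests
- `WeightedRoundRobinBalanceTest`: unit tests for the round-robin route-key generation logic
- `TsfEventReporterTest`: unit tests for the event-reporting retry mechanism (covering V1 business-failure retry, network-exception pause/resume, and other scenarios)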
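
To illustrate the route-key idea from section I, here is a minimal sketch and not the code from this PR. It collects each instance's ID (falling back to `host:port`), sorts the identities, and hashes the sorted list. The `Instance` class below is a hypothetical stand-in for the SDK type, and the `String` return type and the empty-ID fallback rule are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class RouteKeySketch {

    /** Hypothetical stand-in for the SDK's Instance type; only the fields used here. */
    static final class Instance {
        final String id;   // may be null or empty
        final String host;
        final int port;

        Instance(String id, String host, int port) {
            this.id = id;
            this.host = host;
            this.port = port;
        }
    }

    /**
     * Builds a route key that depends on which instances are present,
     * but not on the order in which they are listed.
     */
    static String generateRouteKey(List<Instance> instances) {
        List<String> identities = new ArrayList<>();
        for (Instance instance : instances) {
            // Assumption: fall back to host:port when the ID is missing.
            boolean hasId = instance.id != null && !instance.id.isEmpty();
            identities.add(hasId ? instance.id : instance.host + ":" + instance.port);
        }
        Collections.sort(identities);
        // List.hashCode() is defined by the List contract, so it is stable
        // for equal content in equal order.
        return String.valueOf(identities.hashCode());
    }

    public static void main(String[] args) {
        Instance a = new Instance("a", "10.0.0.1", 8080);
        Instance b = new Instance("b", "10.0.0.2", 8080);
        System.out.println(generateRouteKey(Arrays.asList(a, b)));
        System.out.println(generateRouteKey(Arrays.asList(b, a))); // same key as above
        System.out.println(generateRouteKey(Arrays.asList(a)));    // filtered list
    }
}
```

Because the key is derived from the filtered instance set, a routing rule that narrows the list yields a different key, so round-robin state kept per key does not carry over from a different set of instances. Hash collisions between different sets remain possible with a 32-bit hash.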
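
The two retry paths from section II and the move to `LinkedBlockingDeque` can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the `TsfEventReporter` implementation. The `Sender` interface and `EventRetrySketch` class are hypothetical. Counting "up to 3 retries" as one initial attempt plus three retries is an assumption, as is dropping the event once those retries are exhausted. The 60-second scheduled resume is replaced by an explicit `resume()` call to keep the example deterministic, and the `commonRetryCount` cap is left out.

```java
import java.util.concurrent.LinkedBlockingDeque;

final class EventRetrySketch {
    static final int V1_MAX_RETRY = 3;

    /** Hypothetical transport: returns a retCode, or throws on a network error. */
    interface Sender {
        int send(String event) throws Exception;
    }

    final LinkedBlockingDeque<String> queue = new LinkedBlockingDeque<>();
    volatile boolean paused = false;
    private final Sender sender;

    EventRetrySketch(Sender sender) {
        this.sender = sender;
    }

    /** Consumes one event from the head; returns true only if it was delivered. */
    boolean consumeOne() {
        if (paused) {
            return false; // consumption is suspended until resume() is called
        }
        String event = queue.pollFirst();
        if (event == null) {
            return false; // nothing to send
        }
        try {
            // Assumption: one initial attempt plus up to V1_MAX_RETRY retries.
            for (int attempt = 0; attempt <= V1_MAX_RETRY; attempt++) {
                if (sender.send(event) == 0) {
                    return true;
                }
            }
            return false; // assumption: give up after the business retries
        } catch (Exception networkError) {
            queue.addFirst(event); // back to the head, so order is preserved
            paused = true;         // the PR resumes via a scheduler after 60 s
            return false;
        }
    }

    /** Stand-in for the scheduled task that clears the pause flag. */
    void resume() {
        paused = false;
    }

    public static void main(String[] args) {
        boolean[] networkDown = {true};
        EventRetrySketch reporter = new EventRetrySketch(event -> {
            if (networkDown[0]) {
                throw new java.io.IOException("network down");
            }
            System.out.println("sent " + event);
            return 0;
        });
        reporter.queue.add("e1");
        reporter.queue.add("e2");
        reporter.consumeOne();   // fails, e1 goes back to the head, paused
        networkDown[0] = false;
        reporter.resume();
        reporter.consumeOne();
        reporter.consumeOne();
    }
}
```

Putting the failed event back with `addFirst` is what a plain `BlockingQueue` cannot do: re-adding at the tail would let later events overtake the one that failed.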