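How to Use the Probe Agent in Asynchronous Scenarios
DiscoveryAgent is not limited to the Discovery framework. It also applies to any base framework (for example, Dubbo) or business system with similar usage scenarios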
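The Discovery framework passes context along the full call chain in the following scenarios:
- Strategy-routing Headers passed along the full chain from the gateway to the services
- Call-chain tracing data passed along the full chain from the gateway to the services
- Business-defined custom context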
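This context is lost in the following asynchronous scenarios:
- WebFlux Reactor reactive async
- Spring async through the @Async annotation
- Hystrix thread-pool isolation mode
- Threads and thread pools
- SLF4J asynchronous logging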
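DiscoveryAgent resolves these pain points. The Discovery framework relies on DiscoveryAgent's bytecode enhancement to keep the context available across asynchronous execution in the following invocation scenarios:
- Context passing in Spring Cloud Gateway filters
- Context passing in Zuul filters
- Context forwarding in Feign interceptors
- Context forwarding in RestTemplate interceptors
- Context forwarding in WebClient interceptors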
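ThreadLocal provides thread-local variables: in a multithreaded environment, each thread's ThreadLocal value is independent of every other thread's. In asynchronous scenarios execution switches threads, for example from the main thread to a child thread, and the ThreadLocal context set on the original thread is lost. DiscoveryAgent solves this by means of a Java Agent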
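The loss described above can be reproduced with the JDK alone. The following is a minimal sketch, not code from DiscoveryAgent: the class name `ContextLossDemo` and the value `trace-123` are made up for illustration. It sets a ThreadLocal on the calling thread and reads it back from a freshly started thread.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of DiscoveryAgent
class ContextLossDemo {
    // Per-thread storage: each thread sees only the value it set itself
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Sets the value and reads it back on the same thread
    public static String readInSameThread(String value) {
        CONTEXT.set(value);
        return CONTEXT.get();
    }

    // Sets the value on the calling thread, then reads the ThreadLocal from a new thread
    public static String readInChildThread(String value) throws InterruptedException {
        CONTEXT.set(value);
        final AtomicReference<String> seenByChild = new AtomicReference<>();
        Thread child = new Thread(new Runnable() {
            @Override
            public void run() {
                // The child thread has its own empty slot, so nothing is found here
                seenByChild.set(CONTEXT.get());
            }
        });
        child.start();
        child.join();
        return seenByChild.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("same thread:  " + readInSameThread("trace-123"));  // prints trace-123
        System.out.println("child thread: " + readInChildThread("trace-123")); // prints null
    }
}
```

The child thread prints `null` because a plain ThreadLocal is never copied when work moves to another thread; this is the gap the Agent is meant to close.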
It targets asynchronous execution across Java frameworks and solves the loss of ThreadLocal context in the following 10 asynchronous scenarios:
- WebFlux Reactor
- @Async
- Hystrix Thread Pool Isolation
- Runnable
- Callable
- Supplier
- Single Thread
- Thread Pool
- Virtual Thread
- SLF4J MDC
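Note that DiscoveryAgent does not support asynchronous code written with Lambda syntax. A Runnable/Callable implemented as a Lambda has no compiled class of its own. The JVM generates its implementation class at run time, outside the class-loading path that DiscoveryAgent instruments, so DiscoveryAgent cannot modify the Runnable/Callable implementation generated for a Lambda expression. The specific reasons are as follows:
- Bytecode generation timing: a Lambda expression is not compiled into a complete implementation class; the JVM generates that class dynamically at run time. A Java Agent normally transforms bytecode when a class is loaded, and at that point the implementation class for the Lambda does not exist yet
- Special handling of the generated class: under the hood a Lambda expression is compiled to an invokedynamic instruction, and its implementation class is generated by the JVM at run time rather than at compile time, so a Java Agent cannot intercept and modify it during the class-loading phase
- Complexity of method handles: Lambda expressions rely on the MethodHandle mechanism, which makes them more complex at the bytecode level than ordinary method calls and hard for conventional bytecode tools (such as ASM) to process correctly
- Class-loading order: Lambda-related classes (such as LambdaMetafactory) are loaded by the bootstrap class loader, and a Java Agent usually cannot modify classes loaded by the bootstrap class loader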
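The difference between the two styles can be observed with plain reflection. The sketch below is only an illustration (the class name `LambdaClassDemo` is made up): it compares the runtime class of an anonymous inner class with that of a Lambda. On HotSpot-based JDKs the Lambda class is flagged synthetic and its name typically contains `$$Lambda`, but the exact name is an implementation detail and should not be relied on.

```java
// Hypothetical demo class, not part of DiscoveryAgent
class LambdaClassDemo {
    // Anonymous inner class: javac emits a real class file for it at compile time
    public static Runnable anonymous() {
        return new Runnable() {
            @Override
            public void run() {
            }
        };
    }

    // Lambda: javac emits an invokedynamic call site; the class is generated at run time
    public static Runnable lambda() {
        return () -> { };
    }

    public static void main(String[] args) {
        Class<?> anon = anonymous().getClass();
        Class<?> lam = lambda().getClass();
        System.out.println(anon.getName() + " anonymous=" + anon.isAnonymousClass()
                + " synthetic=" + anon.isSynthetic());
        System.out.println(lam.getName() + " anonymous=" + lam.isAnonymousClass()
                + " synthetic=" + lam.isSynthetic());
    }
}
```

The anonymous class exists as an ordinary compiled class that a class-load-time transformer can see, while the Lambda class only comes into existence when the call site is first linked.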
Code that uses newer JDK features can be rewritten in the following form to avoid Lambda expressions
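import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Anonymous Runnable instead of a Lambda
CompletableFuture.runAsync(new Runnable() {
    @Override
    public void run() {
    }
});

// Anonymous Supplier instead of a Lambda
CompletableFuture<String> completableFuture = CompletableFuture.supplyAsync(new Supplier<String>() {
    @Override
    public String get() {
        return "";
    }
});
There are two ways to obtain the plugin
- Download the latest release of Discovery Agent from https://github.com/Nepxion/DiscoveryAgent/releases
- Build https://github.com/Nepxion/DiscoveryAgent to produce the discovery-agent directory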
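① discovery-agent-starter-${discovery.version}.jar is the Agent bootstrap program, loaded when the JVM starts
② agent.config is the configuration file for the base scan packages
In most cases it needs no changes, although users may add or remove base scan packages in agent.config. The default configuration is as follows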
# Base thread scan packages
agent.plugin.thread.scan.packages=reactor.core.publisher;org.springframework.aop.interceptor;com.netflix.hystrix
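The base scan packages have the following meanings
- reactor.core.publisher is the scan package for the WebFlux Reactor asynchronous scenario
- org.springframework.aop.interceptor is the scan package for the @Async scenario
- com.netflix.hystrix is the scan package for the Hystrix thread-pool isolation scenario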
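③ The plugin/discovery-agent-starter-plugin-strategy-${discovery.version}.jar plugin handles the Nepxion Discovery context in asynchronous scenarios
④ The plugin/discovery-agent-starter-plugin-mdc-${discovery.version}.jar plugin handles the SLF4J MDC logging context in asynchronous scenarios
⑤ Business systems can provide their own plugins to handle business-defined context in asynchronous scenarios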
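To make the base scan packages in agent.config more concrete, here is a minimal sketch of how a ";"-separated package list could be turned into a class-name filter. The class `ScanPackageMatcher` and its prefix-matching rule are assumptions for illustration only; the source does not show how DiscoveryAgent matches classes internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of DiscoveryAgent
class ScanPackageMatcher {
    private final List<String> prefixes = new ArrayList<>();

    // Accepts the same ';'-separated format as agent.plugin.thread.scan.packages
    ScanPackageMatcher(String scanPackages) {
        for (String item : scanPackages.split(";")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                prefixes.add(trimmed);
            }
        }
    }

    // Assumed rule: a class matches when it sits in a listed package or one of its sub-packages
    public boolean matches(String className) {
        for (String prefix : prefixes) {
            if (className.startsWith(prefix + ".")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ScanPackageMatcher matcher = new ScanPackageMatcher(
                "reactor.core.publisher;org.springframework.aop.interceptor;com.netflix.hystrix");
        System.out.println(matcher.matches("reactor.core.publisher.Mono"));      // prints true
        System.out.println(matcher.matches("java.util.concurrent.FutureTask"));  // prints false
    }
}
```

Under this assumed rule, a narrower package list means fewer classes pass the filter.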
① Usage example
- Start the application with -javaagent. The basic format is as follows
-javaagent:C:/opt/discovery-agent/discovery-agent-starter-${discovery.agent.version}.jar -Dthread.scan.packages=com.nepxion.discovery.guide.service.feign
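② Parameters
- C:/opt/discovery-agent: the directory that contains the Agent; change it to the actual directory
- -Dthread.scan.packages: the packages to scan for asynchronous classes such as Runnable/Callable/Thread/ThreadPool/Virtual Thread. Every asynchronous class in these packages is decorated
- Keep the scan packages as narrow and precise as possible. The more specific they are, the fewer objects are decorated, which helps performance to some extent
- Separate multiple scan packages with ";"
- A value that contains ";" may not be recognized on some operating systems. Wrap it in "", for example -Dthread.scan.packages="com.abc;com.xyz"
- If there are no asynchronous classes such as Runnable/Callable/Thread/ThreadPool to scan, thread.scan.packages need not be set, and the startup command reduces to -javaagent:C:/opt/discovery-agent/discovery-agent-starter-${discovery.agent.version}.jar
- -Dthread.gateway.enabled: passes strategy Headers to asynchronous child threads on the Spring Cloud Gateway side. Enabled by default
- -Dthread.zuul.enabled: passes strategy Headers to asynchronous child threads on the Zuul side. Enabled by default
- -Dthread.service.enabled: passes strategy Headers to asynchronous child threads on the service side. Enabled by default
- -Dthread.mdc.enabled: passes the SLF4J MDC logging context to asynchronous child threads. Enabled by default
- -Dthread.request.decorator.enabled: decorates the service-side Request in asynchronous calls. When the main thread finishes before the child thread, the Request is destroyed and the Headers can no longer be read; enabling the decorator ensures they remain available. Enabled by default. In practice, most scenarios need this switch on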
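Each -D argument above becomes a JVM system property. The following sketch shows one way such switches could be read with an "enabled by default" fallback. The class `AgentSwitches` and its methods are hypothetical and are not taken from the DiscoveryAgent source; only the property names come from the list above.

```java
import java.util.Arrays;

// Hypothetical helper, not part of DiscoveryAgent
class AgentSwitches {
    // Reads a -Dname=true/false JVM argument; falls back to defaultValue when it is absent
    public static boolean isEnabled(String name, boolean defaultValue) {
        return Boolean.parseBoolean(System.getProperty(name, String.valueOf(defaultValue)));
    }

    // Splits -Dthread.scan.packages=a;b into its entries; empty when the property is absent
    public static String[] scanPackages() {
        String value = System.getProperty("thread.scan.packages", "");
        return value.isEmpty() ? new String[0] : value.split(";");
    }

    public static void main(String[] args) {
        System.out.println("thread.mdc.enabled = " + isEnabled("thread.mdc.enabled", true));
        System.out.println("scan packages = " + Arrays.toString(scanPackages()));
    }
}
```

Running it with `-Dthread.scan.packages="com.abc;com.xyz"` yields two entries, which is why the quoting matters on shells that treat ";" as a command separator.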
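③ Installation check
Applications on Spring Cloud 20xx support the following setting, usually switched on or off with -Dspring.application.strategy.agent.validation.enabled=true or false
# Enables or disables the DiscoveryAgent installation check. When enabled, the application throws an error and exits if DiscoveryAgent is not installed. This setting applies only to Spring Cloud 202x. Defaults to true when absent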
# spring.application.strategy.agent.validation.enabled=true
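For readers who want a similar check in their own code, one possible approach is to inspect the JVM input arguments for a -javaagent option. This is only a sketch under that assumption; the source does not describe how the framework's own validation is implemented, and the class `AgentInstallCheck` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not the framework's own validation logic
class AgentInstallCheck {
    // True when any JVM argument is a -javaagent option whose jar path contains the marker
    public static boolean hasJavaAgent(List<String> jvmArguments, String jarNameMarker) {
        for (String argument : jvmArguments) {
            if (argument.startsWith("-javaagent:") && argument.contains(jarNameMarker)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with, such as -Xmx or -javaagent
        List<String> jvmArguments = ManagementFactory.getRuntimeMXBean().getInputArguments();
        boolean installed = hasJavaAgent(jvmArguments, "discovery-agent-starter");
        System.out.println("DiscoveryAgent on command line: " + installed);
    }
}
```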
The IDEA Debug Agent supports Reactor debugging for Reactive Streams. Enabling it disables DiscoveryAgent's Reactor module, so IDEA's Reactor debugging mode must be turned off
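- Develop a plugin according to the specification. The plugin provides hook functions: when a given class is loaded, it can register an event with the thread-context-switch events, which carries a business-defined ThreadLocal across threads
- The plugin directory holds the custom plugins whose ThreadLocal values must be passed on when threads switch. Once a custom plugin is built, simply place it in the plugin directory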
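The hook idea can be pictured with a hand-written wrapper. The sketch below is an illustration only: DiscoveryAgent applies the equivalent logic through bytecode enhancement, and the `Hook` interface and `wrap` method here are hypothetical stand-ins whose create/before/after names merely mirror the hook methods used in the steps that follow.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of DiscoveryAgent
class HookLifecycleDemo {
    // Stand-in for the hook contract
    interface Hook {
        Object create();             // runs on the submitting thread: capture the context
        void before(Object object);  // runs on the worker thread before the task
        void after();                // runs on the worker thread after the task
    }

    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    static final Hook CONTEXT_HOOK = new Hook() {
        @Override
        public Object create() {
            return CONTEXT.get();
        }

        @Override
        public void before(Object object) {
            CONTEXT.set((String) object);
        }

        @Override
        public void after() {
            CONTEXT.remove();
        }
    };

    // Captures the context now and restores it around task.run() on the executing thread
    public static Runnable wrap(final Runnable task, final Hook hook) {
        final Object captured = hook.create();
        return new Runnable() {
            @Override
            public void run() {
                hook.before(captured);
                try {
                    task.run();
                } finally {
                    hook.after();
                }
            }
        };
    }

    // Sets the value on the calling thread and reads it from a wrapped task on a new thread
    public static String readInChildThread(String value) throws InterruptedException {
        CONTEXT.set(value);
        final AtomicReference<String> seen = new AtomicReference<>();
        Thread child = new Thread(wrap(new Runnable() {
            @Override
            public void run() {
                seen.set(CONTEXT.get());
            }
        }, CONTEXT_HOOK));
        child.start();
        child.join();
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("child thread sees: " + readInChildThread("trace-123")); // prints trace-123
    }
}
```

Capturing in `create` on the submitting thread and cleaning up in `after` inside a `finally` block keeps pooled worker threads from leaking one task's context into the next.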
The steps are as follows
① SDK-side work
- Create a ThreadLocal context class
import java.util.HashMap;
import java.util.Map;

public class MyContext {
    // One MyContext instance per thread, created lazily on first access
    private static final ThreadLocal<MyContext> THREAD_LOCAL = new ThreadLocal<MyContext>() {
        @Override
        protected MyContext initialValue() {
            return new MyContext();
        }
    };

    public static MyContext getCurrentContext() {
        return THREAD_LOCAL.get();
    }

    public static void clearCurrentContext() {
        THREAD_LOCAL.remove();
    }

    private Map<String, String> attributes = new HashMap<>();

    public Map<String, String> getAttributes() {
        return attributes;
    }

    public void setAttributes(Map<String, String> attributes) {
        this.attributes = attributes;
    }
}
② Agent-side work
- Create a new module and add the following dependency
<dependency>
<groupId>com.nepxion</groupId>
<artifactId>discovery-agent-starter</artifactId>
<version>${discovery.agent.version}</version>
<scope>provided</scope>
</dependency>
- Create a ThreadLocalHook class that extends AbstractThreadLocalHook
public class MyContextHook extends AbstractThreadLocalHook {
    @Override
    public Object create() {
        // Read the context object from the main thread's ThreadLocal and return it
        return MyContext.getCurrentContext().getAttributes();
    }

    @Override
    @SuppressWarnings("unchecked")
    public void before(Object object) {
        // Put the context object returned by create() into the child thread's ThreadLocal
        if (object instanceof Map) {
            MyContext.getCurrentContext().setAttributes((Map<String, String>) object);
        }
    }

    @Override
    public void after() {
        // The thread has finished: clear the context object
        MyContext.clearCurrentContext();
    }
}
- Create a Plugin class that extends AbstractPlugin
public class MyContextPlugin extends AbstractPlugin {
    private Boolean threadMyPluginEnabled = Boolean.valueOf(System.getProperty("thread.myplugin.enabled", "false"));

    @Override
    protected String getMatcherClassName() {
        // Return the name of the class that holds the ThreadLocal. Because plugins are pluggable, it must be a string; the class must not be referenced directly
        return "com.nepxion.discovery.example.sdk.MyContext";
    }

    @Override
    protected String getHookClassName() {
        // Return the ThreadLocalHook class name
        return MyContextHook.class.getName();
    }

    @Override
    protected boolean isEnabled() {
        // Controlled by the -Dthread.myplugin.enabled=true/false runtime argument. The parent class returns true (enabled by default); this override stays disabled unless the argument is set to true
        return threadMyPluginEnabled;
    }
}
- Define the SPI extension by creating the SPI file under src/main/resources/META-INF/services
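The file name is fixed, as follows
com.nepxion.discovery.agent.plugin.Plugin
Its content is the fully qualified name of the Plugin class
com.nepxion.discovery.example.agent.MyContextPlugin
- Run the Maven build and place the resulting jar in the discovery-agent/plugin directory
- Add the startup arguments to the service and start it, as follows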
-javaagent:C:/opt/discovery-agent/discovery-agent-starter-${discovery.agent.version}.jar -Dthread.scan.packages=com.nepxion.discovery.example.application -Dthread.myplugin.enabled=true
③ Application-side work
- Run MyApplication. It puts Map data into the main thread's ThreadLocal; the child thread obtains that Map through DiscoveryAgent and prints it
@SpringBootApplication
@RestController
public class MyApplication {
    private static final Logger LOG = LoggerFactory.getLogger(MyApplication.class);

    public static void main(String[] args) {
        SpringApplication.run(MyApplication.class, args);
        invoke();
    }

    // Calls the endpoint below 10 times to trigger the main-thread / child-thread logging
    public static void invoke() {
        RestTemplate restTemplate = new RestTemplate();
        for (int i = 1; i <= 10; i++) {
            restTemplate.getForEntity("http://localhost:8080/index/" + i, String.class).getBody();
        }
    }

    @GetMapping("/index/{value}")
    public String index(@PathVariable(value = "value") String value) throws InterruptedException {
        Map<String, String> attributes = new HashMap<String, String>();
        attributes.put(value, "MyContext");
        MyContext.getCurrentContext().setAttributes(attributes);
        // Log labels: 【主】 marks the main (request) thread, 【子】 marks the child thread
        LOG.info("【主】线程ThreadLocal:{}", MyContext.getCurrentContext().getAttributes());
        new Thread(new Runnable() {
            @Override
            public void run() {
                LOG.info("【子】线程ThreadLocal:{}", MyContext.getCurrentContext().getAttributes());
                try {
                    Thread.sleep(5000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                // Logged again after the 5-second sleep to show the context is still present
                LOG.info("Sleep 5秒之后,【子】线程ThreadLocal:{} ", MyContext.getCurrentContext().getAttributes());
            }
        }).start();
        return "";
    }
}
The output is as follows (in the log lines, 【主】线程 is the main thread, 【子】线程 is the child thread, and "Sleep 5秒之后" means "after sleeping 5 seconds")
2020-11-09 00:08:14.330 INFO 16588 --- [nio-8080-exec-1] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{1=MyContext}
2020-11-09 00:08:14.381 INFO 16588 --- [ Thread-4] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{1=MyContext}
2020-11-09 00:08:14.402 INFO 16588 --- [nio-8080-exec-2] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{2=MyContext}
2020-11-09 00:08:14.403 INFO 16588 --- [ Thread-5] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{2=MyContext}
2020-11-09 00:08:14.405 INFO 16588 --- [nio-8080-exec-3] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{3=MyContext}
2020-11-09 00:08:14.406 INFO 16588 --- [ Thread-6] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{3=MyContext}
2020-11-09 00:08:14.414 INFO 16588 --- [nio-8080-exec-4] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{4=MyContext}
2020-11-09 00:08:14.414 INFO 16588 --- [ Thread-7] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{4=MyContext}
2020-11-09 00:08:14.417 INFO 16588 --- [nio-8080-exec-5] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{5=MyContext}
2020-11-09 00:08:14.418 INFO 16588 --- [ Thread-8] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{5=MyContext}
2020-11-09 00:08:14.421 INFO 16588 --- [nio-8080-exec-6] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{6=MyContext}
2020-11-09 00:08:14.422 INFO 16588 --- [ Thread-9] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{6=MyContext}
2020-11-09 00:08:14.424 INFO 16588 --- [nio-8080-exec-7] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{7=MyContext}
2020-11-09 00:08:14.425 INFO 16588 --- [ Thread-10] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{7=MyContext}
2020-11-09 00:08:14.427 INFO 16588 --- [nio-8080-exec-8] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{8=MyContext}
2020-11-09 00:08:14.428 INFO 16588 --- [ Thread-11] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{8=MyContext}
2020-11-09 00:08:14.430 INFO 16588 --- [nio-8080-exec-9] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{9=MyContext}
2020-11-09 00:08:14.431 INFO 16588 --- [ Thread-12] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{9=MyContext}
2020-11-09 00:08:14.433 INFO 16588 --- [io-8080-exec-10] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{10=MyContext}
2020-11-09 00:08:14.434 INFO 16588 --- [ Thread-13] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{10=MyContext}
2020-11-09 00:08:19.382 INFO 16588 --- [ Thread-4] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{1=MyContext}
2020-11-09 00:08:19.404 INFO 16588 --- [ Thread-5] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{2=MyContext}
2020-11-09 00:08:19.406 INFO 16588 --- [ Thread-6] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{3=MyContext}
2020-11-09 00:08:19.416 INFO 16588 --- [ Thread-7] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{4=MyContext}
2020-11-09 00:08:19.418 INFO 16588 --- [ Thread-8] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{5=MyContext}
2020-11-09 00:08:19.422 INFO 16588 --- [ Thread-9] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{6=MyContext}
2020-11-09 00:08:19.425 INFO 16588 --- [ Thread-10] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{7=MyContext}
2020-11-09 00:08:19.428 INFO 16588 --- [ Thread-11] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{8=MyContext}
2020-11-09 00:08:19.432 INFO 16588 --- [ Thread-12] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{9=MyContext}
2020-11-09 00:08:19.434 INFO 16588 --- [ Thread-13] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{10=MyContext}
Without the asynchronous Agent, the output is as follows. The ThreadLocal context is lost in every child thread
2020-11-09 00:01:40.133 INFO 16692 --- [nio-8080-exec-1] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{1=MyContext}
2020-11-09 00:01:40.135 INFO 16692 --- [ Thread-8] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{}
2020-11-09 00:01:40.158 INFO 16692 --- [nio-8080-exec-2] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{2=MyContext}
2020-11-09 00:01:40.159 INFO 16692 --- [ Thread-9] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{}
2020-11-09 00:01:40.162 INFO 16692 --- [nio-8080-exec-3] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{3=MyContext}
2020-11-09 00:01:40.163 INFO 16692 --- [ Thread-10] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{}
2020-11-09 00:01:40.170 INFO 16692 --- [nio-8080-exec-5] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{4=MyContext}
2020-11-09 00:01:40.170 INFO 16692 --- [ Thread-11] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{}
2020-11-09 00:01:40.173 INFO 16692 --- [nio-8080-exec-4] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{5=MyContext}
2020-11-09 00:01:40.174 INFO 16692 --- [ Thread-12] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{}
2020-11-09 00:01:40.176 INFO 16692 --- [nio-8080-exec-6] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{6=MyContext}
2020-11-09 00:01:40.177 INFO 16692 --- [ Thread-13] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{}
2020-11-09 00:01:40.179 INFO 16692 --- [nio-8080-exec-8] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{7=MyContext}
2020-11-09 00:01:40.180 INFO 16692 --- [ Thread-14] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{}
2020-11-09 00:01:40.182 INFO 16692 --- [nio-8080-exec-7] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{8=MyContext}
2020-11-09 00:01:40.182 INFO 16692 --- [ Thread-15] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{}
2020-11-09 00:01:40.185 INFO 16692 --- [nio-8080-exec-9] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{9=MyContext}
2020-11-09 00:01:40.186 INFO 16692 --- [ Thread-16] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{}
2020-11-09 00:01:40.188 INFO 16692 --- [io-8080-exec-10] c.n.d.example.application.MyApplication : 【主】线程ThreadLocal:{10=MyContext}
2020-11-09 00:01:40.189 INFO 16692 --- [ Thread-17] c.n.d.example.application.MyApplication : 【子】线程ThreadLocal:{}
2020-11-09 00:01:45.136 INFO 16692 --- [ Thread-8] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{}
2020-11-09 00:01:45.160 INFO 16692 --- [ Thread-9] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{}
2020-11-09 00:01:45.163 INFO 16692 --- [ Thread-10] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{}
2020-11-09 00:01:45.171 INFO 16692 --- [ Thread-11] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{}
2020-11-09 00:01:45.174 INFO 16692 --- [ Thread-12] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{}
2020-11-09 00:01:45.177 INFO 16692 --- [ Thread-13] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{}
2020-11-09 00:01:45.181 INFO 16692 --- [ Thread-14] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{}
2020-11-09 00:01:45.183 INFO 16692 --- [ Thread-15] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{}
2020-11-09 00:01:45.187 INFO 16692 --- [ Thread-16] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{}
2020-11-09 00:01:45.190 INFO 16692 --- [ Thread-17] c.n.d.example.application.MyApplication : Sleep 5秒之后,【子】线程ThreadLocal:{}
For the complete example, see https://github.com/Nepxion/DiscoveryAgent/tree/master/discovery-agent-example. A custom plugin built as described above solves the loss of ThreadLocal context when threads switch
2017-2050 ©Nepxion Studio Apache License