Description
I ran a test in which I added an unreachable node to an upstream that has active health checks enabled. The status of this node was still reported as "healthy", even though I can guarantee that it is unreachable. In the forwarding logs of this route, I noticed that some requests sent to this node timed out or failed with a server error (502) before being retried on a healthy node (200).
I then set up a simple HTTP program locally (and confirmed that it is reachable). After making a request, I found that APISIX sent no proactive health check requests. When I stopped one of the processes, the APISIX log appeared to mark that node as unhealthy, but requests were still routed to it first and were only forwarded to the healthy node after the connection attempt failed.
This is my upstream:
{
"nodes": [
{
"host": "192.168.8.25",
"port": 8001,
"weight": 1
},
{
"host": "192.168.8.25",
"port": 8002,
"weight": 1
}
],
"timeout": {
"connect": 6,
"send": 600,
"read": 600
},
"type": "roundrobin",
"checks": {
"active": {
"concurrency": 10,
"healthy": {
"http_statuses": [
200,
302,
404
],
"interval": 2,
"successes": 1
},
"http_path": "/health",
"timeout": 3,
"type": "http",
"unhealthy": {
"http_failures": 2,
"http_statuses": [
429,
500,
501,
502,
503,
504,
505,
499
],
"interval": 2,
"tcp_failures": 2,
"timeouts": 2
}
}
},
"scheme": "http",
"pass_host": "pass",
"name": "test-health",
"keepalive_pool": {
"idle_timeout": 60,
"requests": 1000,
"size": 320
}
}
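To rule out the backend itself, each node can be probed by hand with the same path, timeout, and status lists as the `checks.active` section above. The following is a minimal Python sketch, not APISIX's own checker: the function names `classify` and `probe` are hypothetical, and treating a status that appears in neither list as "ignored" is an assumption of this sketch.

```python
import urllib.error
import urllib.request

# Status lists copied from the upstream's checks.active section.
HEALTHY_STATUSES = {200, 302, 404}
UNHEALTHY_STATUSES = {429, 500, 501, 502, 503, 504, 505, 499}


def classify(status):
    """Map an HTTP status to 'healthy', 'unhealthy', or 'ignored'."""
    if status in HEALTHY_STATUSES:
        return "healthy"
    if status in UNHEALTHY_STATUSES:
        return "unhealthy"
    # Assumption: a status in neither list does not count either way.
    return "ignored"


def probe(host, port, path="/health", timeout=3.0):
    """Send one GET to the health path and classify the outcome.

    Note: urllib follows redirects, so a 302 is reported as the
    status of the final response rather than as 302 itself.
    """
    url = f"http://{host}:{port}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify(resp.status)
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses still carry a status code to classify.
        return classify(exc.code)
    except OSError:
        # Connection refused, timeout, or similar network-level error.
        return "tcp_failure"
```

Calling `probe("192.168.8.25", 8002)` while that process is stopped should return `"tcp_failure"`, which is the condition the `tcp_failures` threshold is meant to count.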
Route:
{
"uri": "/*",
"name": "rotuer-4-test-healthy-new",
"desc": "Temporary test of active health checks",
"methods": [
"GET",
"POST",
"PUT",
"DELETE",
"PATCH",
"HEAD",
"OPTIONS",
"CONNECT",
"TRACE"
],
"host": "healthcheck.com",
"upstream_id": "575513659149648574",
"enable_websocket": true,
"status": 1
}
Logs:
192.168.8.25 - - [15/Jul/2025:17:41:11 +0800] healthcheck.com "GET /ping HTTP/1.1" 200 18 0.004 "-" "curl/7.68.0" 192.168.8.25:8002 200 0.002 "http://healthcheck.com"
2025/07/15 17:41:17 [error] 2599#2599: *16525027 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.8.25, server: _, request: "GET /ping HTTP/1.1", upstream: "http://192.168.8.25:8002/ping", host: "healthcheck.com"
2025/07/15 17:41:17 [warn] 2599#2599: *16525027 [lua] healthcheck.lua:1383: log(): [healthcheck] (upstream#/apisix/upstreams/575513659149648574) unhealthy TCP increment (2/2) for '(192.168.8.25:8002)' while connecting to upstream, client: 192.168.8.25, server: _, request: "GET /ping HTTP/1.1", upstream: "http://192.168.8.25:8002/ping", host: "healthcheck.com"
192.168.8.25 - - [15/Jul/2025:17:41:17 +0800] healthcheck.com "GET /ping HTTP/1.1" 200 18 0.009 "-" "curl/7.68.0" 192.168.8.25:8002, 192.168.8.25:8001 502, 200 0.001, 0.004 "http://healthcheck.com"
2025/07/15 17:41:22 [error] 2327#2327: *16525984 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.8.25, server: _, request: "GET /ping HTTP/1.1", upstream: "http://192.168.8.25:8002/ping", host: "healthcheck.com"
2025/07/15 17:41:22 [warn] 2327#2327: *16525984 [lua] healthcheck.lua:1383: log(): [healthcheck] (upstream#/apisix/upstreams/575513659149648574) unhealthy TCP increment (3/2) for '(192.168.8.25:8002)' while connecting to upstream, client: 192.168.8.25, server: _, request: "GET /ping HTTP/1.1", upstream: "http://192.168.8.25:8002/ping", host: "healthcheck.com"
192.168.8.25 - - [15/Jul/2025:17:41:22 +0800] healthcheck.com "GET /ping HTTP/1.1" 200 18 0.006 "-" "curl/7.68.0" 192.168.8.25:8002, 192.168.8.25:8001 502, 200 0.001, 0.002 "http://healthcheck.com"
2025/07/15 17:41:27 [error] 2355#2355: *16527055 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.8.25, server: _, request: "GET /ping HTTP/1.1", upstream: "http://192.168.8.25:8002/ping", host: "healthcheck.com"
2025/07/15 17:41:27 [warn] 2355#2355: *16527055 [lua] healthcheck.lua:1383: log(): [healthcheck] (upstream#/apisix/upstreams/575513659149648574) unhealthy TCP increment (4/2) for '(192.168.8.25:8002)' while connecting to upstream, client: 192.168.8.25, server: _, request: "GET /ping HTTP/1.1", upstream: "http://192.168.8.25:8002/ping", host: "healthcheck.com"
192.168.8.25 - - [15/Jul/2025:17:41:27 +0800] healthcheck.com "GET /ping HTTP/1.1" 200 18 0.005 "-" "curl/7.68.0" 192.168.8.25:8002, 192.168.8.25:8001 502, 200 0.001, 0.002 "http://healthcheck.com"
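The retries are visible in the access log lines above: when a request is tried on more than one node, the upstream address, status, and response time fields each hold a comma-separated list. A small Python sketch for extracting the attempts from such a line follows. It assumes the log layout shown in this issue (the three upstream fields followed by a final quoted URL), and the helper name `upstream_attempts` is hypothetical.

```python
import re

# Tail of an access-log line: upstream addresses, statuses, response times,
# then the quoted upstream URL. Retried requests list several values,
# separated by ", ".
UPSTREAM_TAIL = re.compile(
    r'(?P<addrs>\S+(?:, \S+)*) '
    r'(?P<statuses>\d{3}(?:, \d{3})*) '
    r'(?P<times>[\d.]+(?:, [\d.]+)*) '
    r'"[^"]*"$'
)


def upstream_attempts(line):
    """Return [(address, status), ...] in the order the nodes were tried."""
    match = UPSTREAM_TAIL.search(line.strip())
    if match is None:
        # Not an access-log line in the assumed layout (e.g. an error-log line).
        return []
    addrs = match.group("addrs").split(", ")
    statuses = [int(s) for s in match.group("statuses").split(", ")]
    return list(zip(addrs, statuses))
```

Any line for which the result has more than one entry was first sent to a failing node, so counting those lines after the "unhealthy TCP increment" warning shows how long the stopped node kept receiving traffic.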
The health check status, however, reports both endpoints as healthy:
{"name":"/apisix/upstreams/575513659149648574","nodes":[{"counter":{"success":0,"tcp_failure":0,"timeout_failure":0,"http_failure":0},"port":8001,"status":"healthy","hostname":"192.168.8.25","ip":"192.168.8.25"},{"counter":{"success":0,"tcp_failure":0,"timeout_failure":0,"http_failure":0},"port":8002,"status":"healthy","hostname":"192.168.8.25","ip":"192.168.8.25"}],"type":"http"}
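To make the mismatch explicit, the status document above can be compared against the nodes that actually failed in the logs. This Python sketch only assumes the JSON shape shown in this issue; the helper names `node_statuses` and `contradictions` are hypothetical.

```python
import json


def node_statuses(payload):
    """Return {"ip:port": status} from one health check status document."""
    doc = json.loads(payload)
    return {f'{node["ip"]}:{node["port"]}': node["status"] for node in doc["nodes"]}


def contradictions(statuses, failed_addrs):
    """Addresses reported as healthy although requests to them failed."""
    return sorted(addr for addr in failed_addrs if statuses.get(addr) == "healthy")
```

With the output above and `192.168.8.25:8002` as the failed address, `contradictions` returns that address, which matches the behavior described: the node is refusing connections while its counters are all 0 and its status is "healthy".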
Environment
- APISIX version (run `apisix version`): v3.9.1
- Operating system (run `uname -a`): centos7.9
- OpenResty / Nginx version (run `openresty -V` or `nginx -V`): 1.25.3.1
- etcd version, if relevant (run `curl http://127.0.0.1:9090/v1/server_info`): 3.4.13
- APISIX Dashboard version, if relevant: 3.0.1