Weird 3.7.0 startup error logs, this time cloned my larger stage db into dev env #13250

Open · 1 task done
jeremyjpj0916 opened this issue Jun 20, 2024 · 33 comments

Comments

jeremyjpj0916 commented Jun 20, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Kong version ($ kong version)

3.7.0

Current Behavior

When Kong 3.7.0 starts up against a database migrated up and finished from 2.8.x, I see the following in the stdout logs. The [error] logs only occur on startup, though; afterward Kong passes all of my functional test suite and the admin API calls behave as expected. The exact same container image, nginx template, etc. pointed at our dev database (40 proxies, far fewer resources) starts up without any [error] logs, so it may be something about the size of the stage gateway's resources at startup (11,000 proxies) compared to dev.

Logs are here:

2024-06-20T18:09:37.180466333-05:00 nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /usr/local/kong/nginx.conf:10
2024-06-20T18:09:37.180899881-05:00 nginx: [warn] load balancing method redefined in /usr/local/kong/nginx.conf:129
2024-06-20T18:09:37.414676740-05:00 2024/06/20 23:09:37 [info] 30#0: [lua] node.lua:294: new(): kong node-id: 4143bf48-c58a-43d0-b040-b87b7c363805
2024-06-20T18:09:39.343743229-05:00 2024/06/20 23:09:37 [notice] 30#0: [lua] init.lua:788: init(): [request-debug] token for request debugging: 6032b48d-fb1d-4649-a9b7-edb9ff9af27a
2024-06-20T18:09:39.496560922-05:00 2024/06/20 23:09:37 [notice] 30#0: using the "epoll" event method
2024-06-20T18:09:39.496592341-05:00 2024/06/20 23:09:37 [notice] 30#0: openresty/1.25.3.1
2024-06-20T18:09:39.496596318-05:00 2024/06/20 23:09:37 [notice] 30#0: OS: Linux 4.18.0-553.5.1.el8_10.x86_64
2024-06-20T18:09:39.496599124-05:00 2024/06/20 23:09:37 [notice] 30#0: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024-06-20T18:09:39.496606898-05:00 2024/06/20 23:09:37 [notice] 30#0: start worker processes
2024-06-20T18:09:39.499569575-05:00 2024/06/20 23:09:37 [notice] 30#0: start worker process 48
2024-06-20T18:09:39.502218698-05:00 2024/06/20 23:09:37 [notice] 30#0: start worker process 49
2024-06-20T18:09:39.505492546-05:00 2024/06/20 23:09:37 [notice] 30#0: start worker process 50
2024-06-20T18:09:39.571858055-05:00 2024/06/20 23:09:39 [info] 50#0: *3 [kong] handler.lua:90 [acme] acme renew timer started on worker 2, context: init_worker_by_lua*
2024-06-20T18:09:39.573299354-05:00 2024/06/20 23:09:39 [info] 49#0: *2 [kong] handler.lua:90 [acme] acme renew timer started on worker 1, context: init_worker_by_lua*
2024-06-20T18:09:39.573873264-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'services' into the core_cache..., context: init_worker_by_lua*
2024-06-20T18:09:39.894811591-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'services' into the core_cache (in 321ms), context: init_worker_by_lua*
2024-06-20T18:09:39.894864470-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'consumers' into the cache..., context: init_worker_by_lua*
2024-06-20T18:09:39.952421494-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'consumers' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T18:09:39.952455698-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'acls' into the cache..., context: init_worker_by_lua*
2024-06-20T18:09:44.758159816-05:00 2024/06/20 23:09:44 [error] 50#0: *5 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:09:44.767774995-05:00 2024/06/20 23:09:44 [error] 49#0: *657 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:09:49.780812195-05:00 2024/06/20 23:09:49 [error] 50#0: *1310 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:09:49.812221175-05:00 2024/06/20 23:09:49 [error] 49#0: *1312 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:09:54.818557944-05:00 2024/06/20 23:09:54 [error] 50#0: *1314 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:09:54.831083443-05:00 2024/06/20 23:09:54 [error] 49#0: *1316 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:09:59.843025670-05:00 2024/06/20 23:09:59 [error] 50#0: *1318 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:09:59.862337952-05:00 2024/06/20 23:09:59 [error] 49#0: *1320 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:10:04.884240000-05:00 2024/06/20 23:10:04 [error] 50#0: *1322 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:10:04.905084109-05:00 2024/06/20 23:10:04 [error] 49#0: *1324 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:10:07.862004121-05:00 2024/06/20 23:10:07 [info] 50#0: *1332 client sent HTTP/1.1 request without "Host" header while reading client request headers, client: xxx.xxx.xxx.xxx, server: kong_mtls, request: "GET /F5/status HTTP/1.1"
2024-06-20T18:10:08.089717306-05:00 2024/06/20 23:10:08 [info] 49#0: *1336 client sent HTTP/1.1 request without "Host" header while reading client request headers, client: xxx.xxx.xxx.xxx, server: kong_mtls, request: "GET /F5/status HTTP/1.1"
2024-06-20T18:10:09.916739532-05:00 2024/06/20 23:10:09 [error] 50#0: *1326 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:10:09.927549290-05:00 2024/06/20 23:10:09 [error] 49#0: *1328 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:10:10.598807770-05:00 2024/06/20 23:10:10 [info] 49#0: *1386 client sent HTTP/1.1 request without "Host" header while reading client request headers, client: xxx.xxx.xxx.xxx, server: kong_mtls, request: "GET /F5/status HTTP/1.1"
2024-06-20T18:10:10.651244111-05:00 2024/06/20 23:10:10 [info] 49#0: *1388 client sent HTTP/1.1 request without "Host" header while reading client request headers, client: xxx.xxx.xxx.xxx, server: kong_mtls, request: "GET /F5/status HTTP/1.1"
2024-06-20T18:10:13.064882469-05:00 2024/06/20 23:10:13 [info] 50#0: *1409 client sent HTTP/1.1 request without "Host" header while reading client request headers, client: xxx.xxx.xxx.xxx, server: kong_mtls, request: "GET /F5/status HTTP/1.1"
2024-06-20T18:10:14.936674882-05:00 2024/06/20 23:10:14 [error] 50#0: *1382 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:10:14.944260754-05:00 2024/06/20 23:10:14 [error] 49#0: *1384 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T18:10:15.765563433-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'acls' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T18:10:15.765593159-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'certificates' into the core_cache..., context: init_worker_by_lua*
2024-06-20T18:10:15.784923804-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'certificates' into the core_cache (in 0ms), context: init_worker_by_lua*
2024-06-20T18:10:15.784954942-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'jwt_secrets' into the cache..., context: init_worker_by_lua*
2024-06-20T18:10:15.860489958-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'jwt_secrets' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T18:10:15.860512330-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'oauth2_credentials' into the cache..., context: init_worker_by_lua*
2024-06-20T18:10:15.943188159-05:00 2024/06/20 23:09:39 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'oauth2_credentials' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T18:10:15.944383148-05:00 2024/06/20 23:10:15 [info] 48#0: *1 [kong] handler.lua:90 [acme] acme renew timer started on worker 0, context: init_worker_by_lua*
2024-06-20T18:10:16.406639963-05:00 2024/06/20 23:10:16 [notice] 48#0: *1457 [lua] warmup.lua:47: warming up DNS entries ..., context: ngx.timer
2024-06-20T18:10:16.412809584-05:00 2024/06/20 23:10:16 [info] 48#0: *1432 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412835042-05:00 2024/06/20 23:10:16 [error] 48#0: *1432 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412839089-05:00 2024/06/20 23:10:16 [info] 48#0: *1433 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412849869-05:00 2024/06/20 23:10:16 [error] 48#0: *1433 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412882029-05:00 2024/06/20 23:10:16 [info] 48#0: *1434 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412885466-05:00 2024/06/20 23:10:16 [error] 48#0: *1434 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412888161-05:00 2024/06/20 23:10:16 [info] 48#0: *1435 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412890575-05:00 2024/06/20 23:10:16 [error] 48#0: *1435 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412893571-05:00 2024/06/20 23:10:16 [info] 48#0: *1436 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412896336-05:00 2024/06/20 23:10:16 [error] 48#0: *1436 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412948283-05:00 2024/06/20 23:10:16 [info] 48#0: *1437 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412966918-05:00 2024/06/20 23:10:16 [error] 48#0: *1437 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412972137-05:00 2024/06/20 23:10:16 [info] 48#0: *1438 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.412976375-05:00 2024/06/20 23:10:16 [error] 48#0: *1438 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.413090568-05:00 2024/06/20 23:10:16 [info] 48#0: *1439 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.413101118-05:00 2024/06/20 23:10:16 [error] 48#0: *1439 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.413104194-05:00 2024/06/20 23:10:16 [info] 48#0: *1440 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.413106819-05:00 2024/06/20 23:10:16 [error] 48#0: *1440 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.413109383-05:00 2024/06/20 23:10:16 [info] 48#0: *1441 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.413116877-05:00 2024/06/20 23:10:16 [error] 48#0: *1441 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.413124692-05:00 2024/06/20 23:10:16 [info] 48#0: *1442 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.413127597-05:00 2024/06/20 23:10:16 [error] 48#0: *1442 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.413130092-05:00 2024/06/20 23:10:16 [info] 48#0: *1443 writev() failed (32: Broken pipe), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T18:10:16.413132547-05:00 2024/06/20 23:10:16 [error] 48#0: *1443 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
... Regular kong logs (no more [error] from here on out) ...

Note: I was going to grab debug logs for y'all, but there was far too much output for my stdout view window to capture. I was at least able to capture it at INFO level.

Edit: I also rolled my dev env back to the smaller dev database, and all of those error logs go away. Startup output with the smaller dev db looks like this (no other environment variables or nginx template settings were changed between the cutover):

2024-06-20T18:37:01.284673925-05:00 nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /usr/local/kong/nginx.conf:10
2024-06-20T18:37:01.285394429-05:00 nginx: [warn] load balancing method redefined in /usr/local/kong/nginx.conf:129
2024-06-20T18:37:02.148240585-05:00 2024/06/20 23:37:01 [notice] 30#0: [lua] init.lua:788: init(): [request-debug] token for request debugging: 75fd8688-6247-4fb7-a6bf-ec4b91ded011
2024-06-20T18:37:02.187505725-05:00 2024/06/20 23:37:01 [notice] 30#0: using the "epoll" event method
2024-06-20T18:37:02.187533207-05:00 2024/06/20 23:37:01 [notice] 30#0: openresty/1.25.3.1
2024-06-20T18:37:02.187536944-05:00 2024/06/20 23:37:01 [notice] 30#0: OS: Linux 4.18.0-513.24.1.el8_9.x86_64
2024-06-20T18:37:02.187539319-05:00 2024/06/20 23:37:01 [notice] 30#0: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024-06-20T18:37:02.187618318-05:00 2024/06/20 23:37:01 [notice] 30#0: start worker processes
2024-06-20T18:37:02.189823833-05:00 2024/06/20 23:37:01 [notice] 30#0: start worker process 47
2024-06-20T18:37:02.190784422-05:00 2024/06/20 23:37:01 [notice] 30#0: start worker process 48
2024-06-20T18:37:02.191560470-05:00 2024/06/20 23:37:01 [notice] 30#0: start worker process 49
2024-06-20T18:37:02.252542016-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'services' into the core_cache..., context: init_worker_by_lua*
2024-06-20T18:37:02.306291463-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'services' into the core_cache (in 53ms), context: init_worker_by_lua*
2024-06-20T18:37:02.306330517-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'consumers' into the cache..., context: init_worker_by_lua*
2024-06-20T18:37:02.330019856-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'consumers' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T18:37:02.330141707-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'acls' into the cache..., context: init_worker_by_lua*
2024-06-20T18:37:02.586964247-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'acls' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T18:37:02.587040782-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'certificates' into the core_cache..., context: init_worker_by_lua*
2024-06-20T18:37:02.601523455-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'certificates' into the core_cache (in 0ms), context: init_worker_by_lua*
2024-06-20T18:37:02.601580053-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'jwt_secrets' into the cache..., context: init_worker_by_lua*
2024-06-20T18:37:02.617021461-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'jwt_secrets' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T18:37:02.617079591-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'oauth2_credentials' into the cache..., context: init_worker_by_lua*
2024-06-20T18:37:02.633836008-05:00 2024/06/20 23:37:02 [notice] 47#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'oauth2_credentials' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T18:37:02.669884252-05:00 2024/06/20 23:37:02 [notice] 47#0: *1315 [lua] warmup.lua:47: warming up DNS entries ..., context: ngx.timer
2024-06-20T18:37:35.103123251-05:00 2024/06/20 23:37:35 [notice] 47#0: *1315 [lua] warmup.lua:81: finished warming up DNS entries' into the cache (in 32433ms), context: ngx.timer
... Normal logs from here on out...

Expected Behavior

No error logs on startup.

Steps To Reproduce

No response

Anything else?

No response

@jeremyjpj0916 jeremyjpj0916 changed the title More weird 3.7.0 logs, this time cloned my larger stage db into dev and watched startup logs Weird 3.7.0 startup error logs, this time cloned my larger stage db into dev env Jun 20, 2024

jeremyjpj0916 commented Jun 21, 2024

Maybe this plays a role?

        - name: KONG_LUA_SOCKET_POOL_SIZE
          value: "30"

I noticed the Kong default is 256. Not sure why I had it set so low, but I just bumped it to 256 to match the default.

@StarlightIbuki (Contributor)

Looks like that solves your issue. The log says the DP fails to connect to the CP, and what you found could be the cause.

@StarlightIbuki (Contributor)

Please feel free to reopen the ticket if you have further concerns.


jeremyjpj0916 commented Jun 21, 2024

Not confirmed yet, but I will let y'all know. Testing out that change now.

I run traditional mode here so no DP/CP setup at all.

@jeremyjpj0916 (Contributor, Author)

@StarlightIbuki Nope, it's still an issue even after bumping that to 256:

2024-06-20T23:44:08.208226437-05:00 nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /usr/local/kong/nginx.conf:10
2024-06-20T23:44:08.208894511-05:00 nginx: [warn] load balancing method redefined in /usr/local/kong/nginx.conf:129
2024-06-20T23:44:10.571824051-05:00 2024/06/21 04:44:08 [notice] 30#0: [lua] init.lua:788: init(): [request-debug] token for request debugging: bf683cd8-4e7e-4f85-bea2-ccc53306be42
2024-06-20T23:44:10.719580663-05:00 2024/06/21 04:44:08 [notice] 30#0: using the "epoll" event method
2024-06-20T23:44:10.719615449-05:00 2024/06/21 04:44:08 [notice] 30#0: openresty/1.25.3.1
2024-06-20T23:44:10.719618512-05:00 2024/06/21 04:44:08 [notice] 30#0: OS: Linux 4.18.0-513.18.1.el8_9.x86_64
2024-06-20T23:44:10.719620827-05:00 2024/06/21 04:44:08 [notice] 30#0: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024-06-20T23:44:10.719649294-05:00 2024/06/21 04:44:08 [notice] 30#0: start worker processes
2024-06-20T23:44:10.724480258-05:00 2024/06/21 04:44:08 [notice] 30#0: start worker process 48
2024-06-20T23:44:10.726169587-05:00 2024/06/21 04:44:08 [notice] 30#0: start worker process 49
2024-06-20T23:44:10.727854741-05:00 2024/06/21 04:44:08 [notice] 30#0: start worker process 50
2024-06-20T23:44:10.729952978-05:00 2024/06/21 04:44:08 [notice] 30#0: start worker process 51
2024-06-20T23:44:10.814475714-05:00 2024/06/21 04:44:10 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'services' into the core_cache..., context: init_worker_by_lua*
2024-06-20T23:44:11.120375447-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'services' into the core_cache (in 305ms), context: init_worker_by_lua*
2024-06-20T23:44:11.120393391-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'consumers' into the cache..., context: init_worker_by_lua*
2024-06-20T23:44:11.178344739-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'consumers' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T23:44:11.178359271-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'acls' into the cache..., context: init_worker_by_lua*
2024-06-20T23:44:15.980237439-05:00 2024/06/21 04:44:15 [error] 49#0: *6 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:15.982404855-05:00 2024/06/21 04:44:15 [error] 51#0: *10 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:16.008796729-05:00 2024/06/21 04:44:16 [error] 50#0: *1301 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:21.014687655-05:00 2024/06/21 04:44:21 [error] 51#0: *1964 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:21.029456935-05:00 2024/06/21 04:44:21 [error] 49#0: *1966 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:21.043850043-05:00 2024/06/21 04:44:21 [error] 50#0: *1968 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:26.050057206-05:00 2024/06/21 04:44:26 [error] 51#0: *1970 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:26.059817019-05:00 2024/06/21 04:44:26 [error] 50#0: *1972 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:26.066641984-05:00 2024/06/21 04:44:26 [error] 49#0: *1974 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:31.060790842-05:00 2024/06/21 04:44:31 [error] 51#0: *1976 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:31.082047668-05:00 2024/06/21 04:44:31 [error] 49#0: *1978 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:31.105887063-05:00 2024/06/21 04:44:31 [error] 50#0: *1980 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:36.100812513-05:00 2024/06/21 04:44:36 [error] 49#0: *1982 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:36.108819551-05:00 2024/06/21 04:44:36 [error] 51#0: *1984 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:36.141029361-05:00 2024/06/21 04:44:36 [error] 50#0: *1986 [lua] worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-20T23:44:38.223902407-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'acls' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T23:44:38.223919390-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'certificates' into the core_cache..., context: init_worker_by_lua*
2024-06-20T23:44:38.247494398-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'certificates' into the core_cache (in 0ms), context: init_worker_by_lua*
2024-06-20T23:44:38.247507871-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'jwt_secrets' into the cache..., context: init_worker_by_lua*
2024-06-20T23:44:38.322106912-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'jwt_secrets' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T23:44:38.322147328-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'oauth2_credentials' into the cache..., context: init_worker_by_lua*
2024-06-20T23:44:38.407124211-05:00 2024/06/21 04:44:11 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'oauth2_credentials' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-20T23:44:38.786897149-05:00 2024/06/21 04:44:38 [notice] 48#0: *2015 [lua] warmup.lua:47: warming up DNS entries ..., context: ngx.timer
2024-06-20T23:44:38.795954003-05:00 2024/06/21 04:44:38 [error] 48#0: *1994 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.795965848-05:00 2024/06/21 04:44:38 [error] 48#0: *1995 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.795971573-05:00 2024/06/21 04:44:38 [error] 48#0: *1996 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.795979052-05:00 2024/06/21 04:44:38 [error] 48#0: *1997 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.795987110-05:00 2024/06/21 04:44:38 [error] 48#0: *1998 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.796018705-05:00 2024/06/21 04:44:38 [error] 48#0: *1999 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.796038641-05:00 2024/06/21 04:44:38 [error] 48#0: *2000 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.796065213-05:00 2024/06/21 04:44:38 [error] 48#0: *2001 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.796087498-05:00 2024/06/21 04:44:38 [error] 48#0: *2002 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.796107384-05:00 2024/06/21 04:44:38 [error] 48#0: *2003 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.796131415-05:00 2024/06/21 04:44:38 [error] 48#0: *2004 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.796193402-05:00 2024/06/21 04:44:38 [error] 48#0: *2005 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.796240682-05:00 2024/06/21 04:44:38 [error] 48#0: *2006 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.796252091-05:00 2024/06/21 04:44:38 [error] 48#0: *2007 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-20T23:44:38.796287201-05:00 2024/06/21 04:44:38 [error] 48#0: *2008 [lua] broker.lua:111: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"

Happy to show any Kong engineers the issue on a call if they're curious. It must be somewhat related to the new events server block y'all added with the kong_worker_events stuff.

@jeremyjpj0916 (Contributor, Author)

Has to be some kind of resource-scale or warmup problem in my env, paired with the newer Kong version, causing the shaky startup. Do y'all have any OSS or enterprise customers running 3.x with 11,000+ proxies? I'm thinking they may see the same thing.


jeremyjpj0916 commented Jun 21, 2024

@StarlightIbuki can you re-open this, rather than me making a brand new issue that just clones the details I have given here?

Edit: Remade it here since this was not reopened: #13274

@jeremyjpj0916 (Contributor, Author)

Another idea for what might cause Kong to do this: could an entity PUT against the admin API while Kong is still starting up be affecting that output? I have a healthcheck that constantly PUTs a dummy, unused upstreams resource to verify correct db write access. It's the only resource in that table, too; we don't use upstreams regularly. A rough sketch of that healthcheck call is below.
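
For reference, a minimal sketch of what that healthcheck does; the admin API address and the upstream name here are placeholders rather than our real values, and it assumes the Python requests package purely to illustrate the call:

import requests  # assumed available; any HTTP client would work the same way

ADMIN_API = "http://localhost:8001"            # placeholder Kong admin API address
DUMMY_UPSTREAM = "healthcheck-dummy-upstream"  # placeholder name, not attached to any service

def check_db_write_access() -> bool:
    # PUT /upstreams/{name} is an upsert, so repeating it is harmless;
    # a 200/201 response means Kong was able to write the record to the database.
    resp = requests.put(
        f"{ADMIN_API}/upstreams/{DUMMY_UPSTREAM}",
        json={"name": DUMMY_UPSTREAM},
        timeout=5,
    )
    return resp.status_code in (200, 201)

if __name__ == "__main__":
    print("db write ok" if check_db_write_access() else "db write failed")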

@StarlightIbuki (Contributor)

"I run traditional mode here so no DP/CP setup at all."

That's strange. The log states that communicate() is being called, which should only happen in DPs.
Could you share the configuration (kong.conf, environment variables) that you are using?


jeremyjpj0916 commented Jun 24, 2024

@StarlightIbuki

I don't override the default conf file at all, but we do use ENV variables to dictate our runtime, as well as a custom NGINX template.

Current ENV variables in my K8S Deployment (XXXX's used to redact values):

      - env:
        - name: KONG_NGINX_MAIN_ENV
          value: "SPLUNK_HOST; env KONG_SSL_CERT_KEY; env KONG_SSL_CERT_DER"
        - name: KONG_ADMIN_GUI_LISTEN
          value: "0.0.0.0:8002, 0.0.0.0:8445"
        - name: KONG_NGINX_HTTP_KEEPALIVE_REQUESTS
          value: "50000"
        - name: KONG_NGINX_HTTP_KEEPALIVE_TIMEOUT
          value: "120s"
        - name: KONG_NGINX_HTTP_CLIENT_MAX_BODY_SIZE
          value: "50m"
        - name: KONG_NGINX_HTTP_CLIENT_BODY_BUFFER_SIZE
          value: "50m"          
        - name: KONG_NGINX_HTTP_LARGE_CLIENT_HEADER_BUFFERS
          value: "2 20k" 
        - name: KONG_NGINX_HTTP_TCP_NODELAY
          value: "on" 
        - name: KONG_NGINX_HTTP_TCP_NOPUSH
          value: "on"
        - name: KONG_NGINX_HTTP_PROXY_BUFFER_SIZE
          value: "32k"           
        - name: KONG_NGINX_HTTP_PROXY_BUFFERS
          value: "10 2m"
        - name: KONG_NGINX_HTTP_PROXY_BUSY_BUFFERS_SIZE
          value: "4m"
        - name: KONG_NGINX_HTTP_RESET_TIMEDOUT_CONNECTION
          value: "on"
        - name: KONG_NGINX_HTTP_HTTP2_RECV_BUFFER_SIZE
          value: "20m"
        - name: KONG_NGINX_HTTP_HTTP2_BODY_PREREAD_SIZE
          value: "2m"  
        - name: KONG_NGINX_UPSTREAM_KEEPALIVE
          value: "90"
        - name: KONG_NGINX_UPSTREAM_KEEPALIVE_REQUESTS
          value: "50000"
        - name: KONG_NGINX_UPSTREAM_KEEPALIVE_TIMEOUT
          value: "120"
        - name: KONG_UPSTREAM_KEEPALIVE_POOL_SIZE
          value: "1024"
        - name: KONG_UPSTREAM_KEEPALIVE_MAX_REQUESTS
          value: "50000"
        - name: KONG_UPSTREAM_KEEPALIVE_IDLE_TIMEOUT
          value: "120" 
        - name: KONG_UNTRUSTED_LUA
          value: "on"
        - name: CRL_DOWNLOAD_URL
          value: XXXXXXXXX
        - name: KONG_NGINX_MAIN_WORKER_PROCESSES
          value: "3"
        - name: KONG_LOG_LEVEL
          value: notice
        - name: KONG_PROXY_ACCESS_LOG
          value: "off"
        - name: KONG_ADMIN_ACCESS_LOG
          value: "off"
        - name: KONG_PROXY_ERROR_LOG
          value: /dev/stderr
        - name: KONG_ADMIN_ERROR_LOG
          value: /dev/stderr
        - name: KONG_ANONYMOUS_REPORTS
          value: "off"
        - name: KONG_PROXY_LISTEN
          value: 0.0.0.0:8000, 0.0.0.0:8443 ssl http2 deferred reuseport backlog=16384
        - name: KONG_ADMIN_LISTEN
          value: 0.0.0.0:8001 deferred reuseport backlog=16384
        - name: KONG_MEM_CACHE_SIZE
          value: 1024m
        - name: KONG_SSL_CERT
          value: /usr/local/kong/ssl/kongcert.crt
        - name: KONG_SSL_CERT_DER
          value: /usr/local/kong/ssl/kongcertder.der
        - name: KONG_SSL_CERT_KEY
          value: /usr/local/kong/ssl/kongprivatekey.key
        - name: KONG_CLIENT_SSL
          value: "off"
        - name: KONG_ADMIN_SSL_ENABLED
          value: "off"
        - name: KONG_HEADERS
          value: latency_tokens
        - name: KONG_CLIENT_MAX_BODY_SIZE
          value: 50m
        - name: KONG_CLIENT_BODY_BUFFER_SIZE
          value: 50m
        - name: KONG_ERROR_DEFAULT_TYPE
          value: text/plain
        - name: KONG_DATABASE
          value: postgres
        - name: KONG_PG_MAX_CONCURRENT_QUERIES
          value: "33"
        - name: KONG_PG_SSL
          value: "on"
        - name: KONG_PG_SSL_REQUIRED
          value: "on"
        - name: KONG_PG_SSL_VERIFY
          value: "on"
        - name: KONG_PG_TLS_VERSION
          value: "tlsv1_2"
        - name: KONG_PG_HOST
          value: XXXXXXXX
        - name: KONG_PG_PORT
          value: "5432"
        - name: KONG_PG_USER
          valueFrom:
            secretKeyRef:
              key: username
              name: postgres-secret
        - name: KONG_PG_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: postgres-secret
        - name: KONG_PG_TIMEOUT
          value: "8000"
        - name: KONG_PG_DATABASE
          value: XXXXXX
        - name: KONG_PG_RO_SSL
          value: "on"
        - name: KONG_PG_RO_SSL_REQUIRED
          value: "on"
        - name: KONG_PG_RO_SSL_VERIFY
          value: "on"
        - name: KONG_PG_RO_TLS_VERSION
          value: "tlsv1_2"
        - name: KONG_PG_RO_HOST
          value: XXXXXXXX
        - name: KONG_PG_RO_PORT
          value: "5432"
        - name: KONG_PG_RO_USER
          valueFrom:
            secretKeyRef:
              key: username
              name: postgres-secret
        - name: KONG_PG_RO_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: postgres-secret
        - name: KONG_PG_RO_TIMEOUT
          value: "8000"
        - name: KONG_PG_RO_DATABASE
          value: XXXXX
        - name: KONG_DB_UPDATE_FREQUENCY
          value: "5"
        - name: KONG_DB_UPDATE_PROPAGATION
          value: "0"
        - name: KONG_DB_CACHE_TTL
          value: "0"
        - name: KONG_DNS_HOSTSFILE
          value: /etc/hosts
        - name: KONG_DNS_STALE_TTL
          value: "4"
        - name: KONG_DNS_NOT_FOUND_TTL
          value: "10"
        - name: KONG_DNS_RESOLVER
          value: XXXXXX,XXXXXXXX
        - name: KONG_DNS_ERROR_TTL
          value: "1"
        - name: KONG_DNS_NO_SYNC
          value: "off"
        - name: KONG_LUA_SSL_TRUSTED_CERTIFICATE
          value: /usr/local/kong/ssl/kongcert.pem
        - name: KONG_LUA_SSL_VERIFY_DEPTH
          value: "3"
        - name: KONG_LUA_SOCKET_POOL_SIZE
          value: "256"
        - name: SPLUNK_HOST
          value: XXXXXXXXXXXXXXX
        - name: LOCALDOMAIN
          value: .
        - name: RES_OPTIONS
          value: ndots:1
        - name: KONG_DNS_ORDER
          value: LAST,A,SRV,CNAME
        - name: KONG_PLUGINS
          value: bundled,nonamesecurity,kong-plugin-oauth-proxy,kong-kafka-log,stargate-waf-error-log,mtls,kong-tx-debugger,kong-error-log,kong-oidc-implicit-token,kong-response-size-limiting,kong-service-virtualization,kong-cluster-drain,kong-upstream-jwt,kong-splunk-log,kong-spec-expose,kong-path-based-routing,kong-oidc-multi-idp
        - name: KONG_TRUSTED_IPS
          value: 0.0.0.0/0,::/0
        - name: KONG_NGINX_PROXY_REAL_IP_HEADER
          value: X-Forwarded-For
        - name: KONG_NGINX_PROXY_REAL_IP_RECURSIVE
          value: "on"
        - name: KONG_WORKER_CONSISTENCY
          value: eventual
        - name: KONG_WORKER_STATE_UPDATE_FREQUENCY
          value: "5"
        - name: KONG_DB_CACHE_WARMUP_ENTITIES
          value: services,consumers,acls,certificates,jwt_secrets,oauth2_credentials
        - name: LUA_PATH
          value: /usr/local/kong/luarocks/share/lua/5.1/?.lua;;/usr/local/kong/luarocks/share/lua/5.1/?/init.lua;
        - name: KONG_NGINX_HTTP_SSL_PROTOCOLS
          value: TLSv1.2 TLSv1.3
        - name: KONG_SSL_CIPHER_SUITE
          value: intermediate

@jeremyjpj0916 (Contributor, Author)

@StarlightIbuki And the current nginx template file:

pid pids/nginx.pid;

> if wasm and wasm_dynamic_module then
load_module $(wasm_dynamic_module);
> end

error_log ${{PROXY_ERROR_LOG}} ${{LOG_LEVEL}};

> if database == "off" then
lmdb_environment_path ${{LMDB_ENVIRONMENT_PATH}};
lmdb_map_size         ${{LMDB_MAP_SIZE}};

> if lmdb_validation_tag then
lmdb_validation_tag   $(lmdb_validation_tag);
> end

> end

# injected nginx_main_* directives
> for _, el in ipairs(nginx_main_directives) do
$(el.name) $(el.value);
> end

events {
    # injected nginx_events_* directives
> for _, el in ipairs(nginx_events_directives) do
    $(el.name) $(el.value);
> end
}

> if wasm then
wasm {
> for _, el in ipairs(nginx_wasm_main_shm_kv_directives) do
  shm_kv $(el.name) $(el.value);
> end

> for _, module in ipairs(wasm_modules_parsed) do
  module $(module.name) $(module.path);
> end

> for _, el in ipairs(nginx_wasm_main_directives) do
> if el.name == "shm_kv" then
  shm_kv * $(el.value);
> else
  $(el.name) $(el.value);
> end
> end

> if #nginx_wasm_wasmtime_directives > 0 then
  wasmtime {
> for _, el in ipairs(nginx_wasm_wasmtime_directives) do
    flag $(el.name) $(el.value);
> end
  }
> end -- wasmtime

> if #nginx_wasm_v8_directives > 0 then
  v8 {
> for _, el in ipairs(nginx_wasm_v8_directives) do
    flag $(el.name) $(el.value);
> end
  }
> end -- v8

> if #nginx_wasm_wasmer_directives > 0 then
  wasmer {
> for _, el in ipairs(nginx_wasm_wasmer_directives) do
    flag $(el.name) $(el.value);
> end
  }
> end -- wasmer

}
> end


http {
  server_tokens off;

  error_log ${{PROXY_ERROR_LOG}} ${{LOG_LEVEL}};

uninitialized_variable_warn  off;

# enable capturing error logs
lua_capture_error_log 100k;

# exclusive oauth proxy shm caches for consumers and their cached token
lua_shared_dict kong_oauth_cache2       5m;
lua_shared_dict kong_oauth_cache_miss2  2m;
lua_shared_dict kong_oauth_cache_locks2 1m;

# exclusive oauth proxy token shm caches
lua_shared_dict kong_oauth_token_cache2       5m;
lua_shared_dict kong_oauth_token_cache_miss2  2m;
lua_shared_dict kong_oauth_token_cache_locks2 1m;

# exclusive oidc shm caches
lua_shared_dict kong_oidc_cache      10m;
lua_shared_dict kong_oidc_cache_miss  6m;
lua_shared_dict kong_oidc_cache_locks 3m;

lua_package_path       '${{LUA_PACKAGE_PATH}};;';
lua_package_cpath      '${{LUA_PACKAGE_CPATH}};;';
lua_socket_pool_size   ${{LUA_SOCKET_POOL_SIZE}};
lua_socket_log_errors  off;
lua_max_running_timers 4096;
lua_max_pending_timers 16384;

lua_ssl_verify_depth   ${{LUA_SSL_VERIFY_DEPTH}};
> if lua_ssl_trusted_certificate_combined then
lua_ssl_trusted_certificate '${{LUA_SSL_TRUSTED_CERTIFICATE_COMBINED}}';
> end
lua_ssl_protocols ${{NGINX_HTTP_LUA_SSL_PROTOCOLS}};


lua_shared_dict kong                        5m;
lua_shared_dict kong_locks                  8m;
lua_shared_dict kong_healthchecks           5m;
lua_shared_dict kong_cluster_events         5m;
lua_shared_dict kong_rate_limiting_counters 12m;
lua_shared_dict kong_core_db_cache          ${{MEM_CACHE_SIZE}};
lua_shared_dict kong_core_db_cache_miss     12m;
lua_shared_dict kong_db_cache               ${{MEM_CACHE_SIZE}};
lua_shared_dict kong_db_cache_miss          12m;
lua_shared_dict kong_secrets                5m;
underscores_in_headers on;

> if ssl_cipher_suite == 'old' then
lua_ssl_conf_command CipherString DEFAULT:@SECLEVEL=0;
proxy_ssl_conf_command CipherString DEFAULT:@SECLEVEL=0;
ssl_conf_command CipherString DEFAULT:@SECLEVEL=0;
grpc_ssl_conf_command CipherString DEFAULT:@SECLEVEL=0;
> end

> if ssl_ciphers then
ssl_ciphers ${{SSL_CIPHERS}};
> end

# injected nginx_http_* directives
> for _, el in ipairs(nginx_http_directives) do
$(el.name) $(el.value);
> end

init_by_lua_block {
> if test and coverage then
    require 'luacov'
    jit.off()
> end -- test and coverage
    Kong = require 'kong'
    Kong.init()
}

init_worker_by_lua_block {
    Kong.init_worker()
}

exit_worker_by_lua_block {
    Kong.exit_worker()
}

> if (role == "traditional" or role == "data_plane") and #proxy_listeners > 0 then
log_format kong_log_format '$remote_addr - $remote_user [$time_local] '
                           '"$request" $status $body_bytes_sent '
                           '"$http_referer" "$http_user_agent" '
                           'kong_request_id: "$kong_request_id"';

# Load variable indexes
lua_kong_load_var_index default;

upstream kong_upstream {
    server 0.0.0.1;

    # injected nginx_upstream_* directives
> for _, el in ipairs(nginx_upstream_directives) do
    $(el.name) $(el.value);
> end

    balancer_by_lua_block {
        Kong.balancer()
    }
}

server {
    server_name kong;
> for _, entry in ipairs(proxy_listeners) do
    listen $(entry.listener);
> end

> for _, entry in ipairs(proxy_listeners) do
> if entry.http2 then
    http2 on;
> break
> end
> end

    error_page 400 404 405 408 411 412 413 414 417 /kong_error_handler;
    error_page 494 =494                            /kong_error_handler;
    error_page 500 502 503 504                     /kong_error_handler;

    # Append the kong request id to the error log
    # https://github.com/Kong/lua-kong-nginx-module#lua_kong_error_log_request_id
    lua_kong_error_log_request_id $kong_request_id;

> if proxy_access_log_enabled then
>   if custom_proxy_access_log then
    access_log ${{PROXY_ACCESS_LOG}};
>   else
    access_log ${{PROXY_ACCESS_LOG}} kong_log_format;
>   end
> else
    access_log off;
> end
    
    error_log  ${{PROXY_ERROR_LOG}} ${{LOG_LEVEL}};

> if proxy_ssl_enabled then
> for i = 1, #ssl_cert do
    ssl_certificate     $(ssl_cert[i]);
    ssl_certificate_key $(ssl_cert_key[i]);
> end
    ssl_session_cache   shared:SSL:${{SSL_SESSION_CACHE_SIZE}};
    ssl_certificate_by_lua_block {
        Kong.ssl_certificate()
    }
> end

    # injected nginx_proxy_* directives
> for _, el in ipairs(nginx_proxy_directives) do
    $(el.name) $(el.value);
> end
> for _, ip in ipairs(trusted_ips) do
    set_real_ip_from $(ip);
> end

    rewrite_by_lua_block {
        Kong.rewrite()
    }

    access_by_lua_block {
        Kong.access()
    }

    header_filter_by_lua_block {
        Kong.header_filter()
        ngx.header["Server"] = nil
        ngx.header["Via"] = nil
    }

    body_filter_by_lua_block {
        Kong.body_filter()
    }

    log_by_lua_block {
        Kong.log()
    }

    location / {
        default_type                     '';

        set $ctx_ref                     '';
        set $upstream_te                 '';
        set $upstream_host               '';
        set $upstream_upgrade            '';
        set $upstream_connection         '';
        set $upstream_scheme             '';
        set $upstream_uri                '';
        set $upstream_x_forwarded_for    '';
        set $upstream_x_forwarded_proto  '';
        set $upstream_x_forwarded_host   '';
        set $upstream_x_forwarded_port   '';
        set $upstream_x_forwarded_path   '';
        set $upstream_x_forwarded_prefix '';
        set $kong_proxy_mode             'http';

        proxy_http_version      1.1;
        proxy_buffering          on;
        proxy_request_buffering  on;

        # injected nginx_location_* directives
> for _, el in ipairs(nginx_location_directives) do
        $(el.name) $(el.value);
> end

        proxy_set_header      TE                 $upstream_te;
        proxy_set_header      Host               $upstream_host;
        proxy_set_header      Upgrade            $upstream_upgrade;
        proxy_set_header      Connection         $upstream_connection;
        proxy_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        proxy_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        proxy_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        proxy_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        proxy_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        proxy_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        proxy_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        proxy_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        proxy_pass_header     Server;
        proxy_pass_header     Date;
        proxy_ssl_name        $upstream_host;
        proxy_ssl_server_name on;
> if client_ssl then
        proxy_ssl_certificate ${{CLIENT_SSL_CERT}};
        proxy_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        proxy_pass            $upstream_scheme://kong_upstream$upstream_uri;
    }

    location @unbuffered {
        internal;
        default_type         '';
        set $kong_proxy_mode 'unbuffered';

        proxy_http_version      1.1;
        proxy_buffering         off;
        proxy_request_buffering off;

        proxy_set_header      TE                 $upstream_te;
        proxy_set_header      Host               $upstream_host;
        proxy_set_header      Upgrade            $upstream_upgrade;
        proxy_set_header      Connection         $upstream_connection;
        proxy_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        proxy_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        proxy_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        proxy_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        proxy_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        proxy_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        proxy_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        proxy_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        proxy_pass_header     Server;
        proxy_pass_header     Date;
        proxy_ssl_name        $upstream_host;
        proxy_ssl_server_name on;
> if client_ssl then
        proxy_ssl_certificate ${{CLIENT_SSL_CERT}};
        proxy_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        proxy_pass            $upstream_scheme://kong_upstream$upstream_uri;
    }

    location @unbuffered_request {
        internal;
        default_type         '';
        set $kong_proxy_mode 'unbuffered';

        proxy_http_version      1.1;
        proxy_buffering          on;
        proxy_request_buffering off;

        proxy_set_header      TE                 $upstream_te;
        proxy_set_header      Host               $upstream_host;
        proxy_set_header      Upgrade            $upstream_upgrade;
        proxy_set_header      Connection         $upstream_connection;
        proxy_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        proxy_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        proxy_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        proxy_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        proxy_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        proxy_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        proxy_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        proxy_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        proxy_pass_header     Server;
        proxy_pass_header     Date;
        proxy_ssl_name        $upstream_host;
        proxy_ssl_server_name on;
> if client_ssl then
        proxy_ssl_certificate ${{CLIENT_SSL_CERT}};
        proxy_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        proxy_pass            $upstream_scheme://kong_upstream$upstream_uri;
    }

    location @unbuffered_response {
        internal;
        default_type         '';
        set $kong_proxy_mode 'unbuffered';

        proxy_http_version      1.1;
        proxy_buffering         off;
        proxy_request_buffering  on;

        proxy_set_header      TE                 $upstream_te;
        proxy_set_header      Host               $upstream_host;
        proxy_set_header      Upgrade            $upstream_upgrade;
        proxy_set_header      Connection         $upstream_connection;
        proxy_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        proxy_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        proxy_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        proxy_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        proxy_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        proxy_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        proxy_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        proxy_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        proxy_pass_header     Server;
        proxy_pass_header     Date;
        proxy_ssl_name        $upstream_host;
        proxy_ssl_server_name on;
> if client_ssl then
        proxy_ssl_certificate ${{CLIENT_SSL_CERT}};
        proxy_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        proxy_pass            $upstream_scheme://kong_upstream$upstream_uri;
    }

    location @grpc {
        internal;
        default_type         '';
        set $kong_proxy_mode 'grpc';

        grpc_set_header      TE                 $upstream_te;
        grpc_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        grpc_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        grpc_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        grpc_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        grpc_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        grpc_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        grpc_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        grpc_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        grpc_pass_header     Server;
        grpc_pass_header     Date;
        grpc_ssl_name        $upstream_host;
        grpc_ssl_server_name on;
> if client_ssl then
        grpc_ssl_certificate ${{CLIENT_SSL_CERT}};
        grpc_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        grpc_pass            $upstream_scheme://kong_upstream;
    }

    location = /kong_buffered_http {
        internal;
        default_type         '';
        set $kong_proxy_mode 'http';

        rewrite_by_lua_block       {
          -- ngx.location.capture will create a new nginx request,
          -- so the upstream ssl-related info attached to the `r` gets lost.
          -- we need to re-set them here to the new nginx request.
          local ctx = ngx.ctx
          local upstream_ssl = require("kong.runloop.upstream_ssl")

          upstream_ssl.set_service_ssl(ctx)
          upstream_ssl.fallback_upstream_client_cert(ctx)
        }
        access_by_lua_block        {;}
        header_filter_by_lua_block {;}
        body_filter_by_lua_block   {;}
        log_by_lua_block           {;}

        proxy_http_version 1.1;
        proxy_set_header      TE                 $upstream_te;
        proxy_set_header      Host               $upstream_host;
        proxy_set_header      Upgrade            $upstream_upgrade;
        proxy_set_header      Connection         $upstream_connection;
        proxy_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        proxy_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        proxy_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        proxy_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        proxy_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        proxy_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        proxy_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        proxy_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        proxy_pass_header     Server;
        proxy_pass_header     Date;
        proxy_ssl_name        $upstream_host;
        proxy_ssl_server_name on;
> if client_ssl then
        proxy_ssl_certificate ${{CLIENT_SSL_CERT}};
        proxy_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        proxy_pass            $upstream_scheme://kong_upstream$upstream_uri;
    }

    location = /kong_error_handler {
        internal;
        default_type                 '';

        rewrite_by_lua_block {;}
        access_by_lua_block  {;}

        content_by_lua_block {
            Kong.handle_error()
        }
    }
}

server {
    server_name kong_mtls;
    listen 0.0.0.0:9443 ssl deferred reuseport backlog=16384;
    http2 on;

    modsecurity ${{WAF}};
    modsecurity_rules_file /usr/local/modsec_includes.conf;

    # WAF_DEBUG_LEVEL: 0 = off, 9 = full debug logging
    modsecurity_rules '
      SecRuleEngine ${{WAF_MODE}}
      SecRequestBodyLimit ${{WAF_REQUEST_FILE_SIZE_LIMIT}}
      SecRequestBodyNoFilesLimit ${{WAF_REQUEST_NO_FILE_SIZE_LIMIT}}
      SecPcreMatchLimit ${{WAF_PCRE_MATCH_LIMIT}}
      SecPcreMatchLimitRecursion ${{WAF_PCRE_MATCH_LIMIT_RECURSION}}
      SecDebugLogLevel ${{WAF_DEBUG_LEVEL}}
      SecAuditEngine ${{WAF_AUDIT}}
      SecAction "id:900000,phase:1,nolog,pass,t:none,setvar:tx.paranoia_level=${{WAF_PARANOIA_LEVEL}}"
    ';


    # Mutual TLS logic, check if server port is 9443
    ssl_verify_client on;
    ssl_client_certificate /usr/local/kong/stargate-mutual-trust.ca;
    ssl_crl /usr/local/kong/stargate-mutual-trust.crl.pem;
    ssl_verify_depth 3;

    error_page 400 404 405 408 411 412 413 414 417 /kong_error_handler;
    error_page 494 =494                            /kong_error_handler;
    error_page 500 502 503 504                     /kong_error_handler;

    # Append the kong request id to the error log
    # https://github.com/Kong/lua-kong-nginx-module#lua_kong_error_log_request_id
    lua_kong_error_log_request_id $kong_request_id;

> if proxy_access_log_enabled then
>   if custom_proxy_access_log then
    access_log ${{PROXY_ACCESS_LOG}};
>   else
    access_log ${{PROXY_ACCESS_LOG}} kong_log_format;
>   end
> else
    access_log off;
> end

    error_log  ${{PROXY_ERROR_LOG}} ${{LOG_LEVEL}};

    client_body_buffer_size ${{CLIENT_BODY_BUFFER_SIZE}};
    client_header_buffer_size 8k;
    large_client_header_buffers 2 20k;

> if proxy_ssl_enabled then
    # Hardcode the certificate for MTLS we want Stargate to present to clients
    ssl_certificate /usr/local/kong/ssl/kongmtlscert.crt;
    ssl_certificate_key /usr/local/kong/ssl/kongmtlsprivatekey.key;
    
    ssl_session_cache   shared:SSL:10m;
    ssl_certificate_by_lua_block {
        Kong.ssl_certificate()
    }
> end

  # injected nginx_proxy_* directives
> for _, el in ipairs(nginx_proxy_directives) do
  $(el.name) $(el.value);
> end
> for i = 1, #trusted_ips do
  set_real_ip_from  $(trusted_ips[i]);
> end

  rewrite_by_lua_block {
      Kong.rewrite()
  }

  access_by_lua_block {
      Kong.access()
  }

  header_filter_by_lua_block {
      Kong.header_filter()
      ngx.header["Server"] = nil
      ngx.header["Via"] = nil
  }

  body_filter_by_lua_block {
      Kong.body_filter()
  }

  log_by_lua_block {
      Kong.log()
  }

     location / {
        default_type                     '';

        set $ctx_ref                     '';
        set $upstream_te                 '';
        set $upstream_host               '';
        set $upstream_upgrade            '';
        set $upstream_connection         '';
        set $upstream_scheme             '';
        set $upstream_uri                '';
        set $upstream_x_forwarded_for    '';
        set $upstream_x_forwarded_proto  '';
        set $upstream_x_forwarded_host   '';
        set $upstream_x_forwarded_port   '';
        set $upstream_x_forwarded_path   '';
        set $upstream_x_forwarded_prefix '';
        set $kong_proxy_mode             'http';

        proxy_http_version      1.1;
        proxy_buffering          on;
        proxy_request_buffering  on;

        # injected nginx_location_* directives
> for _, el in ipairs(nginx_location_directives) do
        $(el.name) $(el.value);
> end

        proxy_set_header      TE                 $upstream_te;
        proxy_set_header      Host               $upstream_host;
        proxy_set_header      Upgrade            $upstream_upgrade;
        proxy_set_header      Connection         $upstream_connection;
        proxy_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        proxy_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        proxy_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        proxy_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        proxy_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        proxy_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        proxy_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        proxy_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        proxy_pass_header     Server;
        proxy_pass_header     Date;
        proxy_ssl_name        $upstream_host;
        proxy_ssl_server_name on;
> if client_ssl then
        proxy_ssl_certificate ${{CLIENT_SSL_CERT}};
        proxy_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        proxy_pass            $upstream_scheme://kong_upstream$upstream_uri;
    }

    location @unbuffered {
        internal;
        default_type         '';
        set $kong_proxy_mode 'unbuffered';

        proxy_http_version      1.1;
        proxy_buffering         off;
        proxy_request_buffering off;

        proxy_set_header      TE                 $upstream_te;
        proxy_set_header      Host               $upstream_host;
        proxy_set_header      Upgrade            $upstream_upgrade;
        proxy_set_header      Connection         $upstream_connection;
        proxy_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        proxy_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        proxy_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        proxy_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        proxy_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        proxy_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        proxy_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        proxy_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        proxy_pass_header     Server;
        proxy_pass_header     Date;
        proxy_ssl_name        $upstream_host;
        proxy_ssl_server_name on;
> if client_ssl then
        proxy_ssl_certificate ${{CLIENT_SSL_CERT}};
        proxy_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        proxy_pass            $upstream_scheme://kong_upstream$upstream_uri;
    }

    location @unbuffered_request {
        internal;
        default_type         '';
        set $kong_proxy_mode 'unbuffered';

        proxy_http_version      1.1;
        proxy_buffering          on;
        proxy_request_buffering off;

        proxy_set_header      TE                 $upstream_te;
        proxy_set_header      Host               $upstream_host;
        proxy_set_header      Upgrade            $upstream_upgrade;
        proxy_set_header      Connection         $upstream_connection;
        proxy_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        proxy_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        proxy_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        proxy_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        proxy_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        proxy_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        proxy_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        proxy_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        proxy_pass_header     Server;
        proxy_pass_header     Date;
        proxy_ssl_name        $upstream_host;
        proxy_ssl_server_name on;
> if client_ssl then
        proxy_ssl_certificate ${{CLIENT_SSL_CERT}};
        proxy_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        proxy_pass            $upstream_scheme://kong_upstream$upstream_uri;
    }

    location @unbuffered_response {
        internal;
        default_type         '';
        set $kong_proxy_mode 'unbuffered';

        proxy_http_version      1.1;
        proxy_buffering         off;
        proxy_request_buffering  on;

        proxy_set_header      TE                 $upstream_te;
        proxy_set_header      Host               $upstream_host;
        proxy_set_header      Upgrade            $upstream_upgrade;
        proxy_set_header      Connection         $upstream_connection;
        proxy_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        proxy_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        proxy_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        proxy_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        proxy_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        proxy_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        proxy_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        proxy_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        proxy_pass_header     Server;
        proxy_pass_header     Date;
        proxy_ssl_name        $upstream_host;
        proxy_ssl_server_name on;
> if client_ssl then
        proxy_ssl_certificate ${{CLIENT_SSL_CERT}};
        proxy_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        proxy_pass            $upstream_scheme://kong_upstream$upstream_uri;
    }

    location @grpc {
        internal;
        default_type         '';
        set $kong_proxy_mode 'grpc';

        grpc_set_header      TE                 $upstream_te;
        grpc_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        grpc_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        grpc_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        grpc_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        grpc_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        grpc_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        grpc_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        grpc_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        grpc_pass_header     Server;
        grpc_pass_header     Date;
        grpc_ssl_name        $upstream_host;
        grpc_ssl_server_name on;
> if client_ssl then
        grpc_ssl_certificate ${{CLIENT_SSL_CERT}};
        grpc_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        grpc_pass            $upstream_scheme://kong_upstream;
    }

    location = /kong_buffered_http {
        internal;
        default_type         '';
        set $kong_proxy_mode 'http';

        rewrite_by_lua_block       {
          -- ngx.location.capture will create a new nginx request,
          -- so the upstream ssl-related info attached to the `r` gets lost.
          -- we need to re-set them here to the new nginx request.
          local ctx = ngx.ctx
          local upstream_ssl = require("kong.runloop.upstream_ssl")

          upstream_ssl.set_service_ssl(ctx)
          upstream_ssl.fallback_upstream_client_cert(ctx)
        }
        access_by_lua_block        {;}
        header_filter_by_lua_block {;}
        body_filter_by_lua_block   {;}
        log_by_lua_block           {;}

        proxy_http_version 1.1;
        proxy_set_header      TE                 $upstream_te;
        proxy_set_header      Host               $upstream_host;
        proxy_set_header      Upgrade            $upstream_upgrade;
        proxy_set_header      Connection         $upstream_connection;
        proxy_set_header      X-Forwarded-For    $upstream_x_forwarded_for;
        proxy_set_header      X-Forwarded-Proto  $upstream_x_forwarded_proto;
        proxy_set_header      X-Forwarded-Host   $upstream_x_forwarded_host;
        proxy_set_header      X-Forwarded-Port   $upstream_x_forwarded_port;
        proxy_set_header      X-Forwarded-Path   $upstream_x_forwarded_path;
        proxy_set_header      X-Forwarded-Prefix $upstream_x_forwarded_prefix;
        proxy_set_header      X-Real-IP          $remote_addr;
> if enabled_headers_upstream["X-Kong-Request-Id"] then
        proxy_set_header      X-Kong-Request-Id  $kong_request_id;
> end
        proxy_pass_header     Server;
        proxy_pass_header     Date;
        proxy_ssl_name        $upstream_host;
        proxy_ssl_server_name on;
> if client_ssl then
        proxy_ssl_certificate ${{CLIENT_SSL_CERT}};
        proxy_ssl_certificate_key ${{CLIENT_SSL_CERT_KEY}};
> end
        proxy_pass            $upstream_scheme://kong_upstream$upstream_uri;
    }

    location = /kong_error_handler {
        internal;
        default_type                 '';

        rewrite_by_lua_block {;}
        access_by_lua_block  {;}

        content_by_lua_block {
            Kong.handle_error()
        }
    }  

}
> end -- (role == "traditional" or role == "data_plane") and #proxy_listeners > 0
> if (role == "control_plane" or role == "traditional") and #admin_listeners > 0 then
server {
    server_name kong_admin;
> for _, entry in ipairs(admin_listeners) do
    listen $(entry.listener);
> end

> for _, entry in ipairs(admin_listeners) do
> if entry.http2 then
    http2 on;
> break
> end
> end

    access_log ${{ADMIN_ACCESS_LOG}};
    error_log  ${{ADMIN_ERROR_LOG}} ${{LOG_LEVEL}};

> if admin_ssl_enabled then
> for i = 1, #admin_ssl_cert do
    ssl_certificate     $(admin_ssl_cert[i]);
    ssl_certificate_key $(admin_ssl_cert_key[i]);
> end
    ssl_session_cache   shared:AdminSSL:10m;
> end
    # injected nginx_admin_* directives
> for _, el in ipairs(nginx_admin_directives) do
    $(el.name) $(el.value);
> end
    location / {
        default_type application/json;
        content_by_lua_block {
            Kong.admin_content()
        }
        header_filter_by_lua_block {
            Kong.admin_header_filter()
        }
      }

      location /robots.txt {
          return 200 'User-agent: *\nDisallow: /';
      }
 }
> end -- (role == "control_plane" or role == "traditional") and #admin_listeners > 0

server {
    charset UTF-8;
    server_name kong_worker_events;
    listen unix:${{PREFIX}}/worker_events.sock;
    access_log off;
    location / {
        content_by_lua_block {
          require("resty.events.compat").run()
        }
    }
}

}

@jeremyjpj0916
Copy link
Contributor Author

jeremyjpj0916 commented Jun 24, 2024

@StarlightIbuki

I don't think it's ENV variable or template specific, though, because when I point at the smaller dev db none of the error logs come up, but as soon as I point at the larger DB (just changing the postgres host and tables to start up and read from), with far more resources in it, the errors show up. To prove it out further I'm thinking of adding a ton of dummy data to my dev db environment and seeing whether bloating that pg instance with tons of route/service/plugin/consumer resources reproduces the same errors in the startup output that I get when pointing at our much larger stage postgres database. 50 proxies vs 11,000 proxies.

If I can reproduce it by injecting a ton of dummy data, maybe I can drop a postgres dump SQL file that y'all can import into a sandbox/lab environment to see the same behavior in your own testing.
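
Roughly, the seeding could go through the Admin API with something like this (a sketch only; hypothetical names and counts, assuming the sandbox Admin API is on localhost:8001 and the bundled key-auth plugin is in use):

#!/bin/sh
# Hypothetical bulk-seed script: create N dummy services, each with one route
# and one key-auth plugin, to approximate a "large" Kong database.
ADMIN=http://localhost:8001   # assumption: local Admin API listener
N=5000                        # number of dummy proxies to create

i=1
while [ "$i" -le "$N" ]; do
  # The dummy upstream host never has to resolve for a startup test.
  curl -s -o /dev/null -X POST "$ADMIN/services" \
    --data "name=dummy-svc-$i" \
    --data "url=http://dummy-upstream-$i.invalid:8080"

  curl -s -o /dev/null -X POST "$ADMIN/services/dummy-svc-$i/routes" \
    --data "name=dummy-route-$i" \
    --data "paths[]=/dummy-$i"

  curl -s -o /dev/null -X POST "$ADMIN/services/dummy-svc-$i/plugins" \
    --data "name=key-auth"

  i=$((i + 1))
done

Seeding through the Admin API rather than raw SQL keeps the generated rows consistent with whatever schema version the sandbox is on, and a pg_dump of the resulting database would then be the thing to share.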

@StarlightIbuki
Copy link
Contributor

@jeremyjpj0916 OK, I just figured out that worker.lua and communicate() come from a dependency of Kong, so seeing them is totally normal.
I guess the reason is that loading a large DB and rebuilding the routers consumes too many resources, especially CPU, so the sockets time out.
@chronolaw Could you take a look?

@jeremyjpj0916
Copy link
Contributor Author

jeremyjpj0916 commented Jun 25, 2024

@StarlightIbuki @chronolaw To give more context on available resources, I have k8s pods starting up with as much as 4 CPUs and 10Gi (and I set nginx to spin up 4 worker processes):

        resources:
          limits:
            cpu: "4"
            memory: 10Gi
          requests:
            cpu: "4"
            memory: 10Gi

Which is no small potatoes in terms of compute or memory.

Size of Dev DB(no errors on Kong startup):

postgres=# \l+

   Name    |    Owner    | Encoding | Collate | Ctype |      Access privileges       |  Size   | Tablespace |                Description
-----------+-------------+----------+---------+-------+------------------------------+---------+------------+--------------------------------------------
 kong_dev2 | XXXXX | UTF8     | C       | C     | =Tc/XXXXX             +| 14 MB   | pg_default |

Size of Stage DB(errors on Kong startup, 82 MB is still fairly small DB size IMO):

postgres=# \l+
                                                                     List of databases
    Name     |    Owner    | Encoding | Collate | Ctype |      Access privileges       |  Size   | Tablespace |                Description
-------------+-------------+----------+---------+-------+------------------------------+---------+------------+--------------------------------------------
 kong_stage2 | XXXXX | UTF8     | C       | C     | =Tc/XXXXX             +| 82 MB   | pg_default |

@jeremyjpj0916
Copy link
Contributor Author

schema_meta looks correct too for the stage db (all subsystems are migrated up through the latest Kong version):

kong_stage2=# select * from schema_meta;
     key     |       subsystem       | last_executed  |                                                                                                                                                               executed
                                                                                                                                                      | pending
-------------+-----------------------+----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------+---------
 schema_meta | acl                   | 004_212_to_213 | {000_base_acl,002_130_to_140,003_200_to_210,004_212_to_213}
                                                                                                                                                      | {}
 schema_meta | jwt                   | 003_200_to_210 | {000_base_jwt,002_130_to_140,003_200_to_210}
                                                                                                                                                      | {}
 schema_meta | core                  | 016_280_to_300 | {000_base,003_100_to_110,004_110_to_120,005_120_to_130,006_130_to_140,007_140_to_150,008_150_to_200,009_200_to_210,010_210_to_211,011_212_to_213,012_213_to_220,013_220_to_230,0
14_230_to_270,015_270_to_280,017_300_to_310,018_310_to_320,019_320_to_330,020_330_to_340,021_340_to_350,022_350_to_360,023_360_to_370,016_280_to_300} | {}
 schema_meta | acme                  | 003_350_to_360 | {000_base_acme,001_280_to_300,002_320_to_330,003_350_to_360}
                                                                                                                                                      | {}
 schema_meta | ai-proxy              | 001_360_to_370 | {001_360_to_370}
                                                                                                                                                      | {}
 schema_meta | basic-auth            | 003_200_to_210 | {000_base_basic_auth,002_130_to_140,003_200_to_210}
                                                                                                                                                      | {}
 schema_meta | bot-detection         | 001_200_to_210 | {001_200_to_210}
                                                                                                                                                      | {}
 schema_meta | hmac-auth             | 003_200_to_210 | {000_base_hmac_auth,002_130_to_140,003_200_to_210}
                                                                                                                                                      | {}
 schema_meta | http-log              | 001_280_to_300 | {001_280_to_300}
                                                                                                                                                      | {}
 schema_meta | ip-restriction        | 001_200_to_210 | {001_200_to_210}
                                                                                                                                                      | {}
 schema_meta | key-auth              | 003_200_to_210 | {000_base_key_auth,002_130_to_140,004_320_to_330,003_200_to_210}
                                                                                                                                                      | {}
 schema_meta | opentelemetry         | 001_331_to_332 | {001_331_to_332}
                                                                                                                                                      | {}
 schema_meta | post-function         | 001_280_to_300 | {001_280_to_300}
                                                                                                                                                      | {}
 schema_meta | pre-function          | 001_280_to_300 | {001_280_to_300}
                                                                                                                                                      | {}
 schema_meta | oauth2                | 007_320_to_330 | {000_base_oauth2,003_130_to_140,004_200_to_210,005_210_to_211,006_320_to_330,007_320_to_330}
                                                                                                                                                      | {}
 schema_meta | session               | 002_320_to_330 | {000_base_session,001_add_ttl_index,002_320_to_330}
                                                                                                                                                      |
 schema_meta | rate-limiting         | 006_350_to_360 | {000_base_rate_limiting,003_10_to_112,004_200_to_210,005_320_to_330,006_350_to_360}
                                                                                                                                                      | {}
 schema_meta | response-ratelimiting | 001_350_to_360 | {000_base_response_rate_limiting,001_350_to_360}
                                                                                                                                                      | {}
(18 rows)

@jeremyjpj0916
Copy link
Contributor Author

jeremyjpj0916 commented Jun 25, 2024

Plugins takes the cake as the largest table, as expected:

kong_stage2=# \dt+
                                                   List of relations
 Schema |             Name              | Type  |    Owner    | Persistence | Access method |    Size    | Description
--------+-------------------------------+-------+-------------+-------------+---------------+------------+-------------
 public | acls                          | table | XXXX | permanent   | heap          | 9232 kB    |
 public | acme_storage                  | table | XXXX | permanent   | heap          | 8192 bytes |
 public | basicauth_credentials         | table | XXXX | permanent   | heap          | 8192 bytes |
 public | ca_certificates               | table | XXXX | permanent   | heap          | 8192 bytes |
 public | certificates                  | table | XXXX | permanent   | heap          | 96 kB      |
 public | cluster_events                | table | XXXX | permanent   | heap          | 1136 kB    |
 public | clustering_data_planes        | table | XXXX | permanent   | heap          | 8192 bytes |
 public | clustering_rpc_requests       | table | XXXX | permanent   | heap          | 8192 bytes |
 public | consumers                     | table | XXXX | permanent   | heap          | 272 kB     |
 public | filter_chains                 | table | XXXX | permanent   | heap          | 8192 bytes |
 public | hmacauth_credentials          | table | XXXX | permanent   | heap          | 8192 bytes |
 public | jwt_secrets                   | table | XXXX | permanent   | heap          | 488 kB     |
 public | key_sets                      | table | XXXX | permanent   | heap          | 8192 bytes |
 public | keyauth_credentials           | table | XXXX | permanent   | heap          | 8192 bytes |
 public | keys                          | table | XXXX | permanent   | heap          | 8192 bytes |
 public | locks                         | table | XXXX | permanent   | heap          | 48 kB      |
 public | oauth2_authorization_codes    | table | XXXX | permanent   | heap          | 8192 bytes |
 public | oauth2_credentials            | table | XXXX | permanent   | heap          | 664 kB     |
 public | oauth2_tokens                 | table | XXXX | permanent   | heap          | 8192 bytes |
 public | parameters                    | table | XXXX | permanent   | heap          | 48 kB      |
 public | plugins                       | table | XXXX | permanent   | heap          | 19 MB      |
 public | ratelimiting_metrics          | table | XXXX | permanent   | heap          | 8192 bytes |
 public | response_ratelimiting_metrics | table | XXXX | permanent   | heap          | 8192 bytes |
 public | routes                        | table | XXXX | permanent   | heap          | 2760 kB    |
 public | schema_meta                   | table | XXXX | permanent   | heap          | 56 kB      |
 public | services                      | table | XXXX | permanent   | heap          | 2832 kB    |
 public | sessions                      | table | XXXX | permanent   | heap          | 8192 bytes |
 public | sm_vaults                     | table | XXXX | permanent   | heap          | 8192 bytes |
 public | snis                          | table | XXXX | permanent   | heap          | 48 kB      |
 public | tags                          | table | XXXX | permanent   | heap          | 5464 kB    |
 public | targets                       | table | XXXX | permanent   | heap          | 8192 bytes |
 public | upstreams                     | table | XXXX | permanent   | heap          | 256 kB     |
 public | workspaces                    | table | XXXX | permanent   | heap          | 48 kB      |
(33 rows)
kong_stage2=# select count(*) from plugins;
 count
-------
 28818

Maybe as a next step I will clone kong_stage2, reduce the resources in the tables by about half (cut services/routes/plugins etc.), and launch Kong on that to see if the startup errors go away. If they persist, I'll keep halving the resource footprint until the errors stop, if they ever do. If I remove all resources and it still errors out, that would kill my hypothesis that it has to do with the amount of Kong resources in the db tables.
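
A rough way to do that halving through the Admin API rather than hand-editing SQL might look like this (hypothetical sketch; assumes jq is installed, the Admin API is on localhost:8001, and there are fewer services than one 500-item page):

# Hypothetical halving script: delete every other service along with its routes.
ADMIN=http://localhost:8001

curl -s "$ADMIN/services?size=500" | jq -r '.data[].id' | awk 'NR % 2 == 0' |
while read -r sid; do
  # Routes must go first or the service delete is rejected.
  curl -s "$ADMIN/services/$sid/routes?size=1000" | jq -r '.data[].id' |
  while read -r rid; do
    curl -s -o /dev/null -X DELETE "$ADMIN/routes/$rid"
  done
  # Plugins scoped to the service are removed along with it.
  curl -s -o /dev/null -X DELETE "$ADMIN/services/$sid"
done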

@chronolaw
Copy link
Contributor

Recently we upgraded the lua-resty-events library to 0.3.0 (https://github.com/Kong/lua-resty-events/releases/tag/0.3.0), but it is not yet included in 3.7.x. This version fixed some potential bugs; could you try it with the master branch? Thanks.

@jeremyjpj0916
Copy link
Contributor Author

jeremyjpj0916 commented Jun 26, 2024

@chronolaw is the latest version on luarocks (so I could install it after the Kong build with an extra luarocks install line), or can I force a sed command during the Kong bazel build steps to pull the newer version? Would it work on top of 3.7.1 without further changes needed out of the box?

@chronolaw
Copy link
Contributor

chronolaw commented Jun 26, 2024

I think this issue has no relationship with db or memory size; it seems that the events broker (lua-resty-events) is not ready when the events workers try to connect.
lua-resty-events is not a luarocks library, so perhaps we should build the latest Kong with bazel.

@jeremyjpj0916
Copy link
Contributor Author

jeremyjpj0916 commented Jun 26, 2024

@chronolaw looks like the source ends up in:

/usr/local/openresty/lualib/resty/events/*

in the traditional Kong bazel build image I have.

I can just have the image dump that code's 0.3.0 tag (https://github.com/Kong/lua-resty-events/tree/0.3.0) over the files in that path before Kong startup, and report back to you with my startup logs.

Edit - I put all of the 0.3.0 events files into a lua-resty-events patches folder and copy them over into the events dir after cleaning it out:

# Patch the events library to 0.3.0, delete original src files.
RUN rm -rf /usr/local/openresty/lualib/resty/events
COPY patches/lua-resty-events /usr/local/openresty/lualib/resty/events

@jeremyjpj0916
Copy link
Contributor Author

jeremyjpj0916 commented Jun 26, 2024

Patch files added to dev with the smaller db; initial runtime looks good here:

2024-06-26T16:11:48.650551282-05:00 nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /usr/local/kong/nginx.conf:10
2024-06-26T16:11:48.651024522-05:00 nginx: [warn] load balancing method redefined in /usr/local/kong/nginx.conf:129
2024-06-26T16:11:49.360159772-05:00 2024/06/26 21:11:48 [notice] 30#0: [lua] init.lua:788: init(): [request-debug] token for request debugging: 22fce1a3-7021-4054-bc31-283ab226694a
2024-06-26T16:11:49.394999758-05:00 2024/06/26 21:11:48 [notice] 30#0: using the "epoll" event method
2024-06-26T16:11:49.395065492-05:00 2024/06/26 21:11:48 [notice] 30#0: openresty/1.25.3.1
2024-06-26T16:11:49.395069540-05:00 2024/06/26 21:11:48 [notice] 30#0: OS: Linux 4.18.0-553.5.1.el8_10.x86_64
2024-06-26T16:11:49.395073507-05:00 2024/06/26 21:11:48 [notice] 30#0: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024-06-26T16:11:49.395097222-05:00 2024/06/26 21:11:48 [notice] 30#0: start worker processes
2024-06-26T16:11:49.397169630-05:00 2024/06/26 21:11:48 [notice] 30#0: start worker process 49
2024-06-26T16:11:49.398037252-05:00 2024/06/26 21:11:48 [notice] 30#0: start worker process 50
2024-06-26T16:11:49.399031642-05:00 2024/06/26 21:11:48 [notice] 30#0: start worker process 51
2024-06-26T16:11:49.420188119-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] broker.lua:218: init(): event broker is ready to accept connections on worker #0, context: init_worker_by_lua*
2024-06-26T16:11:49.447407811-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'services' into the core_cache..., context: init_worker_by_lua*
2024-06-26T16:11:49.484335314-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'services' into the core_cache (in 36ms), context: init_worker_by_lua*
2024-06-26T16:11:49.484370089-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'consumers' into the cache..., context: init_worker_by_lua*
2024-06-26T16:11:49.500464404-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'consumers' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-26T16:11:49.500484572-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'acls' into the cache..., context: init_worker_by_lua*
2024-06-26T16:11:49.763717951-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'acls' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-26T16:11:49.763739081-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'certificates' into the core_cache..., context: init_worker_by_lua*
2024-06-26T16:11:49.777581239-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'certificates' into the core_cache (in 0ms), context: init_worker_by_lua*
2024-06-26T16:11:49.777612247-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'jwt_secrets' into the cache..., context: init_worker_by_lua*
2024-06-26T16:11:49.790418587-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'jwt_secrets' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-26T16:11:49.790455116-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'oauth2_credentials' into the cache..., context: init_worker_by_lua*
2024-06-26T16:11:49.804057693-05:00 2024/06/26 21:11:49 [notice] 49#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'oauth2_credentials' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-26T16:11:49.830575595-05:00 2024/06/26 21:11:49 [notice] 49#0: *1316 [lua] warmup.lua:47: warming up DNS entries ..., context: ngx.timer
2024-06-26T16:11:49.837250912-05:00 2024/06/26 21:11:49 [notice] 51#0: *5 [lua] worker.lua:286: communicate(): worker #2 is ready to accept events from unix:/usr/local/kong/worker_events.sock, context: ngx.timer
2024-06-26T16:11:49.837290336-05:00 2024/06/26 21:11:49 [notice] 50#0: *658 [lua] worker.lua:286: communicate(): worker #1 is ready to accept events from unix:/usr/local/kong/worker_events.sock, context: ngx.timer
2024-06-26T16:11:49.837608975-05:00 2024/06/26 21:11:49 [notice] 49#0: *1310 [lua] broker.lua:270: run(): worker #2 connected to events broker (worker pid: 51), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:11:49.837620176-05:00 2024/06/26 21:11:49 [notice] 49#0: *1311 [lua] broker.lua:270: run(): worker #1 connected to events broker (worker pid: 50), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:11:49.837696791-05:00 2024/06/26 21:11:49 [notice] 49#0: *1313 [lua] worker.lua:286: communicate(): worker #0 is ready to accept events from unix:/usr/local/kong/worker_events.sock, context: ngx.timer
2024-06-26T16:11:49.837853305-05:00 2024/06/26 21:11:49 [notice] 49#0: *1958 [lua] broker.lua:270: run(): worker #0 connected to events broker (worker pid: 49), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:12:14.444972971-05:00 2024/06/26 21:12:14 [notice] 49#0: *1316 [lua] warmup.lua:81: finished warming up DNS entries' into the cache (in 24615ms), context: ngx.timer

Now let me try it against the bigger database with more resources where I see the problem in the first place.

@jeremyjpj0916
Copy link
Contributor Author

jeremyjpj0916 commented Jun 26, 2024

Same issues persist even with the 0.3.0 patch files dropped in, when pointing at the bigger DB (stage database) with more resources:

2024-06-26T16:33:36.307545718-05:00 nginx: [warn] load balancing method redefined in /usr/local/kong/nginx.conf:129
2024-06-26T16:33:38.088352910-05:00 2024/06/26 21:33:36 [notice] 30#0: [lua] init.lua:788: init(): [request-debug] token for request debugging: 54ee0f5d-3690-492d-8749-0437d9d3da7e
2024-06-26T16:33:38.226529883-05:00 2024/06/26 21:33:36 [notice] 30#0: using the "epoll" event method
2024-06-26T16:33:38.226549148-05:00 2024/06/26 21:33:36 [notice] 30#0: openresty/1.25.3.1
2024-06-26T16:33:38.226551984-05:00 2024/06/26 21:33:36 [notice] 30#0: OS: Linux 4.18.0-553.5.1.el8_10.x86_64
2024-06-26T16:33:38.226554719-05:00 2024/06/26 21:33:36 [notice] 30#0: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024-06-26T16:33:38.226575297-05:00 2024/06/26 21:33:36 [notice] 30#0: start worker processes
2024-06-26T16:33:38.229627113-05:00 2024/06/26 21:33:36 [notice] 30#0: start worker process 48
2024-06-26T16:33:38.231224469-05:00 2024/06/26 21:33:36 [notice] 30#0: start worker process 49
2024-06-26T16:33:38.232770851-05:00 2024/06/26 21:33:36 [notice] 30#0: start worker process 50
2024-06-26T16:33:38.234460179-05:00 2024/06/26 21:33:36 [notice] 30#0: start worker process 51
2024-06-26T16:33:38.268040005-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] broker.lua:218: init(): event broker is ready to accept connections on worker #0, context: init_worker_by_lua*
2024-06-26T16:33:38.344812017-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'services' into the core_cache..., context: init_worker_by_lua*
2024-06-26T16:33:38.638481995-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'services' into the core_cache (in 293ms), context: init_worker_by_lua*
2024-06-26T16:33:38.638498075-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'consumers' into the cache..., context: init_worker_by_lua*
2024-06-26T16:33:38.685365008-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'consumers' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-26T16:33:38.685381779-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'acls' into the cache..., context: init_worker_by_lua*
2024-06-26T16:33:43.446588073-05:00 2024/06/26 21:33:43 [error] 50#0: *6 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:43.452284416-05:00 2024/06/26 21:33:43 [error] 49#0: *276 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:43.476356347-05:00 2024/06/26 21:33:43 [error] 51#0: *1301 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:48.468408174-05:00 2024/06/26 21:33:48 [error] 49#0: *1964 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:48.494021256-05:00 2024/06/26 21:33:48 [error] 51#0: *1966 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:48.497310485-05:00 2024/06/26 21:33:48 [error] 50#0: *1968 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:53.485180350-05:00 2024/06/26 21:33:53 [error] 49#0: *1970 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:53.525941146-05:00 2024/06/26 21:33:53 [error] 51#0: *1972 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:53.537179625-05:00 2024/06/26 21:33:53 [error] 50#0: *1974 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:58.509082661-05:00 2024/06/26 21:33:58 [error] 49#0: *1976 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:58.539892536-05:00 2024/06/26 21:33:58 [error] 51#0: *1978 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:33:58.549879163-05:00 2024/06/26 21:33:58 [error] 50#0: *1980 [lua] worker.lua:270: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer
2024-06-26T16:34:02.243400980-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'acls' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-26T16:34:02.243434763-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'certificates' into the core_cache..., context: init_worker_by_lua*
2024-06-26T16:34:02.258343491-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'certificates' into the core_cache (in 0ms), context: init_worker_by_lua*
2024-06-26T16:34:02.258370020-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'jwt_secrets' into the cache..., context: init_worker_by_lua*
2024-06-26T16:34:02.336967554-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'jwt_secrets' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-26T16:34:02.336997960-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:114: single_dao(): Preloading 'oauth2_credentials' into the cache..., context: init_worker_by_lua*
2024-06-26T16:34:02.427902024-05:00 2024/06/26 21:33:38 [notice] 48#0: *1 [lua] warmup.lua:163: single_dao(): finished preloading 'oauth2_credentials' into the cache (in 0ms), context: init_worker_by_lua*
2024-06-26T16:34:02.801755430-05:00 2024/06/26 21:34:02 [notice] 48#0: *2007 [lua] warmup.lua:47: warming up DNS entries ..., context: ngx.timer
2024-06-26T16:34:02.808827066-05:00 2024/06/26 21:34:02 [error] 48#0: *1991 [lua] broker.lua:238: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.808906674-05:00 2024/06/26 21:34:02 [error] 48#0: *1992 [lua] broker.lua:238: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.808939996-05:00 2024/06/26 21:34:02 [error] 48#0: *1993 [lua] broker.lua:238: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.808944024-05:00 2024/06/26 21:34:02 [error] 48#0: *1994 [lua] broker.lua:238: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.808958511-05:00 2024/06/26 21:34:02 [error] 48#0: *1995 [lua] broker.lua:238: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.808964943-05:00 2024/06/26 21:34:02 [error] 48#0: *1996 [lua] broker.lua:238: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.809024363-05:00 2024/06/26 21:34:02 [error] 48#0: *1997 [lua] broker.lua:238: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.809032008-05:00 2024/06/26 21:34:02 [error] 48#0: *1998 [lua] broker.lua:238: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.809046935-05:00 2024/06/26 21:34:02 [error] 48#0: *1999 [lua] broker.lua:238: run(): failed to init socket: failed to flush response header: nginx output filter error, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.809491594-05:00 2024/06/26 21:34:02 [notice] 51#0: *1984 [lua] worker.lua:286: communicate(): worker #3 is ready to accept events from unix:/usr/local/kong/worker_events.sock, context: ngx.timer
2024-06-26T16:34:02.809524215-05:00 2024/06/26 21:34:02 [notice] 49#0: *1982 [lua] worker.lua:286: communicate(): worker #1 is ready to accept events from unix:/usr/local/kong/worker_events.sock, context: ngx.timer
2024-06-26T16:34:02.809529154-05:00 2024/06/26 21:34:02 [notice] 50#0: *1986 [lua] worker.lua:286: communicate(): worker #2 is ready to accept events from unix:/usr/local/kong/worker_events.sock, context: ngx.timer
2024-06-26T16:34:02.809826257-05:00 2024/06/26 21:34:02 [notice] 48#0: *2001 [lua] broker.lua:270: run(): worker #3 connected to events broker (worker pid: 51), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.810070032-05:00 2024/06/26 21:34:02 [notice] 48#0: *2002 [lua] broker.lua:270: run(): worker #2 connected to events broker (worker pid: 50), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.810621239-05:00 2024/06/26 21:34:02 [notice] 48#0: *2000 [lua] broker.lua:270: run(): worker #1 connected to events broker (worker pid: 49), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:02.810637489-05:00 2024/06/26 21:34:02 [notice] 48#0: *2004 [lua] worker.lua:286: communicate(): worker #0 is ready to accept events from unix:/usr/local/kong/worker_events.sock, context: ngx.timer
2024-06-26T16:34:02.810644612-05:00 2024/06/26 21:34:02 [notice] 48#0: *2649 [lua] broker.lua:270: run(): worker #0 connected to events broker (worker pid: 48), client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:07.808752272-05:00 2024/06/26 21:34:07 [error] 48#0: *1988 [lua] broker.lua:238: run(): failed to init socket: failed to read worker info: failed to receive the header bytes: timeout, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:07.808826561-05:00 2024/06/26 21:34:07 [error] 48#0: *1989 [lua] broker.lua:238: run(): failed to init socket: failed to read worker info: failed to receive the header bytes: timeout, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"
2024-06-26T16:34:07.808854232-05:00 2024/06/26 21:34:07 [error] 48#0: *1990 [lua] broker.lua:238: run(): failed to init socket: failed to read worker info: failed to receive the header bytes: timeout, client: unix:, server: kong_worker_events, request: "GET / HTTP/1.1", host: "localhost"

After those error logs, everything seems to run normally.

cc @chronolaw @bungle

@jeremyjpj0916
Copy link
Contributor Author

jeremyjpj0916 commented Jun 27, 2024

Also, if y'all want me to hack any debug statements into the code to get a better understanding of what's happening, I can. Just give me the lines you're curious about and where to drop them in.

Am also curious whether the consensus is that this is mostly harmless on startup and I am okay to take these errors into a production environment, or whether I should hold off. Normally when I see [error] logs and miscommunications early on in something like an event distribution library that manages all of Kong's intra-worker comms, I would think that's a big issue; but if it's just an early timing thing and Kong is fully healthy once those logs stop, then I suppose there's no problem taking it to prod as a minor startup comms hiccup of things not being ready when called. Would like y'all's opinion there though.

Few other ideas I may try to see if they stop the error logs:

  1. Stop doing health checks on the container that force a DB PUT to an upstreams resource early on in my liveness/readiness probes. Maybe admin API calls early in a larger Kong deployment's startup can cause issues there (see the sketch after this list).

... any other ideas I'll add here to this list.
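
For idea 1, the probe command could become a read-only call like this (hypothetical; assumes the Admin API, or a status_listen, is reachable on port 8001 inside the pod):

# Hypothetical probe command: a read-only GET against the Admin API
# instead of a PUT that writes an upstream to the database.
curl -sf http://localhost:8001/status > /dev/null || exit 1

If that alone makes the startup errors quieter, it would point at early admin API write traffic as a contributing factor.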

@chronolaw
Copy link
Contributor

Very detailed log data, thank you @jeremyjpj0916. We will check this data; please give us more time.

@jeremyjpj0916
Copy link
Contributor Author

jeremyjpj0916 commented Jul 3, 2024

@chronolaw thanks for staying on top of this error; it has kept us from moving to the newest Kong thus far.

Do you have any opinion on whether this is safe to upgrade to and let happen on startup until it's fixed later? Having been running the latest Kong in our Stage environment for a week+ now, I am not hearing any other reports of issues from it. I never like error logs popping up in my runtimes, but if y'all agree these errors are pretty harmless since they go away after startup, then I can proceed with my upgrades.

@chronolaw
Copy link
Contributor

Yes, I think these error log messages should only happen during the start stage. The root cause is the special pub/sub mechanism of the events library (see: https://konghq.com/blog/engineering/nginx-openresty-event-handling-strategy-for-cpu-efficiency): the broker is not ready yet while the other workers are already trying to connect to it.

Once the broker is ready and the workers connect to it successfully, the whole events system works well to serve the Kong gateway.

We will keep trying to enhance the robustness to reduce this noise.

@jeremyjpj0916
Copy link
Contributor Author

Sounds good then to proceed and just ignore the errors, since they appear early but stop right after Kong startup. Will continue with upgrading our Kong instances in production too. Thx.

Hopefully y'all figure out a way to make the workers patiently wait on the broker, or start the broker before the workers attempt to connect, so the ordering is right and these error logs won't happen.

@chronolaw
Copy link
Contributor

chronolaw commented Jul 4, 2024

Ticket KAG-4867 was created for tracking this issue.

@jeremyjpj0916
Copy link
Contributor Author

Good stuff, hopefully yall can reproduce it.

@rbond-vetsource
Copy link

Getting the same error using a DB-less YAML config on the docker image kong:3.7.1:

nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /usr/local/kong/nginx.conf:7

@RahulVerma989
Copy link

Hi Guys,

Getting similar logs when running kong using docker.

2024-07-19 13:26:16 nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /usr/local/kong/nginx.conf:7
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: [lua] init.lua:788: init(): [request-debug] token for request debugging: 308b3370-1484-47cd-94c4-e480c2761aea
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: using the "epoll" event method
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: openresty/1.25.3.1
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: OS: Linux 6.6.32-linuxkit
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: start worker processes
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: start worker process 1337
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: start worker process 1338
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: start worker process 1339
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: start worker process 1340
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: start worker process 1341
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: start worker process 1342
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: start worker process 1343
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1329#0: start worker process 1344
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1337#0: *1 [lua] warmup.lua:113: single_dao(): Preloading 'services' into the core_cache..., context: init_worker_by_lua*
2024-07-19 13:26:16 2024/07/19 07:56:16 [notice] 1337#0: *1 [lua] warmup.lua:162: single_dao(): finished preloading 'services' into the core_cache (in 0ms), context: init_worker_by_lua*

This is my docker-compose.yml file code:

services:
  kong-database:
    image: postgres:latest
    networks:
      - kong-net
    environment:
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_PASSWORD: ${KONG_PG_PASSWORD}
    restart: on-failure
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "kong"]
      interval: 10s
      timeout: 5s
      retries: 5
    volumes:
      - kong_db_data:/var/lib/postgresql/data

  kong-migrations:
    image: j2m_kong_gateway_service:latest
    networks:
      - kong-net
    depends_on:
      kong-database:
        condition: service_healthy
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-database
      KONG_PG_PASSWORD: ${KONG_PG_PASSWORD}
    command: "kong migrations bootstrap"
    restart: on-failure

  kong-gateway:
    image: j2m_kong_gateway_service:latest
    networks:
      - kong-net
    depends_on:
      kong-database:
        condition: service_healthy
      kong-migrations:
        condition: service_completed_successfully
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-database
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: ${KONG_PG_PASSWORD}
      KONG_PROXY_ACCESS_LOG: /dev/stdout
      KONG_ADMIN_ACCESS_LOG: /dev/stdout
      KONG_PROXY_ERROR_LOG: /dev/stderr
      KONG_ADMIN_ERROR_LOG: /dev/stderr
      KONG_ADMIN_LISTEN: "127.0.0.1:8001, 127.0.0.1:8443 ssl"
      KONG_ADMIN_GUI_URL: ${KONG_ADMIN_GUI_URL}
    ports:
      - "127.0.0.1:8000:8000"
      - "127.0.0.1:8443:8443"
      - "127.0.0.1:8001:8001"
      - "127.0.0.1:8002:8002"
      - "127.0.0.1:8444:8444"
      - "127.0.0.1:8445:8445"
      - "127.0.0.1:8003:8003"
      - "127.0.0.1:8004:8004"
    restart: on-failure
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8001/services"]
      interval: 30s
      timeout: 10s
      retries: 5
    command: ["kong", "start"]

networks:
  kong-net:
    name: kong-net

volumes:
  kong_db_data:

Any solutions?

@ncuhe
Copy link

ncuhe commented Sep 2, 2024

Problem

worker.lua:138: communicate(): failed to connect: failed to receive response header: timeout, context: ngx.timer

Solution:

When starting under k8s or docker, add the following parameter:

  - name: KONG_NGINX_WORKER_PROCESSES
    value: "4"

Root-cause analysis:

Step 1: The error no longer appears if the container is given 4G of memory, but that means every Kong container needs 4G of resources, which is wasteful; in previous projects Kong usually stayed under 1G (on nodes that were typically 8C/16G).

Step 2: The high memory use seemed to come from the number of worker processes. Inside the docker container there were nearly 60 nginx: worker process entries, which is likely related to the 64-core server I am using. At that point the fix was clear: change the default and reduce the number of workers.

Step 3: Setting WORKER_PROCESSES to 4 lowered the memory footprint. Before the limit Kong needed about 4G at startup; with 4 workers it stays around 300M long-term, and the problem is basically gone. The actual worker count can be tuned later based on load testing.

The docker image version I am using is kong:3.7.1
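
For a plain docker run, the equivalent of that Kubernetes env entry would be roughly the following (a hypothetical invocation; database settings and port mappings omitted):

# Hypothetical: cap Kong at 4 nginx worker processes on a many-core host.
docker run -d --name kong \
  -e KONG_NGINX_WORKER_PROCESSES=4 \
  kong:3.7.1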


@jeremyjpj0916
Copy link
Contributor Author

jeremyjpj0916 commented Sep 3, 2024

I don't think that's my problem ^; I already run really large pods and still see the error on startup:

          resources:
            limits:
              cpu: '4'
              memory: 10Gi
            requests:
              cpu: '4'
              memory: 10Gi

And I already set it here too, in the example env above, to ensure the right process count:

            - name: KONG_NGINX_MAIN_WORKER_PROCESSES
              value: '4'

Interesting that you managed to reproduce it in a scenario where Kong is starved of the resources it needs to run properly, though; somewhat telling.
