Hi,
so I spent the better part of yesterday trying to understand why all my WSGI apps running in Granian, in Docker, in Nomad, behind HAProxy always get SIGKILLed after a timeout, although they work just fine locally, both on macOS and within a local Docker container.
It was very frustrating because it turned out that I had run into two independent problems. I wanted to share my findings so that my time wasn't entirely wasted, and maybe suggest at least some documentation fixes.
Problem 1: Health checks & HTTP/2
The symptom here is:
[INFO] Shutting down granian
[INFO] Stopping worker-1
And then the worker just hangs forever, not accepting new requests, unless I set GRANIAN_WORKERS_KILL_TIMEOUT.
This was my original suspicion based on #568 and #548, and I solved it by setting GRANIAN_HTTP = "1" and GRANIAN_HTTP1_KEEP_ALIVE = "no".
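For anyone who wants to try the same workaround, here is a minimal sketch of the settings as a Python helper that builds the process environment. The two variable names and values are the ones from this issue; the helper name `granian_env` and the optional `kill_timeout` parameter are made up for illustration, and I'm assuming the kill timeout is given in seconds.

```python
import os


def granian_env(base=None, kill_timeout=None):
    """Return a copy of the environment with the workaround settings applied.

    `base` defaults to the current process environment. `kill_timeout` is an
    optional fallback (assumed to be seconds); leave it as None to skip it.
    """
    env = dict(os.environ if base is None else base)
    # Force HTTP/1 and disable keep-alive, as described in this issue.
    env["GRANIAN_HTTP"] = "1"
    env["GRANIAN_HTTP1_KEEP_ALIVE"] = "no"
    if kill_timeout is not None:
        # Only a safety net: workers that hang on shutdown get killed.
        env["GRANIAN_WORKERS_KILL_TIMEOUT"] = str(kill_timeout)
    return env


if __name__ == "__main__":
    settings = granian_env(base={}, kill_timeout=30)
    for key in sorted(settings):
        print(f"{key}={settings[key]}")
```

The resulting dict can be passed as `env=` to `subprocess.run` when launching Granian, or the same three variables can go into the Docker/Nomad job definition directly.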
What's puzzling here is: is it possible at all to combine HTTP/2, a load balancer with health checks, and graceful shutdowns/restarts?
If not, that should be stated clearly in the docs; if so, please tell me how and put it into the docs. 😅
Problem 2: Weird interaction with a database driver
This one is weirder and caused by the bane of my existence: sqlanydb
The symptom is similar, but slightly different – I get only one line:
[INFO] Shutting down granian
and Granian happily keeps serving requests as if nothing happened.
I know it's the driver's fault, because if I run a local Docker container with the same app and only hit my heartbeat endpoint, I can kill it just fine.
But if I hit an endpoint that tries to connect to a database and therefore initializes the driver, I get the same behavior as in prod: it keeps serving until SIGKILLed.
And I have no solution to this except using Gunicorn where it works just fine.
I'm not sure what to do about this, and I'm not sure if you're interested at all in figuring out why a driver with less than 10k downloads per month is wreaking havoc like this. But I would suspect that other ctypes-based packages might have the same problem? Most people don't care about the shutdowns of their apps, but I do, and I also run cleanups in atexit handlers, so I'm like the 1% who would notice that.
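To make the shutdown behavior easier to observe without my real apps, here is a minimal sketch of a plain WSGI app with a heartbeat route and an atexit handler. It is not my actual code: the names `app`, `cleanup`, and `cleanup_ran` and the `/heartbeat` path are placeholders, and it does not load sqlanydb, so on its own it should only demonstrate the working case.

```python
import atexit

# Records whether the handler ran; handy when poking at this from a REPL.
cleanup_ran = []


def cleanup():
    # Stand-in for real cleanup work (closing connections, flushing logs).
    cleanup_ran.append(True)
    print("atexit cleanup ran", flush=True)


atexit.register(cleanup)


def app(environ, start_response):
    """Plain WSGI callable with a single heartbeat route."""
    if environ.get("PATH_INFO") == "/heartbeat":
        status, body = "200 OK", b"ok"
    else:
        status, body = "404 Not Found", b"not found"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

If the "atexit cleanup ran" line shows up in the logs after stopping the server, the worker exited through the normal interpreter shutdown path; if the process has to be SIGKILLed, the line never appears.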
Let me know either way whether this is something you'd be interested in pursuing. As said: you don't need a SQL Anywhere server, just the client driver, which is also available here: https://help.sap.com/docs/SUPPORT_CONTENT/sqlany/3362971128.html. I could not reproduce the problem on macOS, though, only on Linux in Docker (I didn't try Linux without Docker).
granian 2.3.4, Python 3.13.5, Ubuntu Noble, mostly Flask, but also tried one Litestar