Graceful shutdowns considered… almost impossible? #611

@hynek

Hi,

so I've spent the better part of yesterday trying to understand why all my WSGI apps running in Granian running in Docker running in Nomad behind HAProxy always get SIGKILLed after a timeout, although they work just fine locally, both on macOS and within a local Docker container.

It was very frustrating because it turned out that I had run into two independent problems. I wanted to share my findings so that my time wasn't entirely wasted, and maybe suggest at least some documentation fixes.

Problem 1: Health checks & HTTP/2

The symptom here is:

[INFO] Shutting down granian
[INFO] Stopping worker-1

And then the worker just hangs forever, not accepting new requests, unless I set GRANIAN_WORKERS_KILL_TIMEOUT.

This was my original suspicion based on #568 and #548. And I got it solved by setting GRANIAN_HTTP = "1" and GRANIAN_HTTP1_KEEP_ALIVE = "no".
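For anyone who wants to try the same workaround outside of a Nomad job file, here is a minimal sketch that builds the environment in Python. The helper name `granian_env` is made up for illustration, and treating GRANIAN_WORKERS_KILL_TIMEOUT as a number of seconds is an assumption; only the variable names and the values "1" and "no" come from this report.

```python
import os


def granian_env(kill_timeout=None, base=None):
    """Return a copy of `base` (default: os.environ) with the workaround applied.

    `granian_env` is a hypothetical helper, not part of Granian.
    """
    env = dict(os.environ if base is None else base)
    env["GRANIAN_HTTP"] = "1"               # value used in this report
    env["GRANIAN_HTTP1_KEEP_ALIVE"] = "no"  # value used in this report
    if kill_timeout is not None:
        # Assumption: the timeout is given as a number of seconds.
        env["GRANIAN_WORKERS_KILL_TIMEOUT"] = str(kill_timeout)
    return env
```

The resulting dict can be passed as `env=` to `subprocess.Popen` when starting the server, for example with a command like `granian --interface wsgi app:app`, where `app:app` is a placeholder for the real application target.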

What's puzzling here is: is it possible at all to have HTTP/2 + load balancer with health checks and graceful shutdowns/restarts?

If not, that should be clearly stated in the docs; if yes, please tell me how and put it into the docs. 😅

Problem 2: Weird interaction with a database driver

This one is weirder and caused by the bane of my existence: sqlanydb

The symptom is similar, but slightly different – I get only one line:

[INFO] Shutting down granian

and Granian happily keeps serving requests as if nothing happened.


I know it's the driver's fault, because if I run a local Docker container with the same app and only hit my heartbeat endpoint, I can kill it just fine.

But if I hit an endpoint that tries to connect to a database and therefore initializes the driver, I get the same behavior as in prod: it keeps serving until SIGKILLed.
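One way to narrow this down might be to compare the process's signal state before and after the first database connection. The sketch below is Linux-only (it reads `/proc/self/status`, which reflects the main thread) and rests on an unconfirmed assumption: that loading the driver's native library changes how SIGTERM is handled. Nothing in this report proves that, and `signal_state` is a hypothetical helper name.

```python
import signal


def signal_state(signum, status_path="/proc/self/status"):
    """Report whether `signum` is blocked, ignored, or caught (Linux only)."""
    bit = 1 << (signum - 1)  # the kernel masks use bit (signum - 1)
    fields = {"SigBlk": "blocked", "SigIgn": "ignored", "SigCgt": "caught"}
    state = {}
    with open(status_path) as status:
        for line in status:
            key, _, value = line.partition(":")
            if key in fields:
                # Each mask is printed as a hexadecimal number.
                state[fields[key]] = bool(int(value.strip(), 16) & bit)
    return state
```

Calling `signal_state(signal.SIGTERM)` once from the heartbeat endpoint and once after a database connection, then diffing the two results, would show whether the disposition changed in between.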

And I have no solution to this except using Gunicorn where it works just fine.

I'm not sure what to do about this, and I'm not sure whether you're interested at all in figuring out why a driver with fewer than 10k downloads per month is wreaking havoc like this. But I would suspect that other ctypes-based packages might have the same problem? Most people don't care about the shutdowns of their apps, but I do, and I also run cleanups in atexit handlers, so I'm like the 1% who would notice that.

Let me know either way whether this is something you'd be interested in pursuing. As said: you don't need a SQL Anywhere server, just the client driver, which is also available here: https://help.sap.com/docs/SUPPORT_CONTENT/sqlany/3362971128.html. I could not reproduce the problem on macOS, tho. Just Linux in Docker (didn't try Linux without Docker).
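To illustrate why the SIGKILL matters for atexit-based cleanup, here is a small standalone sketch (POSIX only, no Granian involved). It starts a child Python process that registers an atexit handler and converts SIGTERM into a normal exit, then checks whether the handler got to run. The child script, the marker file, and the helper name `cleanup_ran_after` are all invented for this demo.

```python
import os
import signal
import subprocess
import sys
import tempfile
import textwrap

# Child process: registers a cleanup handler, then idles until signalled.
CHILD = textwrap.dedent("""
    import atexit, signal, sys, time

    marker = sys.argv[1]

    def cleanup():
        # Stand-in for real cleanup work (closing connections, flushing data).
        with open(marker, "w") as f:
            f.write("cleaned up")

    atexit.register(cleanup)
    # Turn SIGTERM into a regular interpreter exit so atexit handlers run.
    signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(0))

    print("ready", flush=True)
    while True:
        time.sleep(0.1)
""")


def cleanup_ran_after(sig):
    """Start the child, send it `sig`, and report whether its atexit handler ran."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "marker")
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD, marker],
            stdout=subprocess.PIPE,
            text=True,
        )
        proc.stdout.readline()  # wait until the child has installed its handlers
        proc.send_signal(sig)
        proc.wait(timeout=10)
        proc.stdout.close()
        return os.path.exists(marker)
```

With this setup, `cleanup_ran_after(signal.SIGTERM)` reports that the handler ran, while `cleanup_ran_after(signal.SIGKILL)` reports that it did not, since SIGKILL cannot be caught and the interpreter never gets a chance to run atexit callbacks.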


granian 2.3.4, Python 3.13.5, Ubuntu Noble, mostly Flask, but also tried one Litestar
