Replies: 7 comments 17 replies
-
We've spent the day trying all sorts of things, but we still have issues. We bumped up backpressure, and that blew up our database (we ran out of connections). We tried fewer workers, more workers, and all sorts of in-between configurations. I think the hardest part is not knowing whether Granian is contributing to the latency at this point or whether we're just wasting our time tuning the wrong things. Are there any metrics or logs (debug logs?) that would help us determine if the I/O threads/backpressure are set too low? My next plan is to get a continuous profiler running so I can at least see what's going on under the hood.
-
I can't think of anything specific which could increase latency 10x compared to Gunicorn (in fact, every benchmark suggests otherwise). To me it sounds like you're overloading the database, so instead of increasing backpressure I would go the other way around and pick very low numbers (the 4-16 range). If that doesn't help, the other possible route would be to limit the blocking threads to a low number as well, but I would try a low backpressure first.
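A minimal sketch of what trying that advice might look like with the embedded server API, assuming your Granian version accepts `backpressure` and `blocking_threads` keyword arguments (the CLI exposes similar options; the target below is hypothetical):

```python
from granian import Granian

server = Granian(
    target="myproject.wsgi:application",  # hypothetical WSGI target
    interface="wsgi",
    workers=4,
    backpressure=8,       # deliberately low, in the suggested 4-16 range
    blocking_threads=2,   # assumption: mirrors the --blocking-threads CLI option
)
server.serve()
```

Start low, measure, and only raise the values if the database keeps up.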
-
I stumbled upon a similar issue while load-testing my app, and found that the load is not evenly distributed among workers - more workers and higher […]

My test:

```python
import asyncio
import os

from granian import Granian

requests = 0


# We need atomic writes to keep logging nice
def log(*args, **kwargs): os.write(1, (" ".join(args) + "\n").encode())


async def app(scope, receive, send):
    global requests

    if scope['type'] == 'lifespan':
        while True:
            message = await receive()
            if message['type'] == 'lifespan.startup':
                log("Starting up...")
                await send({'type': 'lifespan.startup.complete'})
            elif message['type'] == 'lifespan.shutdown':
                log(f"Shutting down ({requests} requests processed)...")
                await send({'type': 'lifespan.shutdown.complete'})
                return
            else:
                raise RuntimeError("Unexpected type of message")

    assert scope['type'] == 'http'

    # Process request (kind of)
    requests += 1

    # Simulate waiting for something asynchronously
    await asyncio.sleep(0.01)

    # Simulate a bit of CPU load (adjust to taste)
    s = 0
    for n in range(10000):
        s += n

    try:
        await send({
            'type': 'http.response.start',
            'status': 200,
        })
        await send({
            'type': 'http.response.body',
            'body': b'Are we there yet?\n',
        })
    except BaseException as e:
        log(f"Processing request failed on send: {e!r}")
        raise


if __name__ == "__main__":
    log("Running manager process")
    manager = Granian(
        target=__file__,
        interface='asgi',
        workers=4,
        respawn_failed_workers=True,
        #backpressure=10,
        respawn_interval=1,
    )
    log(f"Starting manager: {manager}")
    manager.serve()
```

I test it with h2load: […]

So at least one worker is getting minimum requests while another one is also starving a bit. With 8 workers things get worse: […]

We have only 3 which are handling most requests while others are doing much less. And now with 16 workers: […]

Results are consistent across multiple runs. The test system is idling during the tests (i.e. loaded only by the test itself). I would expect that requests are sent in round-robin fashion to every worker which is serving less than `backpressure` […]. Lowering […]
-
@gi0baro Sure, I didn't expect that […]:

```python
while True:
    if active_requests < backpressure:
        await accept_request()
    else:
        await wait_for_some_requests_to_finish()
```

However, I have an impression from your comment that every worker is doing its own […]
-
Now I am a bit puzzled. In pure C, with a rudimentary HTTP simulator that just reads the request and returns a static response, using bind()/listen() and epoll() for event handling in each forked worker, including a backpressure simulation (ignoring connection events when the concurrency limit is reached), I get even load distribution regardless of the backlog and the client's concurrency - so this does not look like a kernel issue. If bind()/listen() is done once (i.e. the listen fd is shared among workers) then I get results similar to those reported above - however, in your code I see that each worker has its own listener (I am not that familiar with tokio though). But indeed, uvicorn also suffers from this problem - so it is not Granian-specific either.
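Not the original C code, but a rough Python rendition of the two setups described above (per-worker listeners vs. one listen fd shared after fork()), with epoll() replaced by a plain blocking accept() for brevity; it assumes a platform that exposes SO_REUSEPORT, such as Linux or FreeBSD:

```python
import os
import socket

ADDR = ("127.0.0.1", 8000)
WORKERS = 4
PER_WORKER_LISTENERS = True  # flip to False to share one listen fd across workers


def make_listener():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEPORT lets every worker bind() the same address, so the kernel
    # spreads incoming connections across the per-worker listeners.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(ADDR)
    sock.listen(128)
    return sock


def worker(shared_sock):
    sock = shared_sock if shared_sock is not None else make_listener()
    while True:
        conn, _ = sock.accept()
        conn.sendall(
            b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok"
        )
        conn.close()


if __name__ == "__main__":
    # Shared-fd variant: bind()/listen() once in the parent; children inherit
    # the fd after fork() and race on accept(), which tends to skew the load.
    shared = None if PER_WORKER_LISTENERS else make_listener()
    for _ in range(WORKERS):
        if os.fork() == 0:
            worker(shared)
            os._exit(0)
    for _ in range(WORKERS):
        os.wait()  # workers run until interrupted
```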
-
@apenney since v2.2.2 Granian is properly distributing load among all workers on Linux and FreeBSD. If you are on one of those systems, please try your test again and tell us if this solves your issue. Also a general recommendation (based on tests): keep `backpressure` as low as possible (the default based on backlog and number of workers is ok). One other thing you could try is to disable keep-alives (unless you really need them) - this also helps to distribute load evenly among workers, thus reducing latency.
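If you are embedding Granian from Python, disabling keep-alives might look roughly like the sketch below; treat `HTTP1Settings` and its `keep_alive` field as an assumption about recent Granian versions and verify against your installed release (the CLI should expose an equivalent switch):

```python
from granian import Granian
from granian.http import HTTP1Settings  # assumption: available in recent releases

server = Granian(
    target="myproject.wsgi:application",  # hypothetical target
    interface="wsgi",
    workers=4,
    # Assumption: keep_alive is a field of HTTP1Settings; check your version.
    http1_settings=HTTP1Settings(keep_alive=False),
)
server.serve()
```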
-
@aldem could you elaborate on this: "keep backpressure as low as possible (the default based on backlog and number of workers is ok)"? Any example?
-
Hi!
We're looking at switching over from Gunicorn to Granian, but we ran into a weird performance issue. We're running Django 4.x, and in our development environment baseline latency has gone up from 6ms to 60ms.
This was weird, so I started digging into our APM traces and found that one of our middlewares (something custom) has gone from an average of 6ms to 60ms.
It's got some unexciting code that looks like: […]
At this point I couldn't find any obvious reason for the issue, so I started to dig into blocking threads and the like, trying to find anything to tune. From my understanding, for WSGI the best option for us is: […]
At that point I would have had backpressure=128 by the default calculation, so I then bumped this to 512 to see if it would help, but nothing changed about the latency profile. (We don't use a DB connection pool, we just let Django do its thing, so I couldn't scale them based on that.)
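For what it's worth, the "default calculation" referenced here appears to be the backlog spread across the workers (per the "default based on backlog and number of workers" remark elsewhere in the thread); the numbers below are purely illustrative:

```python
# Assumption: default backpressure is roughly backlog // workers.
backlog = 1024   # hypothetical listen backlog
workers = 8      # hypothetical worker count

default_backpressure = max(1, backlog // workers)
print(default_backpressure)  # 128, matching the figure mentioned above
```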
Questions: […]
The latency hike was pretty noticeable on the charts, so I'm hesitant to roll this out to environments with more traffic until I better understand what's going on here. Any suggestions would be greatly appreciated!