If I run two or more Claude Code sessions in parallel and re-route them via peering to other providers, e.g. Anthropic and Z.AI, I think the llama-swap endpoint doesn't handle the concurrent requests very well. Does this ring any bells? Is there a known limitation in llama-swap?
For example, if I had a huge tool call list that (probably) makes llama-swap block on prompt processing, would other requests still go through?