Implement detection of dead counterparty even if websocket connection is still alive #2072

Open
jmartisk opened this issue Apr 3, 2024 · 3 comments

Comments

@jmartisk
Member

jmartisk commented Apr 3, 2024

Currently, the server (if it's based on smallrye-graphql) sends PING messages periodically, but doesn't check whether any PONG arrives. The client doesn't send any PING messages at all.

If the server dies during an active subscription, but is hidden behind a proxy, the proxy might keep the websocket connection alive, and the client will stay connected forever, not detecting that the other side is dead.
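To make the gap concrete, here is a minimal sketch of the missing piece: remembering when the last PONG arrived and treating the peer as dead once that is too long ago. `PongWatchdog` and its methods are hypothetical names for illustration only, not part of smallrye-graphql, and the clock is injected purely so the logic can be exercised without waiting.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not a smallrye-graphql class): remembers when the
 * last PONG arrived and reports the peer as dead once it has been silent
 * for longer than the configured timeout.
 */
final class PongWatchdog {
    private final long timeoutNanos;
    private final LongSupplier nanoClock; // e.g. System::nanoTime in production
    private long lastPongNanos;           // time of the last PONG, or of creation

    PongWatchdog(Duration pongTimeout, LongSupplier nanoClock) {
        this.timeoutNanos = pongTimeout.toNanos();
        this.nanoClock = nanoClock;
        this.lastPongNanos = nanoClock.getAsLong();
    }

    /** Call from the websocket handler whenever a PONG frame is received. */
    synchronized void onPong() {
        lastPongNanos = nanoClock.getAsLong();
    }

    /** True if no PONG has been seen for longer than the timeout. */
    synchronized boolean isPeerDead() {
        return nanoClock.getAsLong() - lastPongNanos > timeoutNanos;
    }
}
```

Whoever sends the PINGs could call `isPeerDead()` before each send and close the connection when it returns true. The timeout would have to be noticeably longer than the ping interval, otherwise a single delayed PONG would drop a healthy connection.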

@ITrium-Salah

@jmartisk Just to let you know: I re-ran more in-depth tests on my code, and in my failure handling I had forgotten the onComplete case.
When I kill or stop my Docker services, the disconnection is actually detected, but via onComplete and not onFailure.

Note that in the case of an unplugged network cable, for example (when the system doesn't clean up the process's resources and doesn't send a TCP FIN), the disconnection will be detected, but only after the default OS TCP KeepAliveTime (2 hours on Windows).
Even though my problem turned out not to be one, I think application-level ping/pong detection, configurable by property (enabled, transmission interval, delay, ...), would be welcome.

@jmartisk
Member Author

jmartisk commented Apr 5, 2024

Ah ok, thanks for the clarification. Yeah, if the OS kills the connection after some time, that's good. We could implement some custom timeout handling, but we will also need to be careful that we don't, for example, kill a connection that is idle because of an attached debugger (rather than a dead server), which would be annoying.

It's low priority then, but let's keep it open, maybe some more people will bring in some more ideas.

@ITrium-Salah

For the client side, yes, clearly low priority, because we can configure the OS TCP keepalive time through settings:
Windows:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters KeepAliveTime
Linux:
net.ipv4.tcp_keepalive_time, net.ipv4.tcp_keepalive_probes, net.ipv4.tcp_keepalive_intvl

A simple (I think) solution would be to implement on the client side the same logic that the server uses: by sending a ping every 5 s (ignoring the pong response), a dead TCP connection will be detected on write (no TCP ACK). In the debugger breakpoint case, the TCP stack will still acknowledge the ping even while the process is paused.
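A rough sketch of that idea, under a few assumptions: `PeriodicPinger` is a hypothetical class, not existing smallrye-graphql code, and the actual PING write is passed in as a function so the sketch doesn't depend on any particular websocket client. It sends a ping at a fixed interval, ignores PONGs entirely, and reports the first failed write.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: sends a PING at a fixed interval, ignores PONGs,
 * and reports the first write failure so the caller can tear down the
 * subscription.
 */
final class PeriodicPinger implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "ws-pinger");
                t.setDaemon(true); // don't keep the JVM alive just for pings
                return t;
            });
    private final AtomicBoolean failed = new AtomicBoolean();
    private final Supplier<CompletableFuture<?>> pingSender;
    private final long intervalMillis;
    private final Consumer<Throwable> onWriteFailure;

    PeriodicPinger(Supplier<CompletableFuture<?>> pingSender,
                   long intervalMillis,
                   Consumer<Throwable> onWriteFailure) {
        this.pingSender = pingSender;
        this.intervalMillis = intervalMillis;
        this.onWriteFailure = onWriteFailure;
    }

    void start() {
        scheduler.scheduleAtFixedRate(this::pingOnce,
                intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    private void pingOnce() {
        CompletableFuture<?> result;
        try {
            result = pingSender.get();
        } catch (RuntimeException e) {
            // Treat a synchronous failure the same as an asynchronous one.
            result = CompletableFuture.failedFuture(e);
        }
        result.whenComplete((ignored, error) -> {
            // Report only the first failure, then stop sending pings.
            if (error != null && failed.compareAndSet(false, true)) {
                scheduler.shutdown();
                onWriteFailure.accept(error);
            }
        });
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

With the JDK's `java.net.http.WebSocket`, the sender could be something like `() -> webSocket.sendPing(ByteBuffer.allocate(0))`, since `sendPing` returns a `CompletableFuture`. One caveat: a write to a dead connection usually doesn't fail on the first attempt. The data goes into the send buffer, and the error only surfaces once the TCP stack gives up retransmitting, so detection time depends on the OS retransmission settings and not just on the ping interval.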

On the server side it would be a plus for limiting resource consumption: GraphQL is used as an API gateway most of the time, and keeping the websocket open when the client is idle or buggy can lead to overconsumption of resources. That can still happen even with ping/pong, but at least the protocol level will have done its job.

To summarize, whether on the client or the server side, a user of the library needs to be able to configure whether and how pings are sent.
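As an illustration of what such a configuration could look like, here is a small sketch that reads the three knobs mentioned earlier in the thread (enabled, interval, delay) from plain `java.util.Properties`. The class name `HeartbeatSettings`, the property keys and the defaults are all made up for this example; they are not existing smallrye-graphql properties.

```java
import java.time.Duration;
import java.util.Properties;

/**
 * Hypothetical settings holder. The keys "enabled", "interval-ms" and
 * "pong-timeout-ms" are invented for this sketch.
 */
final class HeartbeatSettings {
    final boolean enabled;
    final Duration pingInterval;
    /** Zero means "send pings but never wait for a PONG". */
    final Duration pongTimeout;

    HeartbeatSettings(boolean enabled, Duration pingInterval, Duration pongTimeout) {
        if (pingInterval.isZero() || pingInterval.isNegative()) {
            throw new IllegalArgumentException("pingInterval must be positive");
        }
        if (pongTimeout.isNegative()) {
            throw new IllegalArgumentException("pongTimeout must not be negative");
        }
        this.enabled = enabled;
        this.pingInterval = pingInterval;
        this.pongTimeout = pongTimeout;
    }

    /** Reads settings under the given key prefix, falling back to defaults. */
    static HeartbeatSettings from(Properties props, String prefix) {
        boolean enabled = Boolean.parseBoolean(
                props.getProperty(prefix + "enabled", "false"));
        long intervalMs = Long.parseLong(
                props.getProperty(prefix + "interval-ms", "5000"));
        long pongTimeoutMs = Long.parseLong(
                props.getProperty(prefix + "pong-timeout-ms", "0"));
        return new HeartbeatSettings(enabled,
                Duration.ofMillis(intervalMs), Duration.ofMillis(pongTimeoutMs));
    }
}
```

Defaulting to disabled keeps today's behavior unless someone opts in, which also sidesteps the attached-debugger concern for anyone who doesn't touch the setting. The 5 s default interval just mirrors the value suggested above.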
