Software caused connection abort on network disconnection #173
Software caused connection abort
on network disconnection
After debugging, I discovered the culprit. Overall, this is consistent with the documentation:
In summary, Krossbow should either catch such errors or prevent them from occurring. |
Thanks a lot for the report, and for the investigation.
This seems appropriate.
I'm not sure I'm following. It seems legitimate to me that trying to send frames on a closed session throws an exception. Do you mean that an exception can occur due to automatically sent frames without programmer error? |
I include sample code:
The reason is that a heartbeat is sent after the socket has been aborted, which causes the error:
In my opinion, a heartbeat should not be sent when the socket is broken. For now, I can catch such exceptions by using a CoroutineExceptionHandler inside the launch call. |
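A minimal sketch of the CoroutineExceptionHandler workaround mentioned above, assuming kotlinx.coroutines is on the classpath. It is not the reporter's code: `failingSessionWork` and `runWithHandler` are hypothetical names, and the thrown SocketException only imitates the error from this issue. The handler only sees failures of coroutines launched in the scope it is installed on.

```kotlin
import java.net.SocketException
import java.util.concurrent.atomic.AtomicReference
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for session code that fails like the aborted socket does.
fun failingSessionWork() {
    throw SocketException("Software caused connection abort")
}

// Runs the failing work in a coroutine and returns the message seen by the handler.
fun runWithHandler(): String? {
    val seen = AtomicReference<String?>(null)
    val handler = CoroutineExceptionHandler { _, e -> seen.set(e.message) }
    // The handler is only consulted for root coroutines or direct children
    // of a supervisor, so a dedicated supervisor scope is used here.
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
    val job = scope.launch(handler) { failingSessionWork() }
    runBlocking { job.join() } // join() does not rethrow the failure
    return seen.get()
}

fun main() {
    println("Handled: ${runWithHandler()}")
}
```

Passing the same handler to a `launch` nested inside another coroutine would have no effect, because child coroutines delegate failures to their parent.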
Got your point, thanks a lot for the sample code. It is indeed incorrect for Krossbow to try to send heart beats when the session is already closed. I will look into that. The |
The worst part of this bug is that the reconnect mechanism does not recognize such a case. Even when you call |
I'm confused, does this happen with Ktor or OkHttp? The stack trace points to OkHttp (without any Krossbow methods), but you mention Ktor's Also, it'd be great to be able to reproduce this outside of Android's environment, so we could write a test case for it. Last but not least, do you reproduce the issue in Krossbow 3.1.0? |
Yes, it is reproducible on 3.1.0.
But to me it shouldn't matter; it can probably be reproduced on plain OkHttp without Ktor.
If I disable the heartbeat, the SocketException no longer appears in this case. The simplest workaround is to add a global handler that ignores this exception, like this:
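The snippet originally posted here is not preserved in this text. As an illustration only, one possible shape for such a global handler uses the JVM's `Thread.setDefaultUncaughtExceptionHandler`; `isConnectionAbort` and `installGlobalHandler` are hypothetical names, and matching on the exception message is an assumption.

```kotlin
import java.net.SocketException

// True when the throwable looks like the socket abort discussed in this issue.
// Matching on the message text is an assumption made for this sketch.
fun isConnectionAbort(e: Throwable): Boolean =
    e is SocketException && e.message?.contains("Software caused connection abort") == true

// Installs a process-wide handler that swallows the socket abort and
// forwards every other uncaught exception to the previous handler.
fun installGlobalHandler() {
    val previous = Thread.getDefaultUncaughtExceptionHandler()
    Thread.setDefaultUncaughtExceptionHandler { thread, e ->
        if (isConnectionAbort(e)) {
            println("Ignoring socket abort on ${thread.name}")
        } else if (previous != null) {
            previous.uncaughtException(thread, e)
        } else {
            e.printStackTrace()
        }
    }
}

fun main() {
    installGlobalHandler()
    val worker = Thread { throw SocketException("Software caused connection abort") }
    worker.start()
    worker.join()
    println("Process still alive")
}
```

This hides the symptom rather than the cause: the failing thread or coroutine still terminates, only the crash is suppressed.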
|
Thanks a lot for the details. Ok so you're using
If you confirmed this by debugging then that's ok, thanks. I just didn't get why the stacktrace you provided wouldn't include that. Indeed if an error on the socket occurs, the heart beat shouldn't be sent. I can think of 2 possibilities where this could happen in 3.1.0 (maybe there are more):
To prevent the first one, we could cancel the heart beat job early, but that would still be racing with the error bubbling. So it would still be possible that the heart beat kicks in while the error callbacks haven't reached the point where the session is aborted and all internal coroutines cancelled. In order to prevent heart beat failures, we could check if the websocket is still open before attempting to send a heart beat ( As a side note, both
val webSocketClient = KtorWebSocketClient(httpClient).withAutoReconnect {
    maxAttempts = Int.MAX_VALUE
}
val client = StompClient(webSocketClient) {
    heartBeat = HeartBeat(5.seconds, 13.seconds)
} |
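The idea of checking that the web socket is still open before sending a heart beat can be sketched as follows. This is not Krossbow's implementation: `Connection`, `FakeConnection` and `sendHeartBeatIfOpen` are hypothetical names used only for illustration. Because the check can still race with the socket failing, the sketch also treats an I/O error during the send as "closed" instead of letting it propagate.

```kotlin
import java.io.IOException
import java.net.SocketException

// Hypothetical minimal view of a web socket connection; not Krossbow's API.
interface Connection {
    val isOpen: Boolean
    fun sendText(frame: String)
}

// Sends a heart beat only if the connection reports itself open.
// Returns true if the heart beat was sent, false otherwise.
fun sendHeartBeatIfOpen(connection: Connection): Boolean {
    if (!connection.isOpen) return false
    return try {
        connection.sendText("\n") // a STOMP heart beat is a single end-of-line
        true
    } catch (e: IOException) {
        // The socket failed between the check and the send: report it as
        // not sent rather than crashing the caller.
        false
    }
}

// Test double: can report "open" while the underlying socket is already dead.
class FakeConnection(
    override var isOpen: Boolean,
    private val socketDead: Boolean = false,
) : Connection {
    val sent = mutableListOf<String>()
    override fun sendText(frame: String) {
        if (socketDead) throw SocketException("Software caused connection abort")
        sent += frame
    }
}

fun main() {
    println(sendHeartBeatIfOpen(FakeConnection(isOpen = true)))
    println(sendHeartBeatIfOpen(FakeConnection(isOpen = false)))
    println(sendHeartBeatIfOpen(FakeConnection(isOpen = true, socketDead = true)))
}
```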
hello @joffrey-bion. I am facing this exact same issue in version 7.0.0. Was this resolved? I have my heartbeats set and I was testing a few things in my Android application. My backend uses Spring Boot. As expected, the logs of my Spring Boot backend showed that the heartbeats stopped appearing and that the server destroyed my session. When I brought my application back to life, it crashed with "Software caused connection abort". I was able to hit this breakpoint before the app crashed
The app control was at " |
Yes this should have been fixed a long time ago. Thanks for reporting that it happens again, and for the details to help reproduce it. I reopened this issue to have another look. Could you please share a little more details about which artifacts you were using? Are you using Ktor with the OkHttp engine? Or OkHttp directly? |
Oh yes, I am actually using the Ktor engine. |
Thanks! And did you have a chance to try this experiment with the latest version of Krossbow? |
Yes indeed, my Krossbow version is krossbow-version = "7.0.0" |
If it's not too much hassle, could you please confirm that you can still reproduce with Krossbow 8.0.0? I don't think the changes in Krossbow 8 should change this specific behaviour, but I'd just like to be sure to avoid investigating a ghost :) |
@joffrey-bion that is correct. Let me test this out with version 8 and get back to you. |
@joffrey-bion surprisingly, the issue does not appear in 8.0.0, despite your comment that you didn't change anything relevant to it. Let's chalk this one up to some random act of God :D. You can close the issue while I monitor more closely, and I will get back to you if I see any unexpected behavior. Thanks again |
Thank you so much for testing this! Ok I'll close for now, and we'll reopen if it pops up again. |
hey @joffrey-bion, I am now able to trigger this exception consistently. It may occur when the coroutine scope has already been cancelled (for example because of a network disconnection), yet we still try to use it by calling some of the StompSession functions (this is just a hypothesis). I have a code snippet that always produces this crash. Keep in mind that this code only illustrates the situation; we do not do this in production. That being said, I still feel that this exception should somehow be caught. Here is the code
This is how we connect to stomp
So what I have understood so far is that if the catch block has been invoked, there is no need to call disconnect, since an abnormal termination indicates that a disconnection has already taken place. That being said, maybe you can make more sense of why this occurs given the above? I only posted here because last time we spoke we were unsure how to reproduce this. Thanks |
Thank you so much for sharing this. I'll reopen the issue so I can take a look when I have some free time. |
Describe the bug
During a websocket connection, when the user turns off the Wi-Fi connection, a "Software caused connection abort" exception is thrown. This exception cannot be caught, so it crashes the application. I'm still debugging to find which component causes this error, but it's pretty hard; it would be great if anyone could confirm that it occurs.
Reproduction and additional details
Context
Artifacts used:
krossbow-stomp-core
krossbow-stomp-jackson
krossbow-stomp-kxserialization
krossbow-websocket-core
krossbow-websocket-sockjs
krossbow-websocket-spring
krossbow-websocket-okhttp
krossbow-websocket-ktor