ktor websocket client crashes on Android with java.io.IOException: Software caused connection abort

Created on 18 Jul 2019 · 16 Comments · Source: ktorio/ktor

Version: 1.2.2, client, io.ktor:ktor-client-cio, io.ktor:ktor-client-websockets
Android 7.0

The crash occurs when using the websocket client.

To Reproduce
code:

import io.ktor.client.HttpClient
import io.ktor.client.engine.cio.CIO
import io.ktor.client.features.websocket.*
import io.ktor.http.*
import io.ktor.http.cio.websocket.*
import java.io.IOException
import kotlinx.coroutines.*
import kotlinx.coroutines.selects.select

// configuration.url is the application's websocket endpoint
val uri = URLBuilder().takeFrom(configuration.url).build()
val httpClientEngine = CIO.create { }
val client = HttpClient(httpClientEngine).config {
    install(WebSockets)
}

// Drains incoming frames until the session is closed or fails
suspend fun process(session: DefaultClientWebSocketSession) {
    while (session.isActive && !session.closeReason.isCompleted) {
        try {
            select<Unit> {
                session.incoming.onReceiveOrNull { frame ->
                    when (frame) {
                        null -> { // channel has been closed
                            println("closed")
                            session.close()
                        }
                        else -> Unit
                    }
                }
            }
        } catch (e: IOException) {
            println("io exception: $e")
            session.close()
        } catch (e: Throwable) {
            println("exception: $e")
            session.close()
        }
    }
}

runBlocking {
    try {
        // Pick the secure or plain websocket call based on the URL scheme
        if (uri.protocol.isSecure()) {
            client.wss(host = uri.host, port = uri.port, path = uri.encodedPath) { process(this) }
        } else {
            client.ws(host = uri.host, port = uri.port, path = uri.encodedPath) { process(this) }
        }
    } catch (ex: Exception) {
        println("caught: $ex")
    }
}

  1. Start the code with the internet connection enabled and with some websocket url.
  2. Disconnect the internet.

An IOException will be thrown, but not caught anywhere.

Expected behavior
I'm expecting that this IOException won't leak and can be caught by try/catch.

bug

All 16 comments

Stack trace:

E/com.eld.android.App: [loggingDispatcher] Software caused connection abort
    java.io.IOException: Software caused connection abort
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:382)
        at kotlinx.io.nio.ChannelsKt.read(Channels.kt:117)
        at io.ktor.network.sockets.CIOReaderKt$attachForReadingDirectImpl$1$1.invokeSuspend(CIOReader.kt:75)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.ResumeModeKt.resumeMode(ResumeMode.kt:67)
        at kotlinx.coroutines.DispatchedKt.resume(Dispatched.kt:309)
        at kotlinx.coroutines.DispatchedKt.resumeUnconfined(Dispatched.kt:49)
        at kotlinx.coroutines.DispatchedKt.dispatch(Dispatched.kt:295)
        at kotlinx.coroutines.CancellableContinuationImpl.dispatchResume(CancellableContinuationImpl.kt:250)
        at kotlinx.coroutines.CancellableContinuationImpl.resumeImpl(CancellableContinuationImpl.kt:260)
        at kotlinx.coroutines.CancellableContinuationImpl.resumeWith(CancellableContinuationImpl.kt:189)
        at io.ktor.network.selector.SelectorManagerSupport.handleSelectedKey(SelectorManagerSupport.kt:83)
        at io.ktor.network.selector.SelectorManagerSupport.handleSelectedKeys(SelectorManagerSupport.kt:63)
        at io.ktor.network.selector.ActorSelectorManager.process(ActorSelectorManager.kt:74)
        at io.ktor.network.selector.ActorSelectorManager$process$1.invokeSuspend(ActorSelectorManager.kt)
        at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
        at kotlinx.coroutines.DispatchedTask.run(Dispatched.kt:238)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
        at java.lang.Thread.run(Thread.java:762)

Investigation history:

The problem may be in the Socket.tls(coroutineContext: CoroutineContext, config: TLSConfig) function:

  1. It calls val reader = openReadChannel(),
  2. which calls attachForReading(channel: ByteChannel): WriterJob,
  3. which calls CoroutineScope.attachForReadingDirectImpl(...) (the scope here is the internal abstract class NIOSocketImpl, which implements CoroutineScope),
  4. which calls writer(Dispatchers.Unconfined, channel) { .. block .. }.
  5. Inside this block, "val rc = nioChannel.read(buffer)" is the call that throws the IOException.
     Relevant code from internal abstract class NIOSocketImpl<out S>, which implements CoroutineScope:

    override val socketContext: CompletableJob = Job()

    override val coroutineContext: CoroutineContext
        get() = socketContext

  6. attachForReadingDirectImpl() is called on this CoroutineScope, which is just a Job() with no dispatcher and no parent context. The scope is therefore not bound to any coroutine, to the callers of tls(), or to the ktor CIO context; it just runs on Dispatchers.Default with Job().
     So there is no way to catch this exception, and it bubbles up to the global exception handler.
  7. The app crashes fatally.
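
The effect described above can be reproduced without ktor. What follows is only a minimal sketch using kotlinx.coroutines: DetachedSocket and runDetachedRead are hypothetical names made up for illustration, launch stands in for ktor's writer call, and a CoroutineExceptionHandler is added purely so the demo can observe the failure instead of crashing (per the analysis above, the real class installs none).

```kotlin
import java.io.IOException
import java.util.concurrent.atomic.AtomicReference
import kotlin.coroutines.CoroutineContext
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for NIOSocketImpl: a scope whose only Job has no parent.
class DetachedSocket(handler: CoroutineExceptionHandler) : CoroutineScope {
    private val socketContext = Job()

    // The handler exists only so this demo can record the failure.
    override val coroutineContext: CoroutineContext = socketContext + handler

    // Simulates the read loop failing when the network goes away.
    fun startReading(): Job = launch {
        throw IOException("Software caused connection abort")
    }
}

// Returns (exception seen by the handler, exception seen by the caller's try/catch).
fun runDetachedRead(): Pair<Throwable?, Throwable?> {
    val seenByHandler = AtomicReference<Throwable?>(null)
    var seenByTryCatch: Throwable? = null
    val handler = CoroutineExceptionHandler { _, e -> seenByHandler.set(e) }
    runBlocking<Unit> {
        try {
            // The caller waits for the read coroutine but is not its parent.
            DetachedSocket(handler).startReading().join()
        } catch (e: Exception) {
            seenByTryCatch = e
        }
    }
    return seenByHandler.get() to seenByTryCatch
}

fun main() {
    val (byHandler, byTryCatch) = runDetachedRead()
    println("handler saw: $byHandler")
    println("try/catch saw: $byTryCatch")
}
```

In this sketch the handler should receive the IOException while the caller's try/catch sees nothing, because the failing coroutine's Job has no link to the caller's coroutine. Without a handler in the context, the same exception would go to the thread's uncaught exception handler, which on Android terminates the app.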

Very similar to my issue, "CIO server unhandled exception during network shutdown". It has been two weeks now with no feedback.

Have you found any temporary solution for that?
Also, how are the sockets reconnected when you turn your connection back on?

No, currently there is no way to use ktor websockets on the Android platform without risking this crash. We've temporarily removed the websocket from our app, as it was used only for telemetry data.

I had hoped that this would be fixed in the latest release, but it seems it's not trivial to fix.

I think I found something. I am not sure it is the right way, but I can say it does not crash now.

Before calling session.incoming.onReceiveOrNull, I check session.incoming.isEmpty and handle that case. Now the loop just stops when the internet is disconnected.

I don't think that's a reliable solution. This fix may lower the probability of a crash, but it doesn't make one impossible: looking inside ktor's code, there is no way to handle the exception if it is thrown during a non-blocking socket read.

If anyone is looking for a workaround: the okhttp engine throws an error that can be caught.
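
For reference, swapping the engine might look roughly like the sketch below. It assumes the io.ktor:ktor-client-okhttp artifact is on the classpath and uses the ktor 1.x package names; createOkHttpWebSocketClient is a hypothetical helper name. Note that the OkHttp engine targets the JVM and Android only.

```kotlin
import io.ktor.client.HttpClient
import io.ktor.client.engine.okhttp.OkHttp
import io.ktor.client.features.websocket.WebSockets

// Same client setup as in the report, but backed by OkHttp instead of CIO.
// Assumption: ktor 1.x with the ktor-client-okhttp dependency added.
fun createOkHttpWebSocketClient(): HttpClient = HttpClient(OkHttp) {
    install(WebSockets)
}
```

The rest of the reproduction code (client.ws / client.wss and the process loop) would stay unchanged; only the engine differs.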

Is okhttp multiplatform?

How about this solution: just change Job() to SupervisorJob() at this line?
https://github.com/ktorio/ktor/blob/master/ktor-network/jvm/src/io/ktor/network/sockets/NIOSocket.kt#L29
I've seen that there is some internal logic that catches the exception and rethrows it into another coroutineContext, so maybe changing that Job to a SupervisorJob would be enough to keep the exception from reaching the global handler.

This may not be applicable to every application that is running into this issue, but if you want to swallow this IOException so that the app doesn't fatally crash, you can override the thread's default exception handler with custom behavior using Thread.setDefaultUncaughtExceptionHandler() in MainActivity.onCreate().
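
A rough sketch of that idea, using only the JVM standard library, could look like this. The names installIoSwallowingHandler and swallowed are hypothetical, and the choice to swallow every IOException while forwarding everything else is an assumption for illustration; on Android the call would go in MainActivity.onCreate() or in the Application class.

```kotlin
import java.io.IOException
import java.util.concurrent.CopyOnWriteArrayList

// Collects swallowed network errors so they can be logged or inspected later.
val swallowed = CopyOnWriteArrayList<Throwable>()

fun installIoSwallowingHandler() {
    // Keep the previous handler so other crashes are still reported as before.
    val previous = Thread.getDefaultUncaughtExceptionHandler()
    Thread.setDefaultUncaughtExceptionHandler { thread, error ->
        if (error is IOException) {
            // Swallow the network error instead of letting it kill the process.
            swallowed.add(error)
        } else if (previous != null) {
            previous.uncaughtException(thread, error)
        } else {
            error.printStackTrace()
        }
    }
}

fun main() {
    installIoSwallowingHandler()
    // Simulate a background thread dying with the exception from the report.
    val worker = Thread { throw IOException("Software caused connection abort") }
    worker.start()
    worker.join()
    println("swallowed: ${swallowed.map { it.message }}")
}
```

This only hides the symptom: the socket is still dead after the exception, so the application has to detect that and reconnect on its own.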

We are still waiting for the ktor team to fix this critical bug, which was introduced some versions ago and has never been fixed. It causes crashes not only on Android but also on the server side, which makes multiplatform websockets unusable.

Fixed in 1.3.0

I have an issue related to this case:
Using a try-catch block or a CoroutineExceptionHandler, I cannot handle the IOException thrown by fun AReadable.openReadChannel() in io.ktor.network.sockets.Socket.kt.
Is it possible to handle the exception in the block of code where openReadChannel is called?

@Bellkross I am fighting with the same issue using ktor Raw Sockets. I see IOException: Connection reset by peer in the logcat error log, but I can't catch it. I've tried CoroutineExceptionHandler and SupervisorJob. This exception is just somewhere
