Ktor: New crash in 1.3.1 on iOS in ReleaseHeapRefStrict

Created on 14 Feb 2020  ·  17Comments  ·  Source: ktorio/ktor

Ktor Version and Engine Used (client or server and name)
Common client (HttpClient { … } so default engine) on iOS (arm64) version 1.3.1

Describe the bug
Upgrading from 1.3.0 to 1.3.1 introduces a new fatal error which crashes the app, with very little details.

To Reproduce
I don't have a reproducer, but the error really looks like #1380. It somehow happens 100% of the time when my app starts, but not on the first request. Weird.

Screenshots

Main thread when the crash occurs:
image
Associated line of code: JSON_INDENTED.stringify(JsonElementSerializer, Json.plain.parseJson(maybeJson))
JSON_INDENTED is Json(JsonConfiguration(prettyPrint = true, indent = " "))
maybeJson is a String representation of the received body, which I'm about to log. It's a custom request Logging feature. Note that I did not change the version of kotlinx.serialization

Thread inside which the crash occurred:
image
You can see references to ReleaseHeapRefStrict like in #1380.

I also see these logs in 1.3.0, I don't see them in 1.3.1 and I feel like it crashes before they appear

2020-02-14 15:25:55.352209+0100 MyTarget[4159:1595620] [] tcp_input [C44.1:3] flags=[R] seq=295735571, ack=0, win=0 state=LAST_ACK rcv_nxt=295735571, snd_una=390992163
2020-02-14 15:25:55.354843+0100 MyTarget[4159:1595620] [] tcp_input [C44.1:3] flags=[R] seq=295735571, ack=0, win=0 state=CLOSED rcv_nxt=295735571, snd_una=390992163
bug

Most helpful comment

The bug reproduced, and the fix is mostly done. Sorry for the delay.

All 17 comments

Cc @sandwwraith

Cc @olonho any ideas?

Not enough info, crash is in very generic GC code, we need reproducer.

I really don't think I can have a reproducer for this. Could this be caused by #1480? I'm trying to find what change could cause this to know where to look. I have both a custom Logging and a custom Auth Ktor client features and maybe I'm doing something wrong there. That's my only guess for now

When I'm saying something wrong, I mean something that "used" the fact that something leaked to work

Ok so I think I found when it occurs. Here is a snippet from my custom LogFeature:
````kotlin
override fun install(feature: LogFeature, scope: HttpClient) {

        // ... request logger setup, error response logger setup

        // Success response
        ResponseObserver.install(ResponseObserver { response ->
            // Intercepts on receivePipeline.Before
            if (response.status.isSuccess()) {
                HttpLogger.logResponse(response)
            }
        }, scope)
    }

I tried to reproduce again and again and while it still occurs quite randomly, it always occurs in the middle of the logging of a success response, more precisely always after the response headers have been successfully logged and before the body is logged. Here is the matching code in `HttpLogger` used above: kotlin
suspend fun logResponse(response: HttpResponse) {
val icon = if (response.status.isSuccess()) '✅' else '❌'
val requestId = response.call.getRequestId()
val statusCode = response.status.value
val url = response.call.request.url.toString()
log.info("🌎🔽$icon $requestId $statusCode $url")
val headers = headersToString(response.headers.entries())
if (headers != null) {
// This is always the last log before the crash
log.debug("Headers:\n$headers")
}
val body = responseBodyToString(response.contentType(), response.content)
if (body != null) {
// This log doesn't occur while there definitely is a body to be logged,
// which works most of the time
log.debug("Body:\n$body")
}
@Suppress("deprecation")
Bugsnag.breadcrumb(
"HTTP Response",
mapOf(
"requestId" to requestId,
"verb" to response.call.request.method.value,
"statusCode" to statusCode.toString(),
"headers" to headers,
"body" to body?.ellipsize()
),
REQUEST
)
}
````
Note: yes, I have emojis in my logs

So my guess is that it crashes at some point here:
````kotlin

@Suppress("EXPERIMENTAL_API_USAGE")
private val JSON_INDENTED = Json(JsonConfiguration(prettyPrint = true, indent = "  "))

private val passwordObfuscatorRegex = "\"password\" ?: ?\".*?(?<!\\\\)\"".toRegex()

// ... logResponse shown above defined here, among other things

internal suspend fun responseBodyToString(contentType: ContentType?, body: ByteReadChannel): String? =
    body
        .readText(contentType?.charset() ?: Charsets.UTF_8)
        .let { if (it.isBlank()) null else tryPrettyJson(obfuscate(it)) }

@Suppress("EXPERIMENTAL_API_USAGE")
private fun tryPrettyJson(maybeJson: String): String =
    try {
        JSON_INDENTED.stringify(JsonElementSerializer, Json.plain.parseJson(maybeJson))
    } catch (error: Throwable) {
        maybeJson
    }

private fun obfuscate(raw: String): String =
    raw.replace(passwordObfuscatorRegex, """"password":"************"""")

````

Again, this code works 100% of the time with Ktor 1.3.0, and always ends up crashing the app with Ktor 1.3.1 (after a few to dozens of requests).

So I had to stop using ResponseObserver because it started to deadlock again (it already did that once a very long time ago, before Ktor 1.0 I think). Having just ResponseObserver.install(ResponseObserver { }, scope) would prevent my get from ever returning. But that's another weird bug that I don't know much about.

So I changed
````kotlin
// Error response
scope.responsePipeline.insertPhaseBefore(
scope.responsePipeline.items.first(),
phaseResponseFirst
)
@Suppress("EXPERIMENTAL_API_USAGE")
scope.responsePipeline.intercept(phaseResponseFirst) {
try {
proceed()
} catch (e: ResponseException) {
val (contentForLogging, contentForError) = e.response.content.split(scope)
HttpLogger.logResponse(context.wrapWithContent(contentForLogging).response)
throw BadResponseException(e)
.withRequestInfo(e.response.call.request)
.withResponseInfo(context.wrapWithContent(contentForError).response)
}
}

        // Success response
        ResponseObserver.install(ResponseObserver { response ->
            // Intercepts on receivePipeline.Before
            if (response.status.isSuccess()) {
                HttpLogger.logResponse(response)
            }
        }, scope)

to kotlin
// Response
scope.responsePipeline.insertPhaseBefore(
scope.responsePipeline.items.first(),
phaseResponseFirst
)
@Suppress("EXPERIMENTAL_API_USAGE")
scope.responsePipeline.intercept(phaseResponseFirst) {
try {
proceed()
HttpLogger.logResponse(context.response)
} catch (e: ResponseException) {
val (contentForLogging, contentForError) = e.response.content.split(scope)
HttpLogger.logResponse(context.wrapWithContent(contentForLogging).response)
throw BadResponseException(e)
.withRequestInfo(e.response.call.request)
.withResponseInfo(context.wrapWithContent(contentForError).response)
}
}
````

And now it looks like I'm back at this:
image
With this on another thread as usual:
image
Happens after a few dozens requests.

I wasn't able to find the cause of the issue and I didn't have more time to spend on this, so I worked around it by replacing this line
kotlin JSON_INDENTED.stringify(JsonElementSerializer, Json.plain.parseJson(maybeJson))
with a custom algorithm which prettifies a raw json String.

EDIT: Well it still crashes from time to time while unrelated code is running. It's always when GC runs basically. I rolled back to 1.3.0 and there is no problem. Hopefully I won't stay stuck on this version forever...

So here is what I know:

  • Never occurs with Ktor 1.3.0
  • Occurs 100% of the time with Ktor 1.3.1, after some time / requests (a few minutes usually)
  • Occurs quickly on an actual device (after like 10-30 requests) while taking a very long time to occur in a simulator (like 10min and hundreds of requests), maybe because of the huge difference on the amount of RAM available
  • Thread 16: EXC_BAD_ACCESS (code=1, address=0x38)

Things I'm not sure about:

  • At first it seemed like it was occurring while logging the response I received for the request (see messages above), but I realized it was the moment I allocated the biggest amount of RAM in a short time (I think). Replacing this code which was allocating a lot of RAM (see previous message) made the error a little more rare, but it still occurs with different stacktraces in the main thread.

EDIT: Another stack trace
image
image
This one occurred on device (iPad 6) after 27min of run.

EDIT: Another one
image
And the thread the crash occurs in has the same stacktrace every time it seems

So I had to stop using ResponseObserver because it started to deadlock again (it already did that once a very long time ago, before Ktor 1.0 I think).

I have been facing the same issue. I was about to raise an issue and then I saw this. For my case this is actually on server with ktor-client-core-jvm version 1.3.0 and 1.3.1 .

I'm now using the ResponseObserver again. I don't know what makes it deadlock sometimes...

Just to update on the ResponseObserver choking, if you are reading content, do it in a new coroutine, then it doesn't choke at all.

Well the Ktor Logging feature doesn't do that and I couldn't get it to choke the same way when I tried to make a reproducer for the main issue

Now that Kotlin 1.3.70 is out, this issue starts to be annoying. Unless Ktor 1.3.0 can work with Kotlin 1.3.70, I'm stuck on Kotlin 1.3.61 until this issue is resolved.

It will be really annoying when Kotlin 1.4 appears.

Maybe an issue should be created on YouTrack to help setting priorities on JetBrains end?

The bug reproduced, and the fix is mostly done. Sorry for the delay.

The workaround is available in 1.3.2. The proper fix will be possible after the kotlin-native update.

Was this page helpful?
0 / 5 - 0 ratings