Flipper: [Android] Native crash (SIGSEGV) on app startup

Created on 30 Jun 2018 · 30 Comments · Source: facebook/flipper

I tried to integrate Sonar into the app that I work on. With versions 0.0.5 and 0.0.8, I noticed a crash in native code on some phones, right at app startup. It happens so quickly that I can't even reach the main activity or any point where I could attach the debugger.

Unfortunately, this does not reproduce 100% of the time, nor on all phones. Once I hit it, the only option available to me was to disable Sonar. Uninstalling and reinstalling the app did not help. I was able to reproduce it on a Nexus 6P and a Google Pixel 2.

I'm attaching the stack trace below. I'm not sure whether it's helpful or whether it points to the actual problem. Let me know if there is any debug option I can turn on to provide more useful logs.

Thank you.

06-29 16:33:05.376 22827-22851/? A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0x16 in tid 22851 (lix.mediaclient), pid 22827 (lix.mediaclient)
06-29 16:33:05.440 22857-22857/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
   Build fingerprint: 'google/walleye/walleye:8.1.0/OPM2.171026.006.C1/4769658:user/release-keys'
   Revision: 'MP1'
   ABI: 'arm'
   pid: 22827, tid: 22851, name: lix.mediaclient  >>> com.netflix.mediaclient <<<
   signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x16
   Cause: null pointer dereference
       r0 00000002  r1 d118de64  r2 d118debc  r3 00000000
       r4 00000002  r5 d118debc  r6 00002710  r7 d118df18
       r8 d118e010  r9 ed1331b0  sl 00000006  fp d118e0c8
       ip d147d724  sp d118de50  lr d14f0fd8  pc d14f0d40  cpsr 600e0010
06-29 16:33:05.441 22857-22857/? A/DEBUG: backtrace:
       #00 pc 0002fd40  /data/app/com.netflix.mediaclient-FYX3_ueYVM2-PK2yN2gG3g==/lib/arm/libssl.so (__gnu_Unwind_Resume+8)
       #01 pc 0002ffd4  /data/app/com.netflix.mediaclient-FYX3_ueYVM2-PK2yN2gG3g==/lib/arm/libssl.so (___Unwind_Resume+20)
06-29 16:33:06.477 873-873/? E//system/bin/tombstoned: Tombstone written to: /data/tombstones/tombstone_00
06-29 16:33:06.495 1144-1184/? W/zygote64: kill(-20841, 9) failed: No such process
bug

All 30 comments

The crash seems to occur in libssl.so. We're currently working on replacing that with a newer version from either BoringSSL or OpenSSL 1.1.0. Not sure why this is happening, but I guess there's a reasonable chance that the update will fix it.

@passy, thank you, I'll give it another shot after that update.

I've run into an issue where the app crashes if the Sonar desktop app has not been launched.

07-03 14:19:19.036 22874-22934/com.photo.staging A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0x16 in tid 22934 (m.xxx.staging), pid 22874 (m.xxx.staging)

Could you try this with 0.0.9? I updated OpenSSL to a vanilla 1.1.0h version. If there are still native crashes, I wonder if this could be due to zygote already having OpenSSL symbols mapped into the process space. Perhaps we would need to link statically in that case, but I'd like to hear if this is still happening first.

@passy, I upgraded our dependency to 0.0.9, and it seems to be working fine on both Nexus 6P and Pixel 2. We don't see any native crashes on startup. The crash was not a 100% repro on the 6P, so I'll keep an eye out if it re-appears.

Thank you so much for updating OpenSSL -- seems to have done the trick!

Happy to hear that. Let me know if you see it again. I've also pushed out 0.10.0 with support for x86_64 emulators.

Since the issue is solved, I'm closing it. Feel free to reopen if the issue pops up again.

Unfortunately, this new library is causing a crash on startup on a Sony tablet (SGP511) with Android 5.1.1:

07-05 11:30:43.945 1877-1943/com.netflix.mediaclient E/art: dlopen("/data/app/com.netflix.mediaclient-2/lib/arm/libsonar.so", RTLD_LAZY) failed: dlopen failed: cannot locate symbol "OPENSSL_sk_num" referenced by "libfolly.so"...
07-05 11:30:43.945 1877-1943/com.netflix.mediaclient E/SoLoader: couldn't find DSO to load: libsonar.so caused by: dlopen failed: cannot locate symbol "OPENSSL_sk_num" referenced by "libfolly.so"...
07-05 11:30:43.960 1877-1943/com.netflix.mediaclient E/AndroidRuntime: couldn't find DSO to load: libsonar.so caused by: dlopen failed: cannot locate symbol "OPENSSL_sk_num" referenced by "libfolly.so"...
    java.lang.UnsatisfiedLinkError: couldn't find DSO to load: libsonar.so caused by: dlopen failed: cannot locate symbol "OPENSSL_sk_num" referenced by "libfolly.so"...
        at com.facebook.soloader.SoLoader.doLoadLibraryBySoName(SoLoader.java:681)
        at com.facebook.soloader.SoLoader.loadLibraryBySoName(SoLoader.java:547)
        at com.facebook.soloader.SoLoader.loadLibrary(SoLoader.java:484)
        at com.facebook.soloader.SoLoader.loadLibrary(SoLoader.java:444)
        at com.facebook.sonar.android.EventBase.<clinit>(EventBase.java:19)
        at com.facebook.sonar.android.SonarThread.run(SonarThread.java:25)
07-05 11:30:59.004 1877-1943/com.netflix.mediaclient E/AndroidRuntime: FATAL EXCEPTION: SonarEventBaseThread
    Process: com.netflix.mediaclient, PID: 1877
    java.lang.UnsatisfiedLinkError: couldn't find DSO to load: libsonar.so caused by: dlopen failed: cannot locate symbol "OPENSSL_sk_num" referenced by "libfolly.so"...
        at com.facebook.soloader.SoLoader.doLoadLibraryBySoName(SoLoader.java:681)
        at com.facebook.soloader.SoLoader.loadLibraryBySoName(SoLoader.java:547)
        at com.facebook.soloader.SoLoader.loadLibrary(SoLoader.java:484)
        at com.facebook.soloader.SoLoader.loadLibrary(SoLoader.java:444)
        at com.facebook.sonar.android.EventBase.<clinit>(EventBase.java:19)
        at com.facebook.sonar.android.SonarThread.run(SonarThread.java:25)

Is this library compatible only with a certain minSdk?
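
For anyone comparing similar logs, here is a minimal Python sketch that pulls the missing symbol and the referencing library out of linker messages like the ones above. The function name `missing_symbols` is made up for illustration; only the message format is taken from the log. As far as I know, `OPENSSL_sk_num` is the OpenSSL 1.1.0 name for what 1.0.x exported as `sk_num`, so a missing `OPENSSL_sk_num` would be consistent with an older OpenSSL being resolved first, though the log alone does not prove that.

```python
import re

# Matches the dynamic linker message quoted in the log above.
_MISSING_SYMBOL = re.compile(
    r'cannot locate symbol "(?P<symbol>[^"]+)" referenced by "(?P<library>[^"]+)"'
)


def missing_symbols(log_text):
    """Return the unique (symbol, referencing library) pairs in a logcat dump."""
    seen = []
    for match in _MISSING_SYMBOL.finditer(log_text):
        pair = (match.group("symbol"), match.group("library"))
        if pair not in seen:
            seen.append(pair)
    return seen


# Two lines copied from the log above; both report the same symbol.
sample = (
    'E/art: dlopen("/data/app/com.netflix.mediaclient-2/lib/arm/libsonar.so", RTLD_LAZY) '
    'failed: dlopen failed: cannot locate symbol "OPENSSL_sk_num" referenced by "libfolly.so"...\n'
    "E/SoLoader: couldn't find DSO to load: libsonar.so caused by: dlopen failed: "
    'cannot locate symbol "OPENSSL_sk_num" referenced by "libfolly.so"...'
)

print(missing_symbols(sample))
```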

@rohandhruva Could you include some events from before it gets to this? I'm curious if there's a mention of which libssl.so it tries to load for those symbols. All the symbols referenced there are present in the armeabi variant of libssl.so that's included in the AAR.

@passy here's the entire log I was able to capture: https://gist.github.com/rohandhruva/4a8fc165df939d65db34066edd4d174a

Is this more useful? Thanks.

@rohandhruva It is, thank you.

I'm a bit surprised to see no mention of libssl.so or libcrypto.so in that output. Just to rule this out, are you bundling your own version of that by any chance?

Otherwise, the "unused DT entry" about the VERNEEDNUM entry in the .sos could cause some trouble. We could try to run the libraries for ARM through an "elf cleaner" like this one here: https://github.com/kost/android-elf-cleaner

I'm currently on holiday, but I can look into that next week.
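
To make the "unused DT entry" idea concrete, here is a rough Python sketch that lists the symbol-versioning entries in a dynamic section, which are presumably the kind of entries an ELF cleaner would strip. It assumes you have already extracted the raw bytes of a 32-bit little-endian .dynamic section by other means; `version_entries` is a hypothetical helper, and only the tag constants come from the ELF specification and its GNU extensions.

```python
import struct

# Symbol-versioning tags (GNU extensions to the ELF dynamic section).
VERSION_TAGS = {
    0x6FFFFFF0: "DT_VERSYM",
    0x6FFFFFFC: "DT_VERDEF",
    0x6FFFFFFD: "DT_VERDEFNUM",
    0x6FFFFFFE: "DT_VERNEED",
    0x6FFFFFFF: "DT_VERNEEDNUM",
}
DT_NULL = 0  # marks the end of the dynamic array


def version_entries(dynamic_bytes):
    """List versioning entries in a raw 32-bit little-endian .dynamic section."""
    found = []
    # Each Elf32_Dyn entry is 8 bytes: a 4-byte tag followed by a 4-byte value.
    for offset in range(0, len(dynamic_bytes) - 7, 8):
        tag, value = struct.unpack_from("<II", dynamic_bytes, offset)
        if tag == DT_NULL:
            break
        if tag in VERSION_TAGS:
            found.append((VERSION_TAGS[tag], value))
    return found


# Synthetic section: DT_NEEDED, DT_VERNEED, DT_VERNEEDNUM, DT_NULL.
example = struct.pack(
    "<IIIIIIII", 1, 0x10, 0x6FFFFFFE, 0x2000, 0x6FFFFFFF, 2, 0, 0
)
print(version_entries(example))
```

An empty result for every bundled .so would suggest the warning comes from somewhere else.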

I think we do include a version of libssl in our app, I see some references to it, but I'm not familiar enough with native development to check if that's what's causing this crash.

Multiple versions will definitely cause trouble. I think the only way to work around that would be to statically link in at least one of the libraries. For Sonar that's a bit cumbersome because multiple libraries depend on OpenSSL, but it might be worth a try.

@rohandhruva I had a look at the publicly available com.netflix.mediaclient APK and there are indeed references to OpenSSL, specifically in libmdx_jni.so. I've got a superficial knowledge of native code at best, so I'm not sure whether this means that it contains statically linked OpenSSL code, but it doesn't reference libssl or libcrypto, for that matter. The embedded version is 1.0.1t from May 2016, so it'll definitely be incompatible with our 1.1.0h.

There's also a reference to /Users/vgondi/, so maybe they know more. :D

I think it makes sense from our end to investigate static linking too, as this scenario is likely not unique to you. I can't give you an exact timeline, but I'll try to look into that soon.
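
A check like the one described above can be roughly automated. The Python sketch below scans the native libraries in an APK or AAR for OpenSSL version banners. It assumes the version text appears in the binary as plain "OpenSSL <version>", which a stripped or unusual build may not preserve, and the helper names `openssl_versions` and `scan_archive` are made up for illustration.

```python
import re
import zipfile

# OpenSSL builds normally embed a banner of the form "OpenSSL <version> <date>".
_BANNER = re.compile(rb"OpenSSL (\d+\.\d+\.\d+[a-z]*)")


def openssl_versions(data):
    """Return the sorted, de-duplicated OpenSSL versions mentioned in raw bytes."""
    return sorted({m.group(1).decode("ascii") for m in _BANNER.finditer(data)})


def scan_archive(path_or_file):
    """Map each .so inside an APK/AAR (a zip file) to the versions it mentions."""
    report = {}
    with zipfile.ZipFile(path_or_file) as archive:
        for name in archive.namelist():
            if name.endswith(".so"):
                versions = openssl_versions(archive.read(name))
                if versions:
                    report[name] = versions
    return report


# Works on any byte string, so it can be tried without an archive at hand.
print(openssl_versions(b"\x00OpenSSL 1.1.0h\x00"))
```

Calling `scan_archive` on an app's APK (the path is whatever your build produces) and finding two different versions across libraries would point at the kind of clash discussed here.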

@rohandhruva I put out 0.6.11 (big jump, just to catch up with the top-level release version) which is statically compiled. Give it a whirl, when you get a chance. Would be curious to know if this makes a difference. :)

> I had a look at the publicly available com.netflix.mediaclient APK and there are indeed references to OpenSSL, specifically in libmdx_jni.so.

I'm fairly new to the app myself, but that sounds about right. I can try to find out more.

> There's also a reference to /Users/vgondi/, so maybe they know more. :D

Heh, I'll definitely try to find out more from him :D

> I put out 0.6.11 ... Would be curious to know if this makes a difference. :)

Unfortunately, it didn't: the app instacrashes on startup, although with a different log this time. I'm not entirely sure where the exact crash is, but here's the logcat output:
https://gist.github.com/rohandhruva/b8f5a933d560ae1aae6dbfaf655905aa

Thanks for your patience!

Is that on the 5.1.1 device again? I still haven't tried stripping all the .so's of the unsupported symbols.

Is this crashing on other architectures/SDK levels too?

Thank you for helping debug this issue!

Yes, this new crash is still only on the Sony SGP511 tablet with Android 5.1.1 (and sonar 0.6.11). I tried the new sonar version on a Nexus 6P with Android 8.1, and it's working fine (as it did on 0.0.9 as well).

Okay, at least it's not a regression then. :)

Even though it's only a warning, based on the Stack Overflow answer, I still have some hope this is the linker throwing its hands in the air. I'll try manually stripping all the libraries tomorrow.

Couldn't let it rest.

Can you give this a try? https://github.com/facebook/Sonar/releases/download/v0.6.11/sonar-0.6.11-repack.aar

If you use Gradle, you should be able to place this file somewhere in your project and add a flatDir repository pointing at that directory, like this:

repositories {
    flatDir {
        dirs 'libs'
    }
}

and ensure that you explicitly depend on fbjni because you lose the dependency resolution now:

dependencies {
    implementation 'com.facebook.sonar:fbjni:0.6.11'
    implementation(name: 'sonar-0.6.11-repack', ext: 'aar')
}

Thank you for whipping up the AAR! It's still crashing on that tablet, unfortunately: https://gist.github.com/rohandhruva/58af60cc7557fec0962c3e3f6e3a7edb

Looks like this might be the relevant line:

07-11 13:00:26.100 32246-32307/com.netflix.mediaclient A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0xa4 in tid 32307 (SonarConnection)

I got myself a 5.1.1 armv7 tablet to repro this and managed to get a native crash too using 0.6.11:

https://gist.github.com/passy/9a46486d1bf5a0b5ad419f8aeb2af5ad

Logs are a bit noisy but at least have a proper dump of the stack.

Great, thank you for trying to repro! Interestingly, both our logs show a SIGSEGV at the same fault address:

F/libc ( 5490): Fatal signal 11 (SIGSEGV), code 1, fault addr 0xa4 in tid 5507 (SonarConnection)

com.netflix.mediaclient A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0xa4 in tid 32391 (SonarConnection)

I don't see anything more indicative of what caused the segfault, which is about the extent of my knowledge of native :)
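
To compare crash lines across devices more systematically, here is a small Python sketch that groups "Fatal signal" lines by signal and fault address. The regular expression only covers the layout of the lines quoted in this thread ("fault addr ... in tid ..."), and `group_by_fault` is a made-up helper name. A small address such as 0xa4 is typical of a field access through a null pointer, though the logs alone do not confirm that.

```python
import re
from collections import defaultdict

# Layout used by the libc crash lines quoted in this thread.
_FATAL = re.compile(
    r"Fatal signal (?P<signal>\d+) \((?P<name>\w+)\).*?"
    r"fault addr (?P<addr>0x[0-9a-fA-F]+) in tid (?P<tid>\d+) \((?P<thread>[^)]+)\)"
)


def group_by_fault(lines):
    """Group crashing thread names by (signal name, fault address)."""
    groups = defaultdict(list)
    for line in lines:
        match = _FATAL.search(line)
        if match:
            key = (match.group("name"), int(match.group("addr"), 16))
            groups[key].append(match.group("thread"))
    return dict(groups)


# Lines copied from the logs in this thread.
logs = [
    "F/libc ( 5490): Fatal signal 11 (SIGSEGV), code 1, fault addr 0xa4 in tid 5507 (SonarConnection)",
    "com.netflix.mediaclient A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0xa4 in tid 32391 (SonarConnection)",
    "A/libc: Fatal signal 11 (SIGSEGV), code 1, fault addr 0x16 in tid 22851 (lix.mediaclient), pid 22827 (lix.mediaclient)",
]

for (signal_name, address), threads in group_by_fault(logs).items():
    print(signal_name, hex(address), threads)
```

The two SonarConnection crashes land in one group and the original 0x16 crash in another, which supports treating them as separate problems. Other log layouts, such as those printed by older Android releases, are not handled.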

I've got a repro case in an open source project if you're interested

@hzsweers Yes please. :)

https://github.com/hzsweers/CatchUp (should be able to run with ./gradlew :app:installDebug). I get this on launch if I run it on an x86_64 emulator. My specific emulator was a Pixel 2, SDK 28, with Play Store. Repro'd on 0.6.16 and 0.6.17 at least.

I get the error A/libc: Fatal signal 11 (SIGSEGV) at 0x000000a4 (code=1), thread 4478 (SonarConnection) on an emulator with API 19, library v0.6.18.

It happened on 5.x with 0.6.12 too:
Fatal signal 11 (SIGSEGV), code 1, fault addr 0xa4 in tid 23929 (SonarConnection)

This should no longer happen with the version on master. If it does with the next release, please comment and we can re-open this.

I still get this on 0.9.0+
