We are using Realm (3.3.1) in our Android app, and now and then (after 2 to 3 minutes of usage) we get native crashes like this:
05-31 10:00:25.872 12824-12824/? A/libc: Fatal signal 11 (SIGSEGV), code 2, fault addr 0xa15f18a4 in tid 12824 (com.timetac)
05-31 10:00:25.924 276-276/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
05-31 10:00:25.924 276-276/? A/DEBUG: Build fingerprint: 'google/razor/flo:6.0.1/MOB30X/3036618:user/release-keys'
05-31 10:00:25.924 276-276/? A/DEBUG: Revision: '0'
05-31 10:00:25.924 276-276/? A/DEBUG: ABI: 'arm'
05-31 10:00:25.925 276-276/? A/DEBUG: pid: 12824, tid: 12824, name: com.timetac >>> com.timetac <<<
05-31 10:00:25.925 276-276/? A/DEBUG: signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0xa15f18a4
05-31 10:00:25.951 276-276/? A/DEBUG: r0 a15f18a4 r1 00000076 r2 b4c5d850 r3 a1484c1a
05-31 10:00:25.951 276-276/? A/DEBUG: r4 a16c3650 r5 9b55e850 r6 a15f18a0 r7 a16c3650
05-31 10:00:25.952 276-276/? A/DEBUG: r8 ffffffff r9 b4cf6500 sl 9b55e850 fp ffffffff
05-31 10:00:25.952 276-276/? A/DEBUG: ip 00000000 sp bed926e0 lr b692ca7b pc b6902118 cpsr 800f0030
05-31 10:00:25.956 276-276/? A/DEBUG: backtrace:
05-31 10:00:25.956 276-276/? A/DEBUG: #00 pc 000dc118 /system/lib/libskia.so
05-31 10:00:25.956 276-276/? A/DEBUG: #01 pc 00106a77 /system/lib/libskia.so (_ZN7SkPaintaSERKS_+24)
05-31 10:00:25.956 276-276/? A/DEBUG: #02 pc 00094523 /system/lib/libandroid_runtime.so (_ZN7android5PaintaSERKS0_+6)
05-31 10:00:25.956 276-276/? A/DEBUG: #03 pc 71e9499b /data/dalvik-cache/arm/system@[email protected] (offset 0x1ec9000)
05-31 10:00:27.724 276-276/? A/DEBUG: Tombstone written to: /data/tombstones/tombstone_00
The tombstone file then contains the following, which points to Realm as the cause of the crash:
...
memory map: (fault address prefixed with --->)
...
a1347000-a1449fff rw- 0 103000 [stack:12951]
a144a000-a15effff r-x 0 1a6000 /data/app/com.timetac-1/lib/arm/librealm-jni.so (BuildId: f7c35cff0e56460f1b94308337b4f3dbf9ee9c60)
a15f0000-a15f0fff --- 0 1000
--->a15f1000-a15f8fff r-- 1a6000 8000 /data/app/com.timetac-1/lib/arm/librealm-jni.so
a15f9000-a15f9fff rw- 1ae000 1000 /data/app/com.timetac-1/lib/arm/librealm-jni.so
a15fa000-a15fffff rw- 0 6000
a1600000-a167ffff rw- 0 80000 [anon:libc_malloc]
...
The crashes only occur on real devices running Android Marshmallow (tested on a Samsung Galaxy S6 and a Nexus 7 2nd generation). They never occurred on the emulator or on devices running Android Nougat.
We are not using the Realm Sync.
Any help would be appreciated.
tombstone_00.txt
Hi, the crash happened in tid 12824, which doesn't seem to be related to Realm. It looks like something is wrong in libskia (the Android graphics library).
Realm has some daemon threads running, which is why you can find Realm in the tombstone.
Maybe there is some problem with those devices' ROMs?
[...]we literally spent hundreds of hours debugging a native crash in libskia (called by system) - somehow a pointer to a Paint object contained garbage instead of an address, happening only once or twice an hour. Instantly Realm was blamed, but we could not find any evidence and our Issue was (rightfully) closed after a few weeks, since the invalid value of that pointer only sometimes was inside Realm's memory. After resorting to try-and-error we found out that the issue was caused by a Realm query that was under specific circumstances executed about 50 times at once. This entire thing happened only on Marshmallow and not on the emulator.
from https://www.reddit.com/r/androiddev/comments/6nsc9l/weekly_questions_thread_july_17_2017/dkeeat5/
After resorting to try-and-error we found out that the issue was caused by a Realm query that was under specific circumstances executed about 50 times at once.
I can still argue that the problem is on the libskia side even if it can be reproduced by running a Realm query 50 times, since libskia and Realm share the same heap in the same process, and the bad pointer was used by libskia. I am not quite sure how to interpret the statement that the pointer was only sometimes inside Realm's memory. Was that memory last allocated by Realm? If so, wouldn't that mean libskia is using a freed pointer?
But sure, it could still be a memory issue on the Realm side. If it happens more often, it is worth investigating.
If anyone ever stumbles upon this... @beeender actually figured it out in another thread (which I won't be able to find), but there is an Android bug where a finalizer sets a Matrix's native pointer to 0, so if a RealmChangeListener is called after onDestroy or onDestroyView, Android may already have finalized the matrix it uses to display views that are destroyed but still exist in Java.
So the solution is to remove RealmChangeListeners in onDestroy. They got the crash "after doing the same query multiple times" because the repeated queries eventually triggered GC, which caused the finalizer to null out the matrix.
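To illustrate the fix, here is a minimal sketch of the unregister-on-destroy pattern in plain Java. It does not use the Android or Realm APIs: ChangeListener, FakeResults, and Screen are hypothetical stand-ins for RealmChangeListener, a live query result, and an Activity or Fragment, so that the example runs anywhere. In a real app the same idea would apply to the listeners you registered on your Realm objects, removed in onDestroy (or onDestroyView for fragment views).

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerLifecycleSketch {

    /** Hypothetical stand-in for a change listener such as RealmChangeListener. */
    interface ChangeListener<T> {
        void onChange(T element);
    }

    /** Hypothetical stand-in for a live, auto-updating query result. */
    static class FakeResults {
        private final List<ChangeListener<FakeResults>> listeners = new ArrayList<>();

        void addChangeListener(ChangeListener<FakeResults> listener) {
            listeners.add(listener);
        }

        void removeChangeListener(ChangeListener<FakeResults> listener) {
            listeners.remove(listener);
        }

        /** Simulates a write elsewhere: notifies every listener still registered. */
        void notifyChanged() {
            // Iterate over a copy so listeners may unregister during a callback.
            for (ChangeListener<FakeResults> listener : new ArrayList<>(listeners)) {
                listener.onChange(this);
            }
        }
    }

    /** Hypothetical stand-in for an Activity or Fragment that renders the results. */
    static class Screen {
        private final FakeResults results;
        private boolean viewsAlive;
        int updatesWhileAlive;
        int updatesAfterDestroy; // in the real bug, each of these could touch finalized native state

        // Keep a strong reference so the very same instance can be removed later.
        private final ChangeListener<FakeResults> listener = r -> {
            if (viewsAlive) {
                updatesWhileAlive++;
            } else {
                updatesAfterDestroy++;
            }
        };

        Screen(FakeResults results) {
            this.results = results;
        }

        void onCreate() {
            viewsAlive = true;
            results.addChangeListener(listener);
        }

        void onDestroy() {
            // The fix from this thread: unregister before the views go away.
            results.removeChangeListener(listener);
            viewsAlive = false;
        }
    }

    public static void main(String[] args) {
        FakeResults results = new FakeResults();
        Screen screen = new Screen(results);

        screen.onCreate();
        results.notifyChanged(); // delivered while the screen is alive
        screen.onDestroy();
        results.notifyChanged(); // not delivered, the listener was removed

        System.out.println("updates while alive: " + screen.updatesWhileAlive);
        System.out.println("updates after destroy: " + screen.updatesAfterDestroy);
    }
}
```

If the removeChangeListener call in onDestroy is left out, the second notification still reaches the listener after the screen is gone, which is the situation described above.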