Our app logs metrics from a wearable (sent via BTLE). Metrics are sent and saved to Realm every second. The Realm is observed to display the metrics back to the user. Other metrics are periodically sent to a backend (and then removed from the Realm).
The app operates for about 5 hours before getting closed due to an OoM. The Realms all seem to be closed correctly (no warnings about open Realms).
I'm expecting uninterrupted periods of use.
Logcat reveals the following at the point that the app is closed:
11-21 05:15:33.808 27702-28621/com.snuza.picoconnect E/REALM_JNI: jni: ThrowingException 4, std::bad_alloc in /Users/cm/Realm/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1285, .
11-21 05:15:33.809 27702-28655/com.snuza.picoconnect E/REALM_JNI: jni: ThrowingException 4, std::bad_alloc in /Users/cm/Realm/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1285, .
11-21 05:15:33.841 27702-28621/com.snuza.picoconnect E/REALM_JNI: Exception has been throw: std::bad_alloc in /Users/cm/Realm/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1285
11-21 05:15:33.841 27702-28655/com.snuza.picoconnect E/REALM_JNI: Exception has been throw: std::bad_alloc in /Users/cm/Realm/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1285
11-21 05:15:33.855 27702-28621/com.snuza.picoconnect E/REALM_JAVA: io.realm.internal.OutOfMemoryError: std::bad_alloc in /Users/cm/Realm/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1285
at io.realm.internal.TableQuery.nativeBatchUpdateQueries(Native Method)
at io.realm.internal.TableQuery.batchUpdateQueries(TableQuery.java:500)
at io.realm.internal.async.QueryUpdateTask.run(QueryUpdateTask.java:87)
at io.realm.internal.async.BgPriorityRunnable.run(BgPriorityRunnable.java:34)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:422)
at java.util.concurrent.FutureTask.run(FutureTask.java:237)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)
at java.lang.Thread.run(Thread.java:818)
Approximately 5 hours of runtime.
N/A
Realm version(s): 2.2.1
Realm sync feature enabled: no
Android Studio version: 2.3 Canary
Which Android version and device: Android 5.1.1 on the Nexus 5
Is it possible you have any RealmObject/RealmList/RealmResults leaked somewhere? If those are not GCed, the relevant native memory won't be freed.
I will comb through the code for that. As mentioned, I'm using the Observable pattern (via asObservable), so I would only have visibility of them at onNext emissions.
There isn't a case where "live" RealmObjects are emitted; rather, clones of the result RealmObjects are.
Well, how do you create your copy? I expect that you have unclosed Realm instances on the io() scheduler.
Clones are created by way of a _shallow copy_.
In our implementation, the Realm is never closed while the Activity is showing, because we're observing it for RealmObjects added by inserts on the BTLE end (to display metric values on the UI).
Is this the incorrect pattern to apply for observable RealmResults?
@mitchwongho Did you call unsubscribe() on the returned Subscription?
When the Activity (or view) is detached, the RealmResults observable is unsubscribed and the Realm is closed (via doOnUnsubscribe(); doOnCompleted and doOnError also close the Realm). This is not the issue.
However, we're expecting the screen (Activity/View) to be in the foreground for at least 24 hours and at most 5 days, with RealmObjects being inserted at 1/sec. So, the Realm should be open for that duration of operation.
Is this possible with Realm?
Can you share some relevant source code?
Inserting:
try (final Realm realm = Realm.getDefaultInstance()) {
    realm.executeTransaction(realm1 -> {
        // Value not saved yet
        if (statData.getInternalId() == null) {
            statData.setInternalId(RealmPrimaryKey.newInstance().toString());
        }
        // Clone and insert
        final PicoStatDataRealm picoStatDataRealm = PicoStatDataRealm.map(statData);
        realm1.copyToRealm(picoStatDataRealm);
    });
}
Observing:
final Realm realm = Realm.getDefaultInstance();
return realm.where(PicoStatDataRealm.class)
        .equalTo("picoSerialNr", picoSerialNr)
        .greaterThan("timeMillis", DateTime.now().getMillis())
        .findAllSortedAsync("timeMillis", Sort.DESCENDING) // ensures most recent is first
        .asObservable()
        .doOnError(throwable -> realm.close())
        .doOnCompleted(realm::close)
        .doOnUnsubscribe(realm::close)
        .filter(picoStatDataRealms -> !picoStatDataRealms.isEmpty())
        .map(picoStatDataRealms -> picoStatDataRealms.first().getClone());
Considering you use try-with-resources for insertion, that should definitely not be a problem, because it auto-closes.
While I am surprised there is no RealmResults::isLoaded filter on the async results, this in itself doesn't seem to have any problems...
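To make the try-with-resources point concrete: the Java language guarantees that close() runs when the try block exits, even if the transaction body throws, so the insertion code above cannot leak its Realm instance. Here is a minimal, self-contained sketch of that guarantee (a hypothetical Resource class, not Realm):

```java
// Demonstrates that try-with-resources invokes close() before the
// catch clause runs, even when the body throws an exception.
class TryWithResourcesDemo {
    static final StringBuilder log = new StringBuilder();

    static class Resource implements AutoCloseable {
        @Override
        public void close() { log.append("closed;"); }
    }

    static String run() {
        try (Resource r = new Resource()) {
            log.append("work;");
            // Simulates a transaction failing mid-way
            throw new IllegalStateException("transaction failed");
        } catch (IllegalStateException e) {
            log.append("caught;");
        }
        return log.toString();
    }
}
```

Note the ordering: the resource is closed before the catch clause executes, which is exactly why the insertion path is safe regardless of transaction errors.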
Good tip on RealmResults::isLoaded
Maybe https://realm.io/docs/java/latest/#rxjava is worth a read.
Thanks. I've added filter(RealmResults::isLoaded) and will wait to see if there is an improvement.
I've just started noticing the following before the app crashes with the OoM exception:
E/REALM_JNI: jni: ThrowingException 7, std::exception in /Users/cm/Realm/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1285, .
By the way, adding filter(RealmResults::isLoaded) doesn't seem to alleviate the crash.
@mitchwongho That's an interesting observation. The exception is thrown when you do an async query, and exception 7 is a version mismatch. Version here refers to the version that each thread/instance sees (Realm is a Multi-Version Concurrency Control database). What kind of thread do you do the write transaction on?
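To illustrate what "version" means in an MVCC database (a deliberately simplified toy model, nothing like Realm's actual engine): each reader is pinned to the snapshot version that existed when it began, writers produce new versions, and handing query results over between threads pinned to different versions fails, which is analogous to the version-mismatch exception above.

```java
import java.util.HashMap;
import java.util.Map;

// Toy MVCC store: every commit creates a new immutable snapshot version.
// A reader pinned to version N keeps seeing version N's data; handing
// results across mismatched versions throws, analogous to exception 7.
class MvccToy {
    private final Map<Integer, Map<String, String>> snapshots = new HashMap<>();
    private int currentVersion = 0;

    MvccToy() { snapshots.put(0, new HashMap<>()); }

    // A reader pins itself to whatever version is current when it starts.
    int beginRead() { return currentVersion; }

    // A writer copies the latest snapshot, mutates it, and publishes N+1.
    int commit(String key, String value) {
        Map<String, String> next = new HashMap<>(snapshots.get(currentVersion));
        next.put(key, value);
        snapshots.put(++currentVersion, next);
        return currentVersion;
    }

    String read(int pinnedVersion, String key) {
        return snapshots.get(pinnedVersion).get(key);
    }

    // Handover of results only works between matching versions.
    void handover(int fromVersion, int toVersion) {
        if (fromVersion != toVersion) {
            throw new IllegalStateException(
                    "bad version: " + fromVersion + " != " + toVersion);
        }
    }
}
```

In Realm's case the background query thread and the caller's thread can end up on different versions, which is why the async machinery has to retry after a REALM_CHANGED event, as the later logs in this thread show.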
Writes are on the i/o thread: BluetoothGattCallback.onCharacteristicChanged() callback.
Is auto-refresh enabled on your Realm instances?
@kneth well, Schedulers.io() doesn't have a Looper to auto-update with.
Auto-refresh is enabled on AndroidSchedulers.mainThread. Correct me if I'm wrong, but auto-refresh is only applicable to _queries_ and not to _insert_ or _update_ calls, right?
Auto-refresh applies to Realm instances and the versions they retain; it also keeps RealmResults up to date with their Realm instances.
I created a test app last night to try to isolate this aspect of our project (Gist).
It's been running for 13 hours now (94k RealmObjects). It's interesting to see that a simple app like this hits 32 MB of RAM.
After 13h20m:
11-24 11:50:20.564 22867-22867/? D/AndroidRuntime: >>>>>> START com.android.internal.os.RuntimeInit uid 2000 <<<<<<
11-24 11:50:20.569 22867-22867/? D/AndroidRuntime: CheckJNI is OFF
11-24 11:50:20.695 22867-22867/? D/AndroidRuntime: Calling main entry com.android.commands.am.Am
11-24 11:50:20.701 832-1688/? I/ActivityManager: Force stopping com.mitchwongho.test.observablerealms appid=10164 user=0: from pid 22867
11-24 11:50:20.701 832-1688/? I/ActivityManager: Killing 2262:com.mitchwongho.test.observablerealms/u0a164 (adj 0): stop com.mitchwongho.test.observablerealms
11-24 11:50:20.728 832-2677/? I/WindowState: WIN DEATH: Window{3fd739cb u0 com.mitchwongho.test.observablerealms/com.mitchwongho.test.observablerealms.MainActivity}
11-24 11:50:20.759 832-1688/? W/ActivityManager: Force removing ActivityRecord{28fff4d5 u0 com.mitchwongho.test.observablerealms/.MainActivity t17660}: app died, no saved state
11-24 11:50:20.760 832-895/? I/UsageStatsService: User[0] Rolling over usage stats
11-24 11:50:20.760 832-895/? I/UsageStatsService: User[0] Flushing usage stats to disk
11-24 11:50:20.770 832-1249/? W/ActivityManager: Spurious death for ProcessRecord{360ebd91 2262:com.mitchwongho.test.observablerealms/u0a164}, curProc for 2262: null
11-24 11:50:20.777 22867-22867/? D/AndroidRuntime: Shutting down VM
@mitchwongho I think you need to call unsubscribe() on every subscription in subs before line 89 in order to let Realm know the Person objects you subscribed to are not needed anymore; then Realm knows the relevant resources can be freed. If you run the onResume/onPause code in a loop, I guess you can see the leak more easily.
@beeender I was wondering, doesn't compositeSubscription.clear() already do that?
Based on the docs, I would expect it to do that.
@Zhuinden hmmm, let me test.
If I remember correctly, you need to call compositeSubscription.unsubscribe() first. clear() just removes any tracked subscriptions, active or not.
@Zhuinden @cmelchior I just tested: clear() does unsubscribe. (RxJava 1.1.0)
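For reference, the RxJava 1.x documented behavior can be modeled with a simplified stand-in (this is not RxJava's actual code): clear() unsubscribes and removes everything tracked but leaves the container reusable, while unsubscribe() does the same and additionally terminates the container, so anything added afterwards is unsubscribed immediately.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of RxJava 1.x CompositeSubscription semantics,
// to show why clear() is enough to release tracked subscriptions.
class CompositeModel {
    static class Sub {
        private boolean unsubscribed;
        void unsubscribe() { unsubscribed = true; }
        boolean isUnsubscribed() { return unsubscribed; }
    }

    private final List<Sub> subs = new ArrayList<>();
    private boolean terminated;

    void add(Sub s) {
        if (terminated) {
            s.unsubscribe(); // a terminated container rejects new subscriptions
            return;
        }
        subs.add(s);
    }

    void clear() {
        for (Sub s : subs) s.unsubscribe();
        subs.clear();        // container stays usable for further add() calls
    }

    void unsubscribe() {
        clear();
        terminated = true;   // subsequent add() unsubscribes immediately
    }
}
```

So for releasing Realm resources at onPause, either call is sufficient; the difference only matters if the container is reused afterwards.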
@mitchwongho I created a project to simulate your problem, but I cannot see a leak. Would you please try with it? If you want to modify anything to reproduce your problem, just fork and change it, so I can see and get the source to debug. The project is here: https://github.com/beeender/RealmRxTest
Also, can you share your implementation of Person.getClone()?
It's just a simple shallow copy (Gist):
public final Person getClone() {
    final Person clone = new Person();
    clone.id = this.id;
    clone.name = this.name;
    clone.aBoolean = this.aBoolean;
    clone.aByte = this.aByte;
    clone.aShort = this.aShort;
    clone.aInt = this.aInt;
    clone.aFloat = this.aFloat;
    clone.aDouble = this.aDouble;
    clone.timestamp = this.timestamp;
    return clone;
}
Thanks @beeender
@mitchwongho You can use Realm.copyFromRealm() to do the shallow copy. But it doesn't seem to be the problem.
@beeender Thank you. Your test also seems to yield the same behaviour as my test app.
I've managed to take a _heap dump_ just before the OoM. It looks like there are a lot of FinalizerReferences and Longs:


@mitchwongho That is interesting! Your screenshot shows that the objects waiting to be freed in the finalizer actually take a lot of memory. But luckily we are not using the finalizer anymore. See https://github.com/realm/realm-java/pull/3144. That has not been released yet, but if you want, you can try our snapshot release.
How long does it take to get an OOM with my project? On what kind of device? I can try it as well.
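For readers following along: moving away from finalize(), as #3144 does, typically means tracking native resources with phantom references and a reference queue instead. Below is a generic stdlib-only sketch of that pattern (my illustration, not Realm's actual code); the Runnable stands in for freeing a native pointer, and drain() stands in for the work a dedicated daemon thread would do.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashMap;
import java.util.Map;

// Generic phantom-reference cleanup: when a wrapper object becomes
// unreachable, its reference lands on the queue and the registered
// action frees the associated native resource -- no finalize() needed.
class NativeCleaner {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final Map<Reference<?>, Runnable> actions = new HashMap<>();

    Reference<Object> register(Object wrapper, Runnable freeNative) {
        PhantomReference<Object> ref = new PhantomReference<>(wrapper, queue);
        actions.put(ref, freeNative); // keep the ref strongly reachable
        return ref;
    }

    // Normally run in a loop on a daemon thread; polled here so callers
    // can drive it deterministically.
    int drain() {
        int freed = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Runnable action = actions.remove(ref);
            if (action != null) {
                action.run();
                freed++;
            }
        }
        return freed;
    }
}
```

The advantage over finalize() is that cleanup is not at the mercy of the finalizer thread's scheduling (and its locks), which is exactly the failure mode discussed above.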
@beeender The screenshots above are for my app (it crashes after 3-4 hours but implements the same pattern for different Realm models). I only ran your test app for a couple of hours without reaching an OoM. Here is a heap dump of your test app after an hour:

@mitchwongho OK ... Those objects on the heap are actually expected. They are waiting for their native resources to be deallocated. However, since we previously used the finalizer thread to do the deallocation, it might trigger an OOM if the finalizer doesn't get time to run (there are locks ... )
Would you please try our snapshot release with your application to see if the problem still happens? I think #3144 should fix your problem.
2.3.0-SNAPSHOT is applied to my app. Initial observations (after 1 hour) are that there is a big improvement in memory management (kudos 👏🏽). I'll report back after our 12-hour benchmark and then at 24 hours.
After 8 hours I started getting these:
11-28 23:24:48.819 25334-26300/com.snuza.picoconnect E/REALM_JNI: jni: ThrowingException 7, std::exception in /tmp/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1282, .
11-28 23:24:48.819 25334-26300/com.snuza.picoconnect E/REALM_JNI: Exception has been throw: std::exception in /tmp/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1282
Memory usage is looking tame.
It looks like this then becomes the same issue as #3802 :(
Is it possible for you to share your project with us? Or if you can make https://github.com/beeender/RealmRxTest close to your existing logic, that would be great. I will try to reproduce it on my side. Thanks a lot!
@beeender The app has been running now for 14 hours. It doesn't appear that the Exception 7 caused any broader issues. It looks like it occurs when a certain event happens in the SyncAdapter (perhaps when attempting to delete _RealmObjects_). I will continue to monitor and report back when I can reproduce.
@beeender The Exception 7 issue may be caused by an asynchronous _query-delete_ bug on my side. I will continue to monitor. As an update, the app has successfully passed the 24-hour operation milestone without crashing, and memory usage appears to be under control.
@beeender I do still seem to experience the Exception 7 issue. Could you tell me whether those JNI exceptions get propagated to the Java level? I don't seem to be catching any exceptions/throwables.
Exception 7 is originally thrown by our native code, and then in JNI we wrap it as a Java exception and throw it. Do you have a full log of the exception?
Can you capture some logs for us? Enable Realm native logging by calling RealmLog.setLevel(LogLevel.ALL); after Realm.init(), then capture the logcat output with -v thread -v time.
I'm trying to pinpoint where the exception occurs. My log looks like this:
11-30 12:01:56.349 5198-5225/com.snuza.picoconnect D/BluetoothGatt: close()
11-30 12:01:56.349 5198-5225/com.snuza.picoconnect D/BluetoothGatt: unregisterApp() - mClientIf=6
11-30 12:02:01.207 5198-5224/com.snuza.picoconnect D/BluetoothGatt: onClientConnectionState() - status=0 clientIf=5 device=44:D6:E1:00:02:65
11-30 12:02:01.208 5198-5224/com.snuza.picoconnect D/Log: C64100342D GATT - STATE_DISCONNECTED
11-30 12:02:01.208 5198-5224/com.snuza.picoconnect D/BluetoothGatt: close()
11-30 12:02:01.208 5198-5224/com.snuza.picoconnect D/BluetoothGatt: unregisterApp() - mClientIf=5
11-30 12:02:05.268 5198-5374/com.snuza.picoconnect E/REALM_JNI: jni: ThrowingException 7, std::exception in /tmp/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1282, .
11-30 12:02:05.268 5198-5374/com.snuza.picoconnect E/REALM_JNI: Exception has been throw: std::exception in /tmp/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1282
11-30 12:02:06.184 5198-5390/com.snuza.picoconnect E/REALM_JNI: jni: ThrowingException 7, std::exception in /tmp/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1282, .
11-30 12:02:06.184 5198-5390/com.snuza.picoconnect E/REALM_JNI: Exception has been throw: std::exception in /tmp/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1282
11-30 12:02:49.898 5198-6634/com.snuza.picoconnect D/Log: setAppServerConnectionState ::set 1
11-30 12:02:49.898 5198-6634/com.snuza.picoconnect D/Log: setAppServerConnectionState() Pico OFFLINE
11-30 12:02:49.898 5198-6634/com.snuza.picoconnect W/Log: PicoControlService::createRealtimeTxWebSocket {deviceId=58170e031800009d3611c691,threadId=17417} CLOSED PICO CONTROL
11-30 12:20:31.084 5198-5198/com.snuza.picoconnect D/SyncService: onCreate()
11-30 12:20:31.097 5198-2066/com.snuza.picoconnect D/Log: SyncAdapter::onPerformSync - START
With Realm logging set to ALL:
11-30 12:26:55.992 9791-9791/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_SharedRealm_nativeIsClosed -1571913520
11-30 12:26:55.992 9791-9791/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_SharedRealm_nativeIsClosed -1571913520
11-30 12:26:55.992 9791-10049/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_SharedRealm_nativeCloseSharedRealm -1362919088
11-30 12:26:55.992 9791-11372/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_SharedRealm_nativeIsClosed -1571910936
11-30 12:26:55.992 9791-11372/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_SharedRealm_nativeCommitTransaction -1571910936
11-30 12:26:55.999 9791-10035/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_SharedRealm_nativeCloseConfig -1360611280
11-30 12:26:55.999 9791-10035/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_TableQuery_nativeBatchUpdateQueries
11-30 12:26:55.999 9791-10035/com.snuza.picoconnect E/REALM_JNI: jni: ThrowingException 7, std::exception in /tmp/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1282, .
11-30 12:26:55.999 9791-10035/com.snuza.picoconnect E/REALM_JNI: Exception has been throw: std::exception in /tmp/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_TableQuery.cpp line 1282
11-30 12:26:55.999 9791-10035/com.snuza.picoconnect D/REALM_JAVA: Query update task could not complete due to a BadVersionException. Retry is scheduled by a REALM_CHANGED event.
11-30 12:26:55.999 9791-10035/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_SharedRealm_nativeCloseSharedRealm -1362919152
11-30 12:26:56.047 9791-11372/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_SharedRealm_nativeGetSnapshotVersion -1571910936
11-30 12:26:56.047 9791-11372/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_SharedRealm_nativeCloseSharedRealm -1571910936
11-30 12:26:56.047 9791-9791/com.snuza.picoconnect D/REALM_JAVA: REALM_CHANGED : io.realm.HandlerController@1c289e57
11-30 12:26:56.048 9791-9791/com.snuza.picoconnect V/REALM_JAVA: REALM_CHANGED realm: io.realm.HandlerController@1c289e57 updating async queries, total: 10
11-30 12:26:56.048 9791-9791/com.snuza.picoconnect V/REALM_JNI: --> Java_io_realm_internal_TableQuery_nativeHandoverQuery -1263060352
@mitchwongho Hmmm, OK, this one is thrown on our background thread when updating the async queries, and it is not harmful to your app as far as I can tell (it just prints some log output, and the exception is caught in our code). After this happens, Realm will retry updating the async query later. So, do you see any abnormal behavior after this log is printed?
I'm investigating whether there is a correlation between the Realm exception and an IO timeout event at our application level.
I realise that this discussion is now outside the scope of the initial ticket, so I'm happy for this ticket to be closed, and I look forward to the fix being promoted to the release build. Thank you.
Thanks a lot for the valuable feedback! I will close this issue. #3144 will be included in the next Realm release soon. Please let us know if there is anything more we can help with.