I have an Android app I am developing. I have found an instance in my code base where Realm is a bit flaky.
In my app, I am creating a chain of realm objects.
Model -> Model -> Model -> Model -> Model
Model -> Model
Model -> Model -> Model -> Model -> Model -> Model -> Model
Model
These chains work as a job queue. The 4 models on the far left are the parents, and children are linked to them. A parent needs to run and succeed. Once it succeeds, it is deleted and its child becomes the new parent.
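To make the queue semantics concrete, here is a minimal in-memory sketch of the promotion rule, without Realm. `Job`, `JobQueue`, and their members are hypothetical names used only for illustration; they are not the models from my app.

```kotlin
// Plain-Kotlin stand-in for a managed model. `Job` and its fields are
// hypothetical names, not the real Realm models.
class Job(val id: String, var child: Job? = null)

// Holds the current parents, i.e. the head of each chain.
class JobQueue {
    private val parents = mutableListOf<Job>()

    // Start a new chain with `job` as its parent.
    fun addParent(job: Job) {
        parents.add(job)
    }

    // Append `job` to the end of the chain that starts at `parent`.
    fun addChild(parent: Job, job: Job) {
        var last = parent
        while (last.child != null) last = last.child!!
        last.child = job
    }

    // Run every current parent once. A parent that succeeds is removed and its
    // child (if any) takes its place; a parent that fails stays for a retry.
    fun runParents(run: (Job) -> Boolean) {
        val iterator = parents.listIterator()
        while (iterator.hasNext()) {
            val parent = iterator.next()
            if (run(parent)) {
                val child = parent.child
                if (child != null) iterator.set(child) else iterator.remove()
            }
        }
    }

    fun parentIds(): List<String> = parents.map { it.id }
}
```

A promoted child is not run in the same pass. It waits for the next call to `runParents`, which matches the "child now becomes the parent" rule described above.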
This is where Realm has been flaky. I will create a link in a model. I download a copy of my realm DB from my rooted Android phone and view it in the Realm macOS browser app. Everything looks good.
Model -> Model
I may create one or more children.
Model -> Model -> Model
Download a new copy from my Android phone, view it, everything looks good.
I may even create some new parents. Then some new children. It doesn't really matter. This is where it becomes interesting.
I may have a line of jobs like this:
Model -> Model -> Model -> Model -> Model
Model -> Model
2 parents with some children. Everything may be fine at this point. Everything links nicely. Then, after I decide to add a new parent or new child, one of these links may go nil (indicated by -X- in the diagram below):
Model -> Model -> Model -X- Model -> Model
Model -> Model -> Model
Yeah, it's a link made in the past. Not even the link I am trying to create now.
This is a gist of what my code looks like:
realm.executeTransaction(Realm.Transaction { realm ->
    val lastChildJobFound = getLastChildForParent(realm, parent_id) // runs realm query to get managed child object
    if (lastChildJobFound == null) { // no child exists. Create new parent.
        realm.copyToRealm(NewJobModel(unique_primary_key, parent_id))
    } else { // child found. Link to child.
        val newChildJob = realm.copyToRealm(NewJobModel(unique_primary_key, parent_id)) // first, create new child, now I am going to link below.
        lastChildJobFound.linkNewChild(newChildJob)
    }
})
As you can see, we are running this block of code inside of a Realm transaction to perform writes. I am not using realm.copyToRealmOrUpdate() in this block of code because I thought that when you are dealing with realm managed objects and you update their values on the object inside of a realm transaction block, realm would see those changes on the object and update the values for me.
I have clean installed my app using this code ~8 times and I can see it randomly make a link nil as I was describing above ~7 of those times.
Here is how I am fixing the issue.
realm.executeTransaction(Realm.Transaction { realm ->
    val lastChildJobFound = getLastChildForParent(realm, parent_id) // runs realm query to get managed child object
    if (lastChildJobFound == null) { // no child exists. Create new parent.
        realm.copyToRealm(NewJobModel(unique_primary_key, parent_id))
    } else { // child found. Link to child.
        val newChildJob = realm.copyToRealm(NewJobModel(unique_primary_key, parent_id)) // first, create new child, now I am going to link below.
        lastChildJobFound.linkNewChild(newChildJob)
        // Here to fix issue where Realm was flaky.
        realm.copyToRealmOrUpdate(lastChildJobFound)
        realm.copyToRealmOrUpdate(newChildJob)
    }
})
If I add realm.copyToRealmOrUpdate() for my managed objects after I have updated their values, the flaky behavior goes away. Through all of my testing so far, I have not had any links turn into nil. I have run this code cleanly ~5 times so far, making double the number of parents and children to see if there was a way to break it, and so far there has not been.
This may or may not be a bug with Realm. However, I understood from the Realm docs that once an object is managed by Realm, it is updated for me and I don't have to call realm.copyToRealmOrUpdate().
Is this a bug? Or is it known behavior that I do indeed need to call realm.copyToRealmOrUpdate() in my code whenever I update an object?
Realm version(s): 2.2.1
Realm sync feature enabled: no
Android Studio version: 2.2.3
Which Android version and device: Android 6.0.1
So
// Here to fix issue where Realm was flaky.
realm.copyToRealmOrUpdate(lastChildJobFound)
realm.copyToRealmOrUpdate(newChildJob)
Those two lines solved the problem? Do you need both of them? What does getLastChildForParent(realm, parent_id) look like?
After more testing on my part, it has not solved the problem 100%. It is more stable, but it still fails in about 5 - 10% of the write transactions. I am still getting links removed.
getLastChildForParent():
arrayOf(FooModel::class.java, BarModel::class.java).forEach {
    realm.where(it).findAllSorted("created_at", Sort.DESCENDING)
}
It finds the most recently created child for you.
I have been running more experiments this afternoon.
I have been trying the following now:
Instead of running realm.executeTransaction(), I am running realm.executeTransactionAsync(), as it has a callback for when it succeeds. This ties back to this issue I created dealing with Realm not refreshing after a write transaction. I am trying executeTransactionAsync() to see if it behaves better than executeTransaction() has.
I commented out the 2 realm.copyToRealmOrUpdate() calls. These 2 lines were previously my fix, but after commenting them out, it seems to be working better now? I do not understand why commenting them out makes a difference, but it seems to make a big one.
I want to mention that I am using Kotlin generics quite a bit with my realm operations. My queries and writes are all done using generics. Perhaps realm doesn't work very well with generics? I have a feeling this is my issue. The flakiness started after I refactored my code to use generics.
@levibostian It is more like a timing problem then. Is it possible for you to make a minimal project that demonstrates the problem? It would be much easier to figure out the problem with the complete code logic. Also, please note one thing:
| thread A | Thread B |
| -- | :--: |
| change the db | |
| | Get db changed notification |
| | Call change listeners |
You have to rely on RealmChangeListeners (on Realm/RealmResults/RealmObject) to detect that the db changed on a looper thread, or on waitForChange() on a non-looper thread. Otherwise, even though the db has already been changed on thread A, the change notification has not necessarily been delivered to thread B yet.
@beeender this is why I am considering generics or a memory/GC issue. The models are being created and written to realm successfully. All of the values in the model are saved to the realm except the reference to the child object.
Looking at the original code snippet:
realm.executeTransaction(Realm.Transaction { realm ->
    val lastChildJobFound = getLastChildForParent(realm, parent_id) // runs realm query to get managed child object
    if (lastChildJobFound == null) { // no child exists. Create new parent.
        realm.copyToRealm(NewJobModel(unique_primary_key, parent_id))
    } else { // child found. Link to child.
        val newChildJob = realm.copyToRealm(NewJobModel(unique_primary_key, parent_id)) // first, create new child, now I am going to link below.
        lastChildJobFound.linkNewChild(newChildJob)
    }
})
The call to linkNewChild() is what creates the link. That code looks like this:
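open class FooModel : RealmObject(), ChildInterface { // assumes ChildInterface declares linkNewChild()
    var child_foo_model: FooModel? = null
    override fun linkNewChild(newChild: ChildInterface?) {
        // The `is` check smart-casts newChild, so no explicit cast is needed.
        if (newChild is FooModel) child_foo_model = newChild
    }
}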
It sets a variable, using Kotlin generics, in a RealmObject subclass. The other values of the RealmObject are written to realm successfully.
I have done the following:
override fun linkNewChild(newChild: ChildInterface?) {
    if (newChild is FooModel) child_foo_model = newChild
    if (child_foo_model == null) Log.d("foo", "newChild is null.")
}
and the console does not print. Therefore, child_foo_model is set in memory.
Is the object being garbage collected and the realm reference is lost?
Timing would make sense given how 'flaky' realm has been, with the links saved successfully 75 - 90% of the time. Timing could be the reason it only works sometimes. However, as you saw from my original code snippet, I am writing via 1 transaction. The UI thread getting updates has not been an issue, as it's using an Observable.
@beeender I created a demo app for you! https://github.com/levibostian/RealmFlakyDemo
As you could tell from the screenshot I took from Realm browser, some links are nil. The objects are all being created successfully. All values are set and saved successfully in realm except for some of the links. Everything else is great.
The readme of my demo app, https://github.com/levibostian/RealmFlakyDemo, has been updated with an important update. I am almost certain I have found a solution that works this time. After running the app and performing a few hundred realm transactions, all links are populated. No nil values so far when before I could find a nil link in first 5 - 15 transactions.
This commit is the commit I am talking about. I refactored the code to perform 1 realm transaction instead of 2 separate ones. I am creating a parent model, creating a child model, and linking them both together all in the same transaction now instead of 2 separate ones. This makes me think indeed timing is the issue.
One of my hypotheses was that Realm was running 2 transactions very close together (combined via RxJava .andThen()), which was causing some timing issues.
This issue closely resembles this issue I created as well. With both of these issues I continue to wonder, why does Realm behave this way? Why is it that running 2 separate realm transactions close together sometimes writes data successfully while other times it doesn't? Especially because realm is not throwing any exceptions; it is instead writing nil to the realm database. In my code, I am checking if the realm model is null before I do the linking:
val parentToLinkTo = getParentToLinkTo(realm, pendingApiTask.getPrerequisites())
if (parentToLinkTo == null) {
    realm.copyToRealm(pendingApiTask)
} else {
    val managedChild = realm.copyToRealm(pendingApiTask)
    (parentToLinkTo as PendingApiTask).linkChild(managedChild)
    managedChild.is_linked_to_other_model = true
    // Here to fix issue where Realm was flaky. I could set next pending API task and it linked great. But some time in the future, that link would go nil.
    realm.copyToRealmOrUpdate(parentToLinkTo as RealmObject)
    realm.copyToRealmOrUpdate(managedChild)
}
By checking parentToLinkTo == null, I expected to catch the scenarios where the object was indeed nil, but realm still sometimes writes nil to the database.
I expect that after realm.executeTransaction() completes, the data has been written to the realm and I can query or write data immediately afterwards. I also expect realm to throw an exception, or at least give me a warning, if it is going to write nil to the database. If merging the 2 realm transactions into 1 has indeed solved my issue, that still seems flaky to me.
I am testing your project with fab4202c00f5 (WITHOUT your fix), but strangely I cannot reproduce it on the emulator. 135 parent models were created without any nil child... I will try it on a real device.
I am testing with: google emulator API 25 x86_64
@levibostian I'm not sure that it is related to this issue, but you should not use created_at to find the most recently created object. Multiple objects can end up with the same created_at value (Realm is so fast!). And of course the wall clock is not monotonically increasing.
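One way to act on this advice is to order jobs by a counter that only ever increases rather than by a timestamp. The sketch below is plain Kotlin without Realm; `Task`, `newTask`, `latest`, and `nextSequence` are hypothetical names, and how the counter would be persisted across app restarts is an assumption left open.

```kotlin
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for a job model: `createdAt` is wall-clock time,
// `sequence` is a counter that only ever increases.
data class Task(val name: String, val createdAt: Long, val sequence: Long)

// Process-wide counter. In a real app the next value would have to be
// persisted (for example, seeded from the largest stored sequence) so that
// it survives restarts; that part is not shown here.
val nextSequence = AtomicLong(0)

// Creates a task stamped with both the given clock value and the next sequence number.
fun newTask(name: String, nowMillis: Long): Task =
    Task(name, createdAt = nowMillis, sequence = nextSequence.incrementAndGet())

// Picks the most recently created task by sequence, never by timestamp.
fun latest(tasks: List<Task>): Task? = tasks.maxByOrNull { it.sequence }
```

With this ordering, two tasks created in the same millisecond, or a task created after the clock was adjusted backwards, still sort in creation order.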
@levibostian I cannot reproduce it on the real device either ...
BTW. The Realm Browser I am using is 2.0.1(79)
@beeender
I checked out commit fab4202c00f5 as you did. Here are the results:
nil values found.
nil values. This is the device I did all my previous testing with in this issue. Attached the realm file I pulled from the device.
nil value. Attached the realm file I pulled from the device.
I am using the Realm Browser 2.1.2 (85).
@zaki50
This is something I have in mind and have in the backlog now. I thought this would have been an issue earlier but it has been working for some reason. When the links are not nil, they are correct even if they have the same date. Realm must be saving milliseconds even though in the Realm Browser it does not show them.
@levibostian I just need to confirm with you, since I do have trouble reading RxJava2 :P
protected fun getParentToLinkTo(realm: Realm, prerequisites: List<Class<RealmObject>>): Any? {
    var lastPendingApiTask: RealmObject? = null
    prerequisites.forEach {
        val pendingApiTasksOfType = realm.where(it).findAllSorted("created_at", Sort.DESCENDING)
        if (pendingApiTasksOfType.count() > 0) {
            if (lastPendingApiTask == null) {
                lastPendingApiTask = pendingApiTasksOfType[0]
            } else {
                if ((lastPendingApiTask as PendingApiTask).created_at.before((pendingApiTasksOfType[0] as PendingApiTask).created_at)) {
                    lastPendingApiTask = pendingApiTasksOfType[0]
                }
            }
        }
    }
    if (prerequisites.count() != 0 && lastPendingApiTask == null) {
        throw RuntimeException("CANNOT BE HERE!!");
    }
    return lastPendingApiTask
}
From the logic, it seems the only way that can be nil is if it reaches the line where I throw the exception. Am I correct? Even if the date comparison fails, lastPendingApiTask will still have a value. Right?
@levibostian Also, we fixed some problems recently. If possible, would you mind trying fab4202c00f5 with our latest release on your Galaxy Nexus device?
I will try all my devices tomorrow. Not all of them are rooted, so I hope the exception I added will catch the problem.
@beeender
No, the code you have:
if (prerequisites.count() != 0 && lastPendingApiTask == null) {
    throw RuntimeException("CANNOT BE HERE!!");
}
Will not work. prerequisites.count() can be > 0 and lastPendingApiTask can be null at the same time. I removed all of the job running code from the demo app because it doesn't matter for demo purposes. getParentToLinkTo() looks for a parent to link to, but one may not be there. The potential parent task may have already run and been deleted from Realm.
In this demo app, yes, this code will work and a parent will always exist. But my production app, which the demo was created from, actually runs the jobs, so a parent may not exist.
My Pixel is not rooted and I did my testing on it. Check out my wiki page to see how to pull the realm file from a non-rooted phone.
I am upgrading the realm version now and running all the previous tests again.
@beeender
Ran commit fab4202c00f5 with realm version 2.2.2. Results:
nil values. Last time didn't have any.
nil values. Same as last time.
nil values. Last time found 1.
@levibostian Thanks. I will do some more testing tomorrow.
BTW, the realm instance in performRealmTransaction is not closed. Since you only access data inside the transaction block, where the db should always be at the latest version, it should not be a cause of this problem. But be careful in production code: it can cause an unexpectedly large db file, leaks, and some strange data version issues on non-looper threads.
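A minimal sketch of the close-after-use pattern, written against a stand-in class so it runs without Android. `FakeRealm` and `performTransaction` are hypothetical names. The assumption is that the real Realm instance is Closeable, so Kotlin's `use` could wrap it the same way.

```kotlin
import java.io.Closeable

// Hypothetical stand-in for a Realm instance so the sketch runs without Android.
// Assumption: the real instance is also Closeable.
class FakeRealm : Closeable {
    var isClosed = false
        private set

    // Runs `block` as if it were one write transaction.
    fun executeTransaction(block: (FakeRealm) -> Unit) {
        check(!isClosed) { "instance is already closed" }
        block(this)
    }

    override fun close() {
        isClosed = true
    }
}

// Opens an instance, runs one write transaction, and always closes the
// instance, even when the transaction block throws.
fun performTransaction(open: () -> FakeRealm, block: (FakeRealm) -> Unit) {
    open().use { realm -> realm.executeTransaction(block) }
}
```

`use` closes the instance in both the success and the failure path, so there is no separate close() call to forget after the transaction.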
Oh? I thought that running a realm transaction closed the realm for me after the block was done executing. I will add a realm.close() statement after the executeTransaction() block in prod code.
Realm files from running demo app with latest commit: 72eeb493b5b1 known as the fix commit. Ran on all 3 devices again. No nil values found for any of them.
Today I will try this fix in my prod app and see how it runs. Then I can at least move forward with my app while we work on improving the flakiness here.
Thank you for all the help, team. You are all super helpful!
@levibostian Do you have any updates to share?
@kneth This issue is actually pending on me. I haven't got time to investigate more on @levibostian 's demo project yet.
This issue is on hold, pending @beeender 's investigation. No ETA yet.
@levibostian Sorry for taking such a long time to get back to this issue.
After checking your project on fab4202c00f5, I think the problem is just what you fixed in 72eeb493b5b1: there were two different transactions creating the parent and the child separately.
transaction_1
transaction_2 - link child to parent in transaction_2
So when you do adb pull, the realm file could be in a state where transaction_1 has just finished but transaction_2 has not started or is still in progress. In both cases, the Realm browser only shows the data committed in transaction_1. At that point the parent's linkToChild has not been called, so the child could be nil.
The fix 72eeb493b5b1, which performs both the parent and child data changes in one transaction, should fix the problem: whenever you pull the realm file, the child will be linked to the parent, since they are written in the same transaction.
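The difference between the two cases can be shown with a small in-memory model. This is a sketch of the reasoning only, not of Realm internals. `Store`, `twoTransactions`, and `oneTransaction` are hypothetical names, and the map from job id to child id stands in for the realm file.

```kotlin
// Hypothetical in-memory store: `committed` is what a reader (or a copy of the
// file pulled from the device) would see; writes only become visible on commit.
class Store {
    private var committed: Map<String, String?> = emptyMap() // job id -> child id

    // Applies all writes in `block` atomically, like one write transaction.
    fun transaction(block: (MutableMap<String, String?>) -> Unit) {
        val working = committed.toMutableMap()
        block(working)
        committed = working.toMap()
    }

    // What an outside observer sees right now.
    fun snapshot(): Map<String, String?> = committed
}

// Two transactions: a snapshot taken in between shows the parent with no child.
fun twoTransactions(store: Store, onBetween: (Map<String, String?>) -> Unit) {
    store.transaction {
        it["parent"] = null
        it["child"] = null
    }
    onBetween(store.snapshot())
    store.transaction { it["parent"] = "child" }
}

// One transaction: no snapshot can ever contain the parent without its link.
fun oneTransaction(store: Store) {
    store.transaction {
        it["parent"] = null
        it["child"] = null
        it["parent"] = "child"
    }
}
```

In `twoTransactions`, the callback plays the role of adb pull landing between the two commits. In `oneTransaction` there is no such window, because observers only ever see the state before or after the single commit.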
Thank you for giving it thought, @beeender.
I can see that scenario happening. Especially since I tested on both new/fast devices and slow/old devices, which were giving me different results. A race condition would make sense.
I cannot see a way to improve the realm SDK's API to prevent this from happening in the future. Has the realm team seen others discuss this same issue before? If so, would it be helpful to stick a tip in the docs? If I am a loner, we can close the issue and hope I never do it again, haha.
Thank you!