Realm-cocoa: Confine on dispatch queue rather than thread

Created on 3 Nov 2014 · 58 Comments · Source: realm/realm-cocoa

tl;dr Realm uses thread ids and throws exceptions when they change. Checking thread ID is not acceptable under GCD.

Realm docs state objects are not gcd queue safe and when running under GCD:

you must get an RLMRealm instance in each thread/dispatch_queue in which you want to read or write

I do this:

// Local copy of the persisted task valid for the worker queue.
__block OFEPersistedTask* taskCopyOnWorkerQueue = nil;

NSString* taskID = aTask.taskID;
dispatch_sync(_workerQueue, ^{
    NSPredicate *pred = [NSPredicate predicateWithFormat:@"taskID = %@", taskID];
    RLMResults *results = [OFEPersistedTask objectsWithPredicate:pred];
    taskCopyOnWorkerQueue = [results firstObject];
});

To get a copy of my Task (RLMObject subclass) on the correct queue. Unfortunately, in RLMVerifyAttached Realm checks the thread ID through RLMCheckThread (i.e. not queue) which does not work under GCD.

A GCD queue can run its tasks on many different threads - that's kind of the point. From the Apple docs:

The currently executing task runs on a distinct thread (which can vary from task to task)

My emphasis... So, even if I grab a RLMObject on my queue I still risk getting killed (actually, no risk, I do get killed) later for running on the wrong thread.
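
A minimal sketch of how this could be observed, assuming only Foundation and pthreads: it records the system thread ID of each block submitted to a single serial queue. The helper name `CountDistinctThreads` and the queue label are made up for illustration, and `pthread_threadid_np` is Apple-specific. The result depends on the scheduler, so it may well be 1 on a given run.

```objc
#import <Foundation/Foundation.h>
#import <pthread.h>

// Hypothetical helper: runs `blockCount` blocks on one serial queue and
// returns how many distinct threads ended up executing them.
static NSUInteger CountDistinctThreads(NSUInteger blockCount) {
    dispatch_queue_t queue =
        dispatch_queue_create("com.example.worker", DISPATCH_QUEUE_SERIAL);
    NSMutableSet<NSNumber *> *threadIDs = [NSMutableSet set];

    for (NSUInteger i = 0; i < blockCount; i++) {
        dispatch_async(queue, ^{
            uint64_t tid = 0;
            pthread_threadid_np(NULL, &tid); // NULL means "the current thread"
            // Blocks on a serial queue never overlap, so the set needs no lock.
            [threadIDs addObject:@(tid)];
        });
    }

    // A serial queue is FIFO, so this returns only after all blocks above ran.
    dispatch_sync(queue, ^{});
    return threadIDs.count;
}
```

Any result greater than 1 means the same queue was serviced by more than one thread, which is exactly the situation a thread ID check rejects.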

The only workaround I can think of is to store a primary key with __block scope and then perform another Realm fetch at the beginning of every task. That feels so suboptimal I don't think you can claim Realm is compatible with GCD _at all_.

Blocked T-Enhancement

rogernolan

Most helpful comment

If you don't hold on to an RLMResults by passing it directly to -[RLMResults addNotificationBlock:], the query will never be executed on the main thread, it will only ever be executed in the background, and the notification block will then hand over those results back to the main thread:

// on main thread
self.results = nil;
self.token = [[MyModel objectsWhere:someExpensivePredicateQuery] addNotificationBlock:^(RLMResults<MyModel *> *results, NSError *error) {
    // back on the main thread
    self.results = results;
    [self.myViewController updateUI];
}];

jpsim on 18 Mar 2016

👍5

All 58 comments

Fetching the object by primary key in each task is probably less inefficient than you're expecting if you're used to other DB systems, but it would indeed be nice to be able to avoid it. The obvious idea of pinning Realms to dispatch queues rather than threads doesn't work since you can't actually reliably determine what queue you're running on (hence why dispatch_get_current_queue() is deprecated). We'd have to wrap the GCD dispatch functions, which of course is awkward and doesn't play well with others.
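
For context, GCD does offer a narrower check than "which queue am I on": a queue you create yourself can be tagged with `dispatch_queue_set_specific`, and a block can then ask `dispatch_get_specific` whether it is running in that queue's context. The sketch below is only an illustration of that mechanism, not something Realm does according to this thread; the names `MakeTaggedQueue`, `IsOnTaggedQueue` and `kWorkerQueueKey` are hypothetical. It only works for queues you tag yourself, and it also reports true on any queue that targets a tagged one, so it does not solve the general problem described above.

```objc
#import <Foundation/Foundation.h>

// The address of this variable serves as a unique key (hypothetical name).
static char kWorkerQueueKey;

// Creates a serial queue and tags it with the key above.
static dispatch_queue_t MakeTaggedQueue(void) {
    dispatch_queue_t queue =
        dispatch_queue_create("com.example.tagged", DISPATCH_QUEUE_SERIAL);
    // Key and value are the same pointer; no destructor is needed.
    dispatch_queue_set_specific(queue, &kWorkerQueueKey, &kWorkerQueueKey, NULL);
    return queue;
}

// Returns YES when the caller runs on a tagged queue (or a queue targeting one).
static BOOL IsOnTaggedQueue(void) {
    return dispatch_get_specific(&kWorkerQueueKey) == &kWorkerQueueKey;
}
```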

We're actively working on functionality for passing query results between threads, which'll sidestep the issue somewhat, but it probably won't be a totally transparent thing.

tgoyne on 3 Nov 2014

Agree that wrapping gcd functions is not desirable. Passing objects between threads is obviously the way to go. Having that as seamless as possible* is important since threads are getting less and less visible/relevant

In the meantime, this should be a documentation bug report. Your docs imply that this should work.

* I'm sure it's not going to help but one pattern I'd be happy with is to have as-simple-as-possible read only access on a different thread and take the reload hit when I need to write. I'm imagining a method like -(instanceType)threadSafeReadOnlyCopy;. I could call this once at top level scope before creating my gcd blocks.

rogernolan on 4 Nov 2014

I have a similar issue with my SAX parser, and I don't want to subclass NSThread, because it would be a lot more complex. My code sample:

@interface JsonSAXParserDelegate ()
@property(nonatomic) dispatch_queue_t queue;
@property(nonatomic, strong) RLMRealm *realm;
@end

@implementation JsonSAXParserDelegate {
    RLMItem *_currentItem;
    NSThread *_check;
}

- (id)init {
    self = [super init];
    if (self) {
        _queue = dispatch_queue_create("com.RLMWriter.queue", NULL);
        dispatch_async(_queue, ^{
            _check = [NSThread currentThread];
            self.realm = [RLMRealm defaultRealm];
        });
    }

    return self;
}

- (void)startJson {
    NSLog(@"parserFoundJsonBegin");
    dispatch_async(_queue, ^{
        if (_check != [NSThread currentThread]) {
            NSLog(@"Thread changed. Get ready to crash!!!");
        }
        [self.realm beginWriteTransaction];
    });

}

- (void)startObject {
    _currentItem = [RLMItem new];
}

- (void)endObject {
    RLMObject *item = _currentItem;
    dispatch_async(_queue, ^{
        if (_check != [NSThread currentThread]) {
            NSLog(@"Thread changed. Get ready to crash!!!");
        }
        [self.realm addOrUpdateObject:item];
    });
}

- (void)parseObject {
    //fill _currentItem
}


- (void)endJson {
    dispatch_async(_queue, ^{
        [_realm commitWriteTransaction];
        NSLog(@"Items - %lu", (unsigned long)[RLMItem allObjects].count);
    });
}
@end

Sometimes it works, sometimes I catch an exception... Is there any way to make it work?

I've commented RLMCheckThread method, and now my code works without issues, but I assume that was a bad idea...

_UPD:_ this link may be useful http://stackoverflow.com/a/26831567/1135154

Kirow on 8 Nov 2014

Absolutely makes sense. Break on those unhandled exceptions and you'll see the queue is the same but the thread changes. Realm is not GCD safe.

I've commented RLMCheckThread method, and now my code works without issues, but I assume that was a bad idea...

Very bad idea. The check is presumably there for a reason. I'm guessing that sometimes a property fetch will fail and the access to the realm to back fill that thread is going to fail. Although this check looks like it's just throwing an exception when it doesn't _need_ to, it's actually alerting you to a bug you have which, if left, would be much more difficult to find, even if it wouldn't happen every time.

The solution is the same for you as it was for me: in your SAX parse, store the primary key of the object you are currently building and fetch it from Realm at the start of every operation. Like @tgoyne said, it's pretty fast (and not a lot of extra code).

rogernolan on 10 Nov 2014

The check was added specifically because without it things appear to work most of the time, but you're almost guaranteed to eventually hit crashes and we really don't want people to discover that they have to heavily rework their app late in the development process (or even worse, after shipping to users and getting crash reports) to avoid something they thought was fine.

For your specific use-case, is there any reason that you can't just run the whole thing in a single GCD dispatch? There isn't any obvious reason why the individual bits need to be async.

tgoyne on 10 Nov 2014

I don't see a way to do it in a single GCD dispatch, because of the delegate's event-based structure. I could only do it by making a transaction for each object, but that is a bad idea. The other way is to create a thread for it, but I don't have much experience with that, and my current implementation loses ~10% of performance. Maybe I should think about how to make the parser block-based; that would probably let me use only one GCD block.

Kirow on 10 Nov 2014

I also encountered this issue where I get a RLMObject in one thread (not relevant) and then I start some work on another dispatch_queue and I get "Realm accessed from incorrect thread" when reading values of the object. It would be so nice to have a "queue-safe read-only copy or view of the object" here...
Any solution?

vittoriom on 14 Jan 2015

I also encountered this issue and would be very interested in a solution.

ebluehands on 16 Jan 2015

@vittoriom and @ebluehands: we're actively working on this functionality in the form of "thread handoff": e.g. run a query which returns objects on separate threads. We'll post to this issue once we have more to share.

In the meantime, you can work around most of these issues by passing around primary keys to objects between threads rather than the objects themselves (similar to using NSManagedObjectIDs in Core Data). This does mean that you'll have to re-fetch the object on the target thread, which we understand is sub-optimal, which is why we're actively working on thread handoff.
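
A minimal sketch of the primary key hand-off described here, assuming a made-up `Task` model whose primary key is `taskID`. The model, the helper `LogTitleOnQueue` and the queue argument are illustrative only; `+objectForPrimaryKey:` looks the object up in the default Realm of whichever thread happens to run the block.

```objc
#import <Realm/Realm.h>

// Hypothetical model; its primary key is what travels between queues.
@interface Task : RLMObject
@property NSString *taskID;
@property NSString *title;
@end

@implementation Task
+ (NSString *)primaryKey { return @"taskID"; }
@end

static void LogTitleOnQueue(Task *task, dispatch_queue_t workerQueue) {
    // Read the key on the thread that owns `task`; an NSString is safe to pass.
    NSString *taskID = task.taskID;

    dispatch_async(workerQueue, ^{
        @autoreleasepool {
            // Re-fetch inside the block. The object is used only here and is
            // not kept for later blocks, which may run on another thread.
            Task *localTask = [Task objectForPrimaryKey:taskID];
            if (localTask == nil) {
                return; // The object may have been deleted in the meantime.
            }
            NSLog(@"%@", localTask.title);
        }
    });
}
```

The re-fetched object reflects the latest committed state, so it may differ from what the calling thread saw when it read the key.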

jpsim on 16 Jan 2015

@jpsim thanks for the follow up, I did fetch the object on the target thread indeed, and for now it will do the job. But it's in the middle of a complex async set of blocks dispatched in a dispatch group so it would look much cleaner if I could just share the object (in read-only mode) on other queues as well.
I'll watch for more comments on this issue! Have a good weekend

vittoriom on 17 Jan 2015

@jpsim, @tgoyne, what about entities that don't have a primary key defined? AFAIK, NSManagedObjectID exists for every NSManagedObject once it's been added to the DB. Primary Keys, on the other hand, are optional. In some cases it's even recommended against having PKs for Realm entities: https://github.com/realm/realm-cocoa/issues/1234#issuecomment-67192865

ghost on 28 Jan 2015

In some cases it's even recommended against having PKs for Realm entities: #1234 (comment)

That comment was from a while ago. Primary keys are supported and you're encouraged to use them in cases in which you need a unique identifier for your objects.

We've opted against going the Core Data route of unconditionally adding an additional column on each model for the sake of global object ID's, but you're free to do so yourself.

The thread handoff solution we're working on will support all objects, not just ones with primary keys.

jpsim on 28 Jan 2015

I'm a bit confused about what the current status of this is? I've been using realm with GCD by grabbing the id of the object before the dispatch and then querying for it on that thread. But this can get messy and sometimes I don't have the id. Plus, this can get quite messy keeping track of which thread I'm on. What's the current best practice for using realm on another thread to keep it async from the UI?

tettoffensive on 17 Mar 2015

Not sure it's a best practice, but what I'm doing in the project I'm working on is:

- Persistence layer has a reference to a "realm factory". In the simplest case, it just returns RLMRealm.defaultRealm every time. (Added bonus: in integration tests I can inject a "test factory" that reads and writes to a bundled database with known content.)
- Persistence layer driver begins every read/write on the persistence layer with a writeInTransaction or readInTransaction call. This internally just dispatches the closure argument to a background queue, or calls transactionWithBlock on the realm.
- All the reads and writes done in the persistence layer reference the factory-generated realm (so there's no problem of accessing a realm on the wrong thread at any time).
- Immediately after reading some data, the persistence layer converts the RLMObject object to a lightweight struct so that it can be exposed to outer layers and passed around threads without any danger.

Hope it helps :)

vittoriom on 17 Mar 2015

@tettoffensive I think the easiest way is to get your realm on each thread: let realm = RLMRealm.defaultRealm(), then fetch the objects you want on that thread. If you want to pass objects, you can create an NSArray of primary keys and then fetch them again from the main thread. For the most part, Realm is fast enough to be run on the main thread, so you don't have to worry about performance issues.

yoshyosh on 17 Mar 2015

@yoshyosh the problem with this approach is that one would usually need the objects corresponding to the IDs in the same order that the IDs were, which is not currently available. There is no + (instancetype)objectsInRealm:(RLMRealm *)realm forPrimaryKeys:(NSArray *)primaryKeys
Example: if the heavy lifting is done in a background thread and as part of that work the results are being ordered. Then, on the main thread, I get an array of IDs which I want to expand to full objects. And I believe, more often than not, I'd want the expanded objects to be in the exact same order.
See https://github.com/realm/realm-cocoa/issues/1325
I.E. you would have to re-order the results on the UI thread or fetch them one by one and assemble an array of results.

ghost on 17 Mar 2015

I wanted to use the DAO (DataAccessObject) pattern with Realm to encapsulate common persistence operations.

- Create a protocol that defines createOrUpdate, findBy, listBy, delete . .
- Create a Realm implementation

It didn't go well because the returned results would be accessed from another thread. I'd need either:

- Copy to another thread ability.
- Ability to 'detach' objects from Realm to behave like regular in-memory objects, and subsequently reattach (have seen this in some ORMs in other languages - maybe adds unwanted complexity, but just mentioning).

jasperblues on 31 Mar 2015

It's been a while since there has been any discussion on this, so, here's me adding my +1

kunalsood on 6 Jul 2015

We just merged support for delivering results from one thread to another in the underlying database engine, and I've been working on exposing that in the Cocoa binding.

segiddins on 6 Jul 2015

@vittoriom

Immediately after reading some data, the persistence layer converts the RLMObject object to a lightweight struct so that it can be exposed to outer layers and passed around threads without any danger

Curious as to what this looks like in practice? IMO your solution is the ideal one, but do you store non-ORM blobs in Realm, or do you have two forms of each Model, one that subclasses Realm and one that doesn't (the latter being your immutable struct)

zdavison on 18 Jul 2015

@zdavison

The latter option, we have two "forms" (I like to call them "views", since for some complex cases we have even more than 2) of each model. We have a subclass of RLMObject for our persistence layer and a simple immutable struct to pass around threads and functions. Also, we have an object that takes care, given an object of type 1 (subclass of RLMObject), to return an immutable struct of type 2. This way we leave the two things very dummy and passive (plus, immutable structs have a number of advantages), and we keep the knowledge to convert one to another in a separate testable object.

I know it may look like a lot of work for every new model object, but eventually it's mechanical enough that one can do it in a few minutes, and it brings simplicity and advantages that cannot be beaten IMHO. Also, as I wrote before, for very complex model objects (by that I mean objects that have a high number of relations and deeply nested children), we have not one but two struct "views": a lightweight one to pass around when you need just some kind of "metadata", and the full one when you need to access every bit of information :)
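
A rough Objective-C analogue of the "two forms" pattern described above, offered only as a sketch: the comment talks about Swift structs, and here an immutable NSObject subclass plays that role. `PersistedTask`, `TaskSnapshot` and `SnapshotFromTask` are hypothetical names, not the commenter's code.

```objc
#import <Realm/Realm.h>

// Hypothetical persisted model, touched only inside the persistence layer.
@interface PersistedTask : RLMObject
@property NSString *taskID;
@property NSString *title;
@end

@implementation PersistedTask
@end

// Plain immutable value object that can be handed to any thread or queue.
@interface TaskSnapshot : NSObject
@property (nonatomic, copy, readonly) NSString *taskID;
@property (nonatomic, copy, readonly) NSString *title;
- (instancetype)initWithTaskID:(NSString *)taskID title:(NSString *)title;
@end

@implementation TaskSnapshot
- (instancetype)initWithTaskID:(NSString *)taskID title:(NSString *)title {
    if ((self = [super init])) {
        _taskID = [taskID copy];
        _title = [title copy];
    }
    return self;
}
@end

// The separate, testable converter. Call it on the thread that owns `task`;
// the returned snapshot no longer refers to Realm at all.
static TaskSnapshot *SnapshotFromTask(PersistedTask *task) {
    return [[TaskSnapshot alloc] initWithTaskID:task.taskID title:task.title];
}
```

The snapshot is a copy taken at conversion time, so later writes to the persisted object are not reflected in it.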

vittoriom on 18 Jul 2015

Not sure if this idea makes sense as I haven't been using Realm for long, but would it make sense to implement detach and attach methods? Where detach makes the object into a PONSO (plain old Objective-C object) and attach reattaches to Realm. So if I'm about to call a block:

[myRealmObject detach];
[myService doSomethingWithResult:^(id result) {
    [myRealmObject attach]; //Reattaches using primary key
    //continue
}];

When detaching there'd need to be rules around eagerly loading related objects/collections:

- Throw an error
- Perhaps allow detach and loading specified related entities eagerly.

Again apologies, if I've misunderstood something as I'm new to Realm.

jasperblues on 2 Sep 2015

You can essentially do that now by passing a persisted object to an RLMObject initializer, initializing an in-memory object with the value of a persisted one, or vice-versa.

jpsim on 2 Sep 2015

@jpsim Awesome - can you point me at a code sample?

jasperblues on 2 Sep 2015

Person *inMemoryPerson = [[Person alloc] initWithValue:persistedPerson];

jpsim on 2 Sep 2015

Thanks @jpsim . . that looks ok. What I was doing before was:

NSString *uuid = person.primaryKey;
[service doSomethingOnNewThread:^{
    Person *person = [Person objectForPrimaryKey:uuid];
    //continue
}];

jasperblues on 2 Sep 2015

If all you need to do is re-fetch an object, your approach is preferable. The reason Realm doesn't do this automatically for you is that the object may have changed between those two threads, so keep that in mind.

jpsim on 2 Sep 2015

Thank you @jpsim

jasperblues on 2 Sep 2015

@jpsim do you mean the in-memory object is safe across thread?

siuying on 9 Sep 2015

@jpsim do you mean the in-memory object is safe across thread?

Standalone RLMObjects are just plain old NSObjects, with the same threading implications as if you were inheriting from NSObject.

jpsim on 9 Sep 2015

@jpsim should creating a standalone object by using initWithValue work also if the object has a list of persisted child objects? Those don't seem to be created standalone, or is there some trick to it?

anlaital on 9 Sep 2015

@anlaital no, that would've been implemented via https://github.com/realm/realm-cocoa/pull/2043 but we decided not to merge it

segiddins on 9 Sep 2015

@segiddins what's the recommended approach for detaching a complex persisted object and doing changes on it that may or may not be persisted in the future?

anlaital on 9 Sep 2015

@anlaital you'll have to do your own recursive traversal of the object graph, creating standalone copies of every object you encounter -- if you really do want a 'detached' copy of your object graphs

segiddins on 9 Sep 2015

Hi, I just wanted to ask how this is going. As was stated in another issue that is now closed, relying on fetching with primary keys is far from optimal, especially since every time that fetch is performed, a new instance is created.

Do you have any update on the work you're doing for native cross-thread object sharing/passing?

jrpinteno on 24 Sep 2015

We're working on this. There's some major architectural work needed for this to happen.

jpsim on 24 Sep 2015

Thanks for the update - just want to upvote priority for this issue as well. I recently decided to migrate to Realm, and this made me have to do all the work in the main queue because that's the only queue guaranteed to be executed on the same main thread. Had to do lots of Object(value:) copies for background stuff, and had to change from object references to storing child object IDs - the lack of this ability basically changed how I lay out the whole data flow.

hyouuu on 25 Sep 2015

I'm currently making immutable copies of view objects (as Swift structs) that are safe to pass across threads. Built-in ways to handle the cross-thread use case would definitely be helpful, as we have more cores than ever now on iOS.

siuying on 25 Sep 2015

@hyouuu just wanted to ask, what is the problem of executing realm queries on some particular background thread and do these Object(value:) copies to deliver objects to the main thread instead?

ReDetection on 25 Sep 2015

@ReDetection if you execute queries using GCD, a single thread is not guaranteed. If you use a separate NSThread for this task, it will be much slower. So I think GCD serial queue support is badly needed. In fact, if I remove the exception from the Realm code, it works fine, but I was told not to do that.

Kirow on 25 Sep 2015

@ReDetection - @Kirow explained it well - dispatch to a particular queue doesn't guarantee it will be the same thread, except for the main queue

hyouuu on 25 Sep 2015

@hyouuu after I migrated to Realm, I refactored a lot of code from passing objects across threads to passing object IDs instead. It's a bit annoying, as much of that code is now coupled with Realm fetch and save logic. But you can safely offload heavy tasks to other queues.

siuying on 25 Sep 2015

@siuying initially I kept my parent - child using List
