Realm-cocoa: Syncing - Distant Future?

Created on 16 Sep 2014 · 174 Comments · Source: realm/realm-cocoa

Super long term, maybe not even practical, but perhaps we could see a syncing mechanism similar to Ensembles. Ensembles is really great for syncing, and as of now it would be the biggest reason for me to use Core Data over Realm for projects that require syncing.

T-Feature

Most helpful comment

Hey Everyone, we've made a first attempt at the syncing problem. We have a small library to handle the syncing, packaged in a node module.

Our solution is to wrap the create, update, and delete Realm methods to provide additional logic for our syncing algorithm. It adds unique IDs to Realm objects that need to sync and puts changes in a schema-less queue that can be synced remotely and resolved when a device needs to sync.

Again, it's just a start, but I think it can be helpful to anyone looking to get syncing to work in their apps. Would love suggestions.

https://github.com/KiteSync/realm-sync-js
https://www.npmjs.com/package/realm-sync-js
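The wrapping approach described in this comment can be sketched roughly as follows. This is a minimal illustration and not the realm-sync-js API: `SyncQueueStore`, `apply_changes` and the in-memory dictionaries are hypothetical stand-ins for a real Realm, assuming each synced object gets a UUID and every write is recorded in a plain list.

```python
import uuid


class SyncQueueStore:
    """Hypothetical in-memory store that records every write in a change queue."""

    def __init__(self):
        self.objects = {}  # sync id -> dict of fields
        self.queue = []    # schema-less list of pending change records

    def create(self, fields):
        sync_id = str(uuid.uuid4())  # unique ID attached to each synced object
        self.objects[sync_id] = dict(fields)
        self.queue.append({"op": "create", "id": sync_id, "fields": dict(fields)})
        return sync_id

    def update(self, sync_id, fields):
        self.objects[sync_id].update(fields)
        self.queue.append({"op": "update", "id": sync_id, "fields": dict(fields)})

    def delete(self, sync_id):
        del self.objects[sync_id]
        self.queue.append({"op": "delete", "id": sync_id})

    def drain(self):
        """Return the pending changes and clear the queue (e.g. after an upload)."""
        pending, self.queue = self.queue, []
        return pending


def apply_changes(store, changes):
    """Replay a drained queue on another store without re-queuing the changes."""
    for change in changes:
        if change["op"] == "create":
            store.objects[change["id"]] = dict(change["fields"])
        elif change["op"] == "update":
            store.objects.setdefault(change["id"], {}).update(change["fields"])
        elif change["op"] == "delete":
            store.objects.pop(change["id"], None)
```

A second device would call `apply_changes` with the queue it downloads; how conflicting changes are resolved is deliberately left out of this sketch.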

All 174 comments

UPDATE August 2015: if you are interested in this feature and want to voice your interest, please help us by taking the time to write a paragraph or two explaining how _you_ expect this feature to work (from your point of view) — it will be tremendous help! At this point, understanding specific usage scenarios is key to us delivering a working implementation.


Realm’s internals are actually built for sync. As we publicly stated, we’re hard at work on that at the moment but we’re not quite ready to talk about it yet. We’ll probably hold private preview events about sync in the near future, like we are doing next week for Realm for Android.

Cool!

Donald Pinckney


Just to add our two cents, following on from your REST API examples: it would be great to be able to keep a _partial_ dataset on the client. Some frameworks such as Firebase keep a complete replica of the data on the client, whereas Parse allows you to 'pin' data to your local store (although Parse doesn't have a strong focus on sync at the minute). If you imagine using your FourSquare example with Parse-sync, then keeping the whole DB replicated on the device isn't going to be possible.
Do you think partial DBs are likely to be supported?

In terms of back ends, I appreciate it's hard to create something super generic, but we'd love to be able to use the GAE for our servers with the possibility of switching to other providers at a later date. As such, a well documented REST API that realm-sync requires to work would be great to allow us to implement if one isn't already available.

Could we have any rough timescale on sync? We're about to start rolling our own over here...

Also, can you provide a unicorn factory method please ;)

Hi Sam, yes, we’re considering partial DBs and developers’ needs for flexible hosting options.

Re: timescale, sync should be coming out right before our unicorn factory method ;) In all seriousness, it’s too early for us to commit to any sort of timeline but we’re working actively towards it!

any updates on sync?

Nothing we can share publicly yet, but we’re making good progress. This is a complicated feature and we want to make sure it meets our expectations. Aside from the public preview events we’ll hold, we’ll probably also pick a few people that have asked about the feature to become beta testers, so add your :+1: here if you are interested. UPDATE August 2015: :+1:'s no longer needed but feel free to subscribe to the issue if you want to be notified of any betas.

:+1:

:+1: added then!

:+1:

:thumbsup:

👍 definitely


We used Core Data in the end and rolled our own sync so I fully appreciate the complexity!

Still, I'd like to add a :+1: as I think I could put Realm sync through some tough conditions :)

Does anybody have suggestions on using Realm with, e.g., Firebase? I thought of doing data sync with that; it would be a perfect match. Does anyone have experience there? Has anyone done that before?

On 12.01.2015 at 18:27, Sam Duke [email protected] wrote:

We used Core Data (we wanted multi-process support in order to do the sync) in the end and rolled our own sync so I fully appreciate the complexity!


:+1:

👍👍👍

:+1:

:+1:

:thumbsup:

:thumbsup:

:+1:

:+1:

:+1:

:+1:

:+1:

:+1:

:+1:

:+1:

👍

image

:+1:

:+1:

Image of Thumbs up for beta testing

:+1:

any update on sync ?

:+1:

:+1: are getting more creative over time :dancer: any update? :smile:

:+1: :+1: :+1: :+1: :+1: :+1:

:+1:

:+1:

:+1:

👍

@timanglade Do you have any update for us ? Don't want to go back to the core data again for my new project.

Sorry, nothing new to report here, we see the upvotes and take (very) good note of them, but as you can tell from our activity elsewhere on this project, we are focusing on building the best possible on-device database first, and we still have a long way to go there. We don’t want to compromise the experience you have by adding too many big features too quickly, but we understand that sync is something that is extremely important to y’all.


:+1:

Thanks for the update @timanglade. Much appreciated.

👍🏼👍👍🏻👍🏽👍🏾👍🏿

:+1:

:+1:

:+1:

:+1:

:+1:

:+1:

:+1:

:+1:

👍🏻

:+1:

:+1: I am looking into CouchbaseLite at the moment specifically due to its support for built-in sync (plus doesn't require iCloud). But I'm still keenly interested in Realm too. YapDatabase has CloudKit sync, but I don't want to be limited to just iCloud.

:+1:

👍👍👍👍

👍🏼

:+1:

👍👍 please :+1: :+1:

:+1:

:+1::+1::+1:

:+1:

:+1:

:+1: :+1: :+1: :+1:
I'm looking forward to this feature.

👍👍🏻👍🏼👍🏽👍🏾👍🏿

:+1:

:+1:

@timanglade any update on this?

No update yet, no. Instead of :+1:s it would be very helpful if people could write us at least a sentence (ideally a lot more) explaining how they would like this feature to work.

Here is a scenario. I have an app in development. It is a vehicle mileage and expense tracking app. It saves everything to the Realm database. However, sometimes my wife might drive my car, so we would both have the app and both need to update the "Vehicle" object that represents my car. It would be nice to provide a way to sync that. This is probably an easier sync than most others, because nobody is likely to be updating the mileage or oil change on a single vehicle from multiple places at the same time. But you never know: if a phone needs to be restored, I want to keep all "shared" vehicles synchronized between all devices in question.

For my use I would like something like ensembles.io with peer-to-peer sync and compatible with various back-end services like Dropbox, CloudKit, own-server...
ensembles.io is a replacement for, and a huge improvement on, iCloud Sync, and it's very easy to add to existing Core Data projects, but Android is left out and that's why Realm with sync would be ideal.

Some essential features of the sync mechanism I'd like to see:

  • Data to be synced across multiple clients (iOS, Android, Web, WP7 etc)
  • Roles/permissions for objects
  • Server code (similar to Parse's cloud code utility) in order to define validation rules or execute code which has the ability to bypass roles/permissions

The above could be accomplished by a paid server hosting service by Realm (BAAS?) or an option to deploy a server library to your own server (similar to what Couchbase does).

Another solution could be to create multiple REST API examples of how Realm can be used, eg. todo list with relationships being synced or a social networking app based off Parse (or another BAAS/REST API).

Hope that gives some more helpful suggestions @timanglade!

I've already added a couple of our cents to this, but to summarise and add a little:

  • Partial sync
  • Flexible hosting (GAE compatibility is a MUST to work with the rest of our server stuff): ideally a documented REST API (RAML? API Blueprint?), a choice of on-the-wire formats (JSON? protobuf? FlatBuffers?), and maybe your own implementation as suggested by @kermankohli
  • Customisable merge conflict resolution strategies - exposure to these internals will be very important.
  • Cross platform. And of course, the only reason for picking this over core-data is it works on multiple platforms

I'd like to be able to sync a collection of items between devices and platforms using the simplest conflict resolution rules, where the last change wins.
It would be great if the sync could be done using multiple services like Dropbox or a self hosted server or a Realm hosting service.
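For illustration, a "last change wins" rule could look like the sketch below. It assumes every field is stored together with the timestamp of its last change; the function name and the `{key: (timestamp, value)}` layout are assumptions made for this example, not anything Realm provides.

```python
def merge_last_write_wins(local, remote):
    """Merge two {key: (timestamp, value)} maps; the newer timestamp wins per key.

    On equal timestamps the local entry is kept. A real system would need a
    deterministic tie-breaker (for example a device ID) so that all devices
    converge on the same value.
    """
    merged = dict(local)
    for key, (timestamp, value) in remote.items():
        if key not in merged or timestamp > merged[key][0]:
            merged[key] = (timestamp, value)
    return merged
```

The rule is simple, but it depends on device clocks being reasonably close to each other, and the older of two conflicting edits is silently discarded.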

:+1:
Why? So that the data is kept and even if the user changes device, has more than 1 device or removes the app and installs it again, the data is still there.
How? I add a setting in info.plist saying "RealmPersistant: TRUE" and done.

@desduvauchelle I don't think that would be possible as you describe. You need some way of identifying who the data belongs to. I.e. you need some login mechanism.

A very simple sync service that:

  • can be self hosted or provided as a service
  • follow simplest conflict resolution rules where the last change wins
  • includes a web UI to look into various DB repos, export, debugging, etc.
  • Cross platform (but not a deal breaker)
  • so easy to set up and use that the village idiot can figure it out. : )

  1. Detailed diffs / change notifications are important (what has changed after a commit).
  2. A diff export / import mechanism (ideally): from a detailed change notification, generate a diff (dict or directly JSON) which you can handle manually (uploading it to your favorite store). Importing a diff should be possible too (where the diff comes from is not important for Realm).

That way syncing is decoupled and very flexible. That way it would be possible to implement i.e. encrypted sync.

Realm could, on top of that, implement an all-in-one syncing solution. But this should be optional.
It would be nice if devs could leverage their own syncing service based on top of changes in the database!!!
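A rough sketch of the diff export / import idea, assuming the database state can be viewed as a `{id: fields}` dictionary with string IDs. `make_diff` and `apply_diff` are hypothetical helpers written for this example; Realm does not expose them.

```python
import json

_MISSING = object()  # sentinel so that a stored None still counts as a value


def make_diff(before, after):
    """Describe how to turn `before` into `after` (both are {id: fields} dicts)."""
    return {
        "upserted": {k: v for k, v in after.items() if before.get(k, _MISSING) != v},
        "deleted": sorted(k for k in before if k not in after),
    }


def apply_diff(state, diff):
    """Return a new state with the diff applied; the input is left untouched."""
    result = {k: v for k, v in state.items() if k not in diff["deleted"]}
    result.update(diff["upserted"])
    return result


# The diff is plain data, so it can be serialised and shipped through any store.
old = {"a": {"n": 1}, "b": {"n": 2}}
new = {"a": {"n": 1}, "b": {"n": 3}, "c": {"n": 4}}
wire = json.dumps(make_diff(old, new))
restored = apply_diff(old, json.loads(wire))
```

Because the serialised diff is just a string, it could be encrypted before upload, which is the decoupling the comment asks for.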

:+1::+1::+1:

:+1::+1::+1:

@davidjoan @andr3a88 : Great if you could also add a few notes about your particular needs. Thx!

Support for sync on Apple watchOS 2, with either a full copy or a subset of the Realm database on the phone, would be awesome. This would need to be bi-directional.

Would anyone be interested in working on making an adaptor for Parse + Realm?

:+1: Ensembles used in my app Grid Diary. Would like to see something like that in Realm.

@davidjoan

Initially, if it is easier, a backup/restore service using external services like Dropbox, Google Drive and iCloud.

I've been thinking about this feature for a while and here are some thoughts, getting kind of specific...

CP vs AP

Most users will want the local database to be available during partition (AP). We definitely need the database to be available to support offline usage of a synced Realm database. So that means that the principle to struggle against is going to be consistency (and go with eventual consistency).

Having CP is also possible, but that requires that the app doesn’t allow for writes to the local database, and offline mode is a key feature for most apps. But getting arbitrary clients all fully connected sounds like a nightmare so I don't know if this is even worth supporting.

With AP assumed...

The difficulty is guaranteeing SEC for offline local databases. Sync is partly so hard because there are so many different strategies that might fit some user data better than others.

I think that the best theoretical framework for this type of database model is CRDTs (https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type), because you're basically syncing a highly partitioned server system with high fault probability and long periods of partition. The phone isn't a client, it's a replica. And I apologize because the rest of this comment is going to go with my opinion that CRDTs are the right theoretical device for this tool.

But there are many different types of CRDTs, and which CRDT to use depends highly on the use case for that data. So Realm should let users decide which CRDT they want to use for their data to ensure SEC. It would be great if the framework could support arbitrary C(m|v)RDTs for arbitrary Realm syncing, since both of these are defined by a few simple operations that a server could arbitrarily run (like map reduce, in some ways), but that’s maybe too complex for the average user. At the very least there should be support for arbitrary merge on detected conflicts -- and some way to escalate to the human user if a merge shouldn't be automatic. But there are some very common patterns that most Realm users will want.

Common CRDTs that should be built-in

  • Removing/adding objects to/from a set. This is a really big one for syncing — adding and removing from local Lists to get relationships. Some databases have opinions what users want (like Riak), but it’d be nice if there were options here.

    • Add only (state based grow only set) — smaller data footprint

    • Add, remove with precedence for add or remove (AWORSet, RWORSet with improvements for size of set)

    • Union/intersect of sets (MVRegister) (classic amazon cart example)

    • Add remove with timestamp (LWW) (looks great but timestamps can be problematic because of the different notions of time across clients)

  • Changing a timestamped object or object field where last timestamp wins. This should either be a user-specified timestamp OR a system defined timestamp. LWW again.
  • Edit: Forgot this one! Version controlled object, rather than timestamp controlled. This uses dotted vector clocks (http://haslab.uminho.pt/tome/files/dvvset-dais.pdf).
  • Counters. Not a Realm data type, but this really should be supported.
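To make the set-based options in the list above more concrete, here is a minimal sketch of a state-based, add-wins observed-remove set. It is the simple tombstone variant, so it does not include the size optimisations mentioned for AWORSet, and the class and method names are chosen for this example only.

```python
import uuid


class ORSet:
    """State-based observed-remove set where a concurrent add beats a remove."""

    def __init__(self):
        self.adds = set()     # (element, unique tag) pairs
        self.removes = set()  # tombstoned (element, tag) pairs

    def add(self, element):
        # Every add gets a fresh tag, so a later remove cannot cancel it blindly.
        self.adds.add((element, uuid.uuid4().hex))

    def remove(self, element):
        # Tombstone only the tags this replica has already observed.
        self.removes |= {pair for pair in self.adds if pair[0] == element}

    def values(self):
        return {element for (element, _tag) in self.adds - self.removes}

    def merge(self, other):
        # Set union is commutative, associative and idempotent, so replicas converge.
        self.adds |= other.adds
        self.removes |= other.removes
```

If one device removes an item while another device re-adds it offline, the re-add carries a tag the remove never saw, so the item survives the merge on both sides.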

CRDTs that might be built-in

  • Changing a string on the character level (i.e. text is treated as a linked list of characters rather than a single object). This again has a bunch of known CRDTs depending on the types of string edits, thanks largely to research on documentation collaboration. Treedoc (https://hal.inria.fr/inria-00445975/document) and Logoot (https://hal.inria.fr/inria-00336191/PDF/main.pdf) are commonly cited. This would be really awesome, but is also obviously complex.
  • Changing arbitrary numerical values. This covers bool, int8, int32, int64, double, float, and dates. This one I think has a CRDT for arbitrary add/subtract — and can take advantage of the fact that to the application layer, it looks like there are only two servers, local and remote. This also might be hard to scale… depending on the number of clients. This is different from timestamp update because some operations might be too fast to be able to trust local notions of time. Devices vary in how accurate their timestamps are to true time, so it might be better to have an arbitrary and guaranteed way to change numerical values without timestamps.
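For integer values, the add/subtract case above can be handled without timestamps by a PN-counter. The sketch below is a standard state-based version; the replica IDs are assumed to be unique per device, and non-integer types such as floats or dates are not covered.

```python
class PNCounter:
    """State-based counter that supports increments and decrements."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.incs = {}  # replica id -> total amount added by that replica
        self.decs = {}  # replica id -> total amount subtracted by that replica

    def add(self, amount):
        target = self.incs if amount >= 0 else self.decs
        target[self.replica_id] = target.get(self.replica_id, 0) + abs(amount)

    def value(self):
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other):
        # Each replica's totals only grow, so the per-replica maximum is safe.
        for mine, theirs in ((self.incs, other.incs), (self.decs, other.decs)):
            for replica_id, total in theirs.items():
                mine[replica_id] = max(mine.get(replica_id, 0), total)
```

Each replica only ever changes its own entries, so merging in any order, any number of times, gives the same value. The state grows with the number of replicas, which is the scaling concern raised in the comment.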

CRDTs that would be awesome

  • Invariants on numerical data. Wallet applications are the biggest example here… There’s a paper on this from this year but it’s obviously non-trivial (http://arxiv.org/pdf/1503.09052.pdf) and an active area of research.

User defined merging/CRDT

  • Users should be able to define their own merge strategies between two objects if they so choose. Support for user CRDT could lead to some neat user cases.

Backend

I know others have mentioned that they would like to support arbitrary backends, and this has the potential to be a bottleneck. So a layer on the backend has to exist that listens and directs the various messages from clients (the CRDT update packets depending on the type of CRDT chosen for that field) to all known connected clients but also to connected backend replicas, and these backend replicas need to be able to do similar types of merges as the client code because of SEC. MySQL frameworks therefore have the potential to be a real hassle. It might make sense to build a layer on top of the NoSQL flavor of your choice to help users who don't want to do this by themselves. The only NoSQL database I know that has CRDTs is Riak, but all support some flavor of SEC.

Coordination

Another really non-trivial feature of the backend is that it has to coordinate the various clients. This should probably be decentralized, and that means each decentralized coordinator needs to be somewhat of a load balancer (know which clients are connected and where).

Also clients can only synchronize with other clients that have the same schema version, which might be a pain for migrating data in the field (something that is not supported I think?).

The coordinator will also have to keep track of all the past changes so that when a client reconnects it can send the appropriate changes to the client. These could possibly be consolidated but a major problem is knowing how long to wait for a client to reconnect before declaring them lost. Otherwise, coordinators have to keep track of all past changes indefinitely. This is more or less the question of when to rebase. More recent changes could be cached for users that expect less latency as they are actively connected, whereas users transitioning out of offline mode might expect more latency as the coordinator gets all the past changes to update the client on and do merging.
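The bookkeeping described in this paragraph can be sketched as a change log with a per-client cursor. Everything here is hypothetical (`ChangeLog`, its method names, and the simplification that a change counts as delivered as soon as it is pulled); it only illustrates why a client that never reconnects blocks compaction until it is declared lost.

```python
class ChangeLog:
    """Coordinator-side log of changes with one read cursor per client."""

    def __init__(self):
        self.entries = []  # retained changes, oldest first
        self.offset = 0    # sequence number of entries[0]
        self.acked = {}    # client id -> next sequence number it needs

    def register(self, client_id):
        self.acked[client_id] = self.offset + len(self.entries)

    def append(self, change):
        self.entries.append(change)

    def pull(self, client_id):
        """Return the changes the client has not seen and advance its cursor."""
        start = self.acked[client_id] - self.offset
        pending = self.entries[start:]
        self.acked[client_id] = self.offset + len(self.entries)
        return pending

    def forget(self, client_id):
        """Declare a client lost so it no longer holds back compaction."""
        self.acked.pop(client_id, None)

    def compact(self):
        """Drop entries every known client has received; return how many."""
        end = self.offset + len(self.entries)
        keep_from = min(self.acked.values()) if self.acked else end
        dropped = keep_from - self.offset
        self.entries = self.entries[dropped:]
        self.offset = keep_from
        return dropped
```

The open question from the comment is the policy behind `forget`: call it too early and a returning client needs a full resync, call it too late and the log grows without bound.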

Those are my thoughts/expectations for a feature like this... sorry it's longer than your suggested "a few paragraphs" :fearful: Happy to give more feedback if some part of this is unclear.

Great points @tom-sparo

At the very least there should be support for arbitrary merge on detected conflicts -- and some way to escalate to the human user if a merge shouldn't be automatic.
Yes this.

Interesting reading re: CRDTs. IMO, I don't think many app developers would want to opt for CmRDTs because of all the connection-based stuff. (A user might not open an app for a month or more and still expect it to work).

Dan Grover's sync presentation from '09 has been invaluable to me for getting my head around sync. See here (although this is only one of many possible CRDTs AFAICT): http://stackoverflow.com/questions/5035132/how-to-sync-iphone-core-data-with-web-server-and-then-push-to-other-devices/5164878#5164878 (the link is broken, but someone has found it again on the waybackmachine).

Thanks everyone for chiming in here, we really appreciate hearing how you could use sync.

We may not be posting here often, but we certainly are listening!

Interesting reading re: CRDTs. IMO, I don't think many app developers would want to opt for CmRDTs because of all the connection-based stuff. (A user might not open an app for a month or more and still expect it to work).

I may be misunderstanding what you're saying here, but if you're saying that CmRDTs are difficult because they need the protocol to guarantee that they aren't delivered twice or more times, this type of protocol guarantee is not too new or hard to do I THINK (never done it myself) -- but it's true it could potentially add latency for handshakes or to implement a queue system.
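One common way to get the "not applied twice" guarantee is to give every operation a unique ID and have each replica remember which IDs it has already applied. The sketch below assumes that approach; `CounterReplica` is a made-up example type, and a real implementation would also need to bound the size of the `seen` set.

```python
import uuid


class CounterReplica:
    """Operation-based counter that ignores duplicate deliveries."""

    def __init__(self):
        self.value = 0
        self.seen = set()  # IDs of operations that were already applied

    def local_increment(self, amount):
        op = {"id": uuid.uuid4().hex, "amount": amount}
        self.apply(op)
        return op  # the caller would broadcast this to the other replicas

    def apply(self, op):
        """Apply an operation at most once, however often it is delivered."""
        if op["id"] in self.seen:
            return False
        self.seen.add(op["id"])
        self.value += op["amount"]
        return True
```

With this in place the transport only has to deliver each operation at least once, which retries and a queue can provide.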

If you're saying that there's a period of time after being disconnected that a local database will stop working, I think there's no theoretical limit on how long a partition can last (though there may be practical ones) -- meaning that if an app becomes disconnected, because a CRDT guarantees local availability by using eventual consistency rather than strong consistency, all writes and reads to the local database are guaranteed to succeed... and when the app reconnects the replica will become consistent with all other replicas, assuming that the commutative state changes of the other replicas (i.e. users on other phones) have been stored for when the replica comes back online (this requires a queue). So an app's local database is ALWAYS available for read/write no matter its connected/disconnected/partially connected state.

The trick to all this is treating the local database on the phone not as a client that needs to be connected but as a replica of a database itself. This is a shift in model -- we're used to treating phones as clients and databases as something that only exists server side. In this new model, both clients and servers have replica databases and all need to be in sync. This model is becoming more common though -- Meteor for instance has minimongo instances deployed to clients that need to be synced via DDP (but protocol and merge strategies are all LWW at the moment... which is severely lacking for things like counters where it's possible and likely that increments or decrements will be entirely lost on sync... and I've seen that they have double-write problems).
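The lost-increment problem with LWW is easy to reproduce in a few lines. This is an illustrative sketch only: both helper functions are hypothetical, and the delta-based merge assumes the two replicas still know the shared base value they started from.

```python
def lww_merge(a, b):
    """Each side is a (timestamp, value) pair; keep the newer one."""
    return a if a[0] >= b[0] else b


def delta_merge(base, a, b):
    """Apply both replicas' changes relative to the shared base value."""
    return base + (a - base) + (b - base)


# Two devices start from 5 and each increments once while offline.
base = 5
phone = (101, base + 1)
tablet = (102, base + 1)
lww_result = lww_merge(phone, tablet)               # newer write wins, one increment lost
delta_result = delta_merge(base, phone[1], tablet[1])  # both increments are kept
```

With last-write-wins the merged counter stays at 6, while merging the deltas yields 7.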

The real challenge with CRDTs in my opinion is that they're theoretical and have non-trivial bookkeeping (vector clocks) and can be confusing for people used to simple merge semantics that rely on state alone. The benefit of CRDTs is that if you build an app using entirely CRDTs for fields you can have guaranteed SEC without having to write a line of merge code and without having to worry about conflicts at all. Even if a phone disconnects for years (assuming the engineering on the server side is set up properly) that phone can sync seamlessly and without user input on reconnect. So having that theoretical guarantee can be really worth the extra engineering.
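As a small example of the vector-clock bookkeeping mentioned above, the sketch below represents a clock as a `{replica id: counter}` dictionary. The function names are chosen for this example; they show how two versions can be classified as ordered or concurrent without relying on wall-clock time.

```python
def tick(clock, replica_id):
    """Return a copy of the clock with this replica's counter advanced by one."""
    new_clock = dict(clock)
    new_clock[replica_id] = new_clock.get(replica_id, 0) + 1
    return new_clock


def merge(a, b):
    """Combine two clocks by taking the per-replica maximum."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}


def compare(a, b):
    """Return 'equal', 'before', 'after' or 'concurrent' for clocks a and b."""
    keys = set(a) | set(b)
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"
```

Only the "concurrent" case is a real conflict that needs a merge rule; "before" and "after" can be resolved by simply taking the later version.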

But again, just a personal opinion as a potential sync user and a parallel database research fan!

👍🏻

Lack of syncing is currently the only thing preventing me from switching from Couchbase Lite to Realm.

My scenario (with Couchbase Lite): On client side, I create separate database for each user account; on server side I use one data bucket behind Sync Gateway, mapping each user's data to one channel.

My biggest pain with Couchbase's way of syncing is that I have to attach a "channel" property to each document to route it to a different channel. IMHO, channels are only required by the syncing logic and shouldn't interfere with the document's schema.

I hope realm could come up with some better design to handle multi-user scenario.

@tom-sparo

If you're saying that there's a period of time after being disconnected that a local database will stop working, I think there's no theoretical limit on a partition time disconnected (though there may be practical ones) -- meaning that if an app becomes disconnected, because CRDT guarantees local availability by using eventual consistency rather than strong consistency, all writes and reads to the local database are guaranteed to succeed... and when the app reconnects the replica will become consistent with all other replicas assuming that the other replicas (i.e. users on other phones) commutative state changes have been stored for when the replica comes back online (this requires a queue). So an app's local database is ALWAYS available for read/write no matter it's connected/disconnected/partial connected state.

I was referring to your previous para:

The coordinator will also have to keep track of all the past changes so that when a client reconnects it can send the appropriate changes to the client. These could possibly be consolidated but a major problem is knowing how long to wait for a client to reconnect before declaring them lost. Otherwise, coordinators have to keep track of all past changes indefinitely. This is more or less the question of when to rebase. More recent changes could be cached for users that expect less latency as they are actively connected, whereas users transitioning out of offline mode might expect more latency as the coordinator gets all the past changes to update the client on and do merging.

As I understand it, this connection-based approach is required for CmRDT (not that I really understand what a CRDT is due to lack of concrete examples) because it replays diffs and, as you mentioned, changes need to be kept around indefinitely. I can't see this being a good solution for most mobile apps...

@ubertao Please contact me because I am also using Couchbase Lite for syncing. I wanted to use Realm, but since it didn't have syncing, I looked for another solution and found Couchbase Lite. I also have multi-user requirements and I would like to discuss how you're doing it in more detail. Sorry for using this post, but you didn't have any contact information on your GitHub page.

:+1:

@everyone, I highly suggest you check out Firebase if you’re waiting for syncing. It’s realtime and working with the API is super easy (just like Realm) - http://www.firebase.com


The use case I hope to use Realm sync for is app-centric. It's not about syncing between server and clients, but about one or more clients syncing data with some kind of cloud storage (Dropbox, iCloud, a shared network drive, etc.). Like Core Data with iCloud, or Ensembles, or YapDatabase with CloudKit sync.

If I need client server solution, I can use Firebase, Parse, or hosting CouchDB myself. What I see missing here is a cross platform, backend independent, database syncing solution between apps.

@siuying 👍👍👍 this is so true!!! You nailed it down. Exactly that kind of thing I am dreaming about and I think realm is best for!

Agree with @siuying .
Please people at Realm give us some (hopefully good) news...

Tim from Realm here. We’re reading this thread with great interest and hopefully will be able to share some news soon, but it’s too early for us to say anything at this point. These use-case descriptions are very useful to help us refine our thinking, so please keep them coming.

I also agree with @siuying. I have a couple projects that are a great fit for Parse, but I'm planning a personal finance app for January which syncs between a user's iOS devices and Mac with CloudKit as the backend (i.e. personal sync without managing a server, less security risk for user & developer). I'm planning to use Realm for the local database, probably with https://github.com/BellAppLab/RealmCloudKit, but official support for CloudKit would be tremendous and it would help a lot of iOS developers to leave CoreData behind. Also, maintaining encryption support with the sync mechanism would be ideal.

:+1:

I'm specc'ing out a project where experts bring their iOS devices into the field (where connectivity is sometimes challenging) and take measurements, photos, etc., which automatically sync with a server (via a REST-like API) when connectivity is restored. Several experts can work on the same data simultaneously, so peer-to-peer sync could be nice in some remote cases (when the server is unreachable).

I'm looking at RestKit combined with Core Data, but 1) I can't get that to compile right now (some cocoapods/restkit/xcode 7 issues) and 2) it looks overly complex for this task.
For now I guess we'll roll our own syncing mechanism but would love to see it appear in Realm.

@basvk if you are looking for p2p sync, you might want to check couchbaselite/pouchdb

I would also love to see a p2p syncing concept like Ensembles (www.ensembles.io) for realm.

My use case is similar to @siuying . I am developing an android-only app, and would like to sync the realm db via the user's Google Drive app folder.

👍

:+1:

:+1:

👍

In light of Parse's announcement to shut down in a year, now might be a good time to share some details regarding your plans for sync. There are a lot of developers treading water in the sync market and it could benefit Realm adoption greatly if you made a public statement.

I second @blwinters statement.

+1 @blwinters

+1 @blwinters

I totally agree with @blwinters last comment. Given that Parse's service will shut down, I think you guys have a huge opportunity that you can't simply miss.

+1 @blwinters

Hi there! We were all very saddened to hear the news about Parse (they were batch mates with our founders in YCombinator Summer 2011). In general we don’t believe in raising your hopes with something that just isn’t done; we promise to only make announcements when products are ready.

In the meantime, sharing specific use-case details here (in addition to :+1:’s) is the best thing you can do to help us make progress towards sync in Realm. Many of us at Realm are monitoring this thread, and product feedback on GitHub has had direct impact in helping us shape Realm many times already. We want to build the most helpful solution, not the fastest one, and your comments really help.

For my use case, I'd love to see some way to pack up and send updates to a server with as little data marshaling overhead as possible. The ability to divide data up into different synchronization "zones" would be useful, although multiple Realms might work. Support for building peer-to-peer synchronization would be interesting. It would also be great to see a hosted service like Parse or at least the ability for third-party hosted services to support Realm. Even just primitives such as exposing a transaction log to help roll our own synchronization solution would be great.

I just recently learned about Realm and have not spent much time playing with it yet other than reading the docs. We have been seeking a great mobile + sync database for a few years. We built a prototype using Firebase and have also looked into Couchbase, but both of these solutions don't fit quite right.

Here's our use case:

  • Offline-first. Everything works as expected with no Internet connection. When a connection becomes available, sync continues.
  • Offline-only mode. For example if a user is not a subscriber, or the user simply does not want their data in the cloud (e.g. schools), then the database only works offline with no syncing.
  • Simply write to the local database and be notified of changes made to the local database while the database is syncing in the background (no explicit sync logic, other than to set it up)
  • Real-time sync. This could be done through a server if fast enough, or possibly peer-to-peer if available. Should sync the minimum amount of data (deltas) as quickly as possible.
  • Ability to share data between users (both as read-only and read-write)
  • Conflict management (whether from a single user with multiple devices or multiple users editing the same data)
  • Sync using native data formats such as Protobuf or Flatbuffers (not just JSON)
  • Backend that can be self-hosted anywhere
  • Data de-duplication for binary blobs such as images or PDFs. It would be great to have this both on mobile and in the backend. For example, if a user has multiple references to the same image, the image should only be stored once. For the backend, this could be cross-user as well.
  • Revision support. Ability to roll back to previous versions of data (such as revision history in Google Docs). Most likely the mobile database only needs the latest revision and would require an Internet connection to fetch previous revisions from the backend.
  • Android, Windows, iOS, (and Chrome/web?) clients
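The de-duplication item in the list above is often done with content addressing: a blob is stored under the hash of its bytes, so identical images or PDFs share one copy. The sketch below assumes SHA-256 and an in-memory dictionary; `BlobStore` and its reference counting are hypothetical and only show the idea.

```python
import hashlib


class BlobStore:
    """Content-addressed store: identical bytes are kept only once."""

    def __init__(self):
        self.blobs = {}     # SHA-256 hex digest -> bytes
        self.refcount = {}  # SHA-256 hex digest -> number of references

    def put(self, data):
        key = hashlib.sha256(data).hexdigest()
        if key not in self.blobs:
            self.blobs[key] = data
        self.refcount[key] = self.refcount.get(key, 0) + 1
        return key  # objects in the database store this key, not the bytes

    def get(self, key):
        return self.blobs[key]

    def release(self, key):
        """Drop one reference and delete the blob when nobody uses it anymore."""
        self.refcount[key] -= 1
        if self.refcount[key] == 0:
            del self.refcount[key]
            del self.blobs[key]
```

Since the key depends only on the content, a client could also ask the backend whether a digest already exists before uploading, which would give the cross-user de-duplication mentioned in the list.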

Here are our main concerns with Firebase:

  • Does not appear to work "offline first" or have an "offline-only" mode. The local datastore appears to be more of a cache than a real database.
  • Syncs using JSON
  • Does not offer a self-hosted option
  • The mobile database does not appear to be as nice/efficient as Realm

Here are our main concerns with Couchbase (I haven't done any first-hand testing):

  • Couchbase Lite for Android currently uses SQLite under the hood, although they appear to have recently fixed the 2MB storage limit issue
  • Couchbase is a JSON document store. Same concern as with Firebase. You can provide binary attachments, but I remember this not appearing to work well for our use case (maybe in regards to managing conflicts).
  • A Couchbase rep I spoke with assured me that it could be used for the real-time use cases I presented, but he said it wasn't specifically designed for that.
  • As an Android developer, using Couchbase Lite does not look particularly appealing. It appears to be a lot of getting/setting of plain Objects using String keys. Rather than, for example, annotating a POJO that is automatically de/serialized.

As an interim solution, it might be nice to use Realm as the mobile database for Firebase. I briefly started looking into this, but would appreciate hearing anyone else's thoughts on this.


My Use Case:

A. The app is location based using iBeacons and queries will be based on the iBeacons within range of the device and user information will be based on the Major and Minor values of the iBeacon
B. App will have 3 groups of users
--1. Group 1 will be able to access their own information and be able to edit SOME of it
--2. Group 2 will be able to access and edit Group 1's information and add and edit information to it
--3. Group 3 will be able to access Group 1's information but not add or edit
C. All groups will access and add/edit information from mobile devices and computers
D. The database needs to be updated constantly but would also be helpful if it could be cached to mobile devices for offline use with the most recent data available

:+1:

Another use case here:
Product: Multiplatform (iOS and Android) instant messaging app
Requirements - expectations:

  • Users can log in to their accounts on multiple devices.
  • All chats on all devices should sync when they have internet connection so that all chats and messages are available on all devices.
  • When user logs in to a new device, new device should be able to download the latest snapshot of a Realm (like a bulk update) and then continue its operations using the up to date Realm.

:+1:

👍🏻

:+1:

For my humble needs it would be great to have a backup of the data that could be tied simply to a user's account, so if the user logs in to the app from a new phone or different device, it would just bring it down automatically. Syncing changes across devices would be an absolutely huge win as well.

Operating offline locally is the primary utility of Realm at the moment, though.

@timanglade Any news about sync feature?

:+1:

:+1:

Anybody have any experience with the IBM solution?

Offline-First iOS Apps with Swift & Cloudant Sync; Part 1: The Datastore

https://developer.ibm.com/clouddataservices/2016/01/25/start-developing-ios-apps-swift-with-cloud-sync-part-1-the-datastore/

@oyalhi I'm using Cloudant, but with CouchbaseLite on iOS and Mac. Works great.

@brendand May I ask why you chose CouchbaseLite rather than Cloudant for database services? I am looking for an offline-first database and need cloud sync as well. CouchbaseLite does not seem to be offline-first? Realm would be a good choice once they complete sync; until then, however, I am looking for the right approach.

Both Cloudant and CouchbaseLite are based on CouchDB, which is an offline-first database. Cloudant does have many extra features, including fine-grained authorization, full-text indexing and hosted services. They are very different from a regular RDBMS and from Realm, so you might need to rethink whole apps around it.

@siuying I am just learning more about NoSQL. It is an interesting topic and I might just choose to go that way. I am still in the evaluation and getting-up-to-speed phase. Thanks for the info, I will look more into CouchbaseLite. The most important aspect of my use case is offline-first, with online sync if and when I want it (RDBMS or NoSQL both seem fine).

@oyalhi Right. CouchbaseLite is definitely offline first. One of the beauties of a NoSQL / JSON document based engine is you'll never have to worry about database schema migrations again. Just add new properties to your model objects and be done with it. Let the engine take care of it for you. Syncing is really nice even with peer-to-peer networks and IBM Cloudant. I didn't use the Cloudant iOS SDK because I wanted to be able to offer my customers the ability to host their own Couchbase Servers for a workgroup setting. Plus at the time I picked CouchbaseLite, I didn't know about Cloudant and didn't know they had an iOS SDK. If you'd like to continue this discussion off this thread, just email me at brendand a-t gmail dot com.

:+1: Sync would be killer.
:+1: To those who say to implement an Ensembles-like solution. Essentially each client has their own full copy of the DB, and what's stored in the cloud (or on a NAS or whatever) is a mechanism of keeping the client copies in sync with each other. Ensembles uses a commonly-accessed timeline of "truth" that each client uses to keep its own DB in sync with the baseline. Changes are "played" on each client, and occasional baselining keeps things fairly efficient. This approach is nice because it's a "serverless" sync option, in addition to being effectively backend agnostic. Where the "truth" is stored doesn't matter, so it can be a local network filesystem, a Node.js server, Dropbox, iCloud, or whatever.

My use case would be multiple clients accessing shared data on a NAS filesystem. They may go offline and work independently, then come back and need to sync their changes with everyone else's.
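
To make the "shared timeline plus occasional baselining" idea above concrete, here is a rough sketch. It is not how Ensembles is implemented; `SharedTimeline`, `Client`, and every method name are made up for illustration, and the "cloud" is just an in-memory object that could equally be a folder on a NAS.

```python
def apply_events(state, events):
    """Replay (key, value) events onto a dict; a value of None deletes the key."""
    for key, value in events:
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value


class SharedTimeline:
    """Hypothetical shared "truth": a baseline snapshot plus later events."""

    def __init__(self):
        self.baseline = {}       # state after the first baseline_index events
        self.baseline_index = 0  # number of events folded into the baseline
        self.events = []         # events recorded since the baseline

    def head(self):
        return self.baseline_index + len(self.events)

    def compact(self):
        """Baselining: fold all events into the snapshot to keep the log short."""
        apply_events(self.baseline, self.events)
        self.baseline_index += len(self.events)
        self.events = []


class Client:
    """Each client keeps a full local copy and works offline between syncs."""

    def __init__(self, timeline):
        self.timeline = timeline
        self.db = {}
        self.outbox = []  # local changes not yet pushed
        self.cursor = 0   # how many timeline events have been applied locally

    def write(self, key, value):
        self.db[key] = value
        self.outbox.append((key, value))

    def sync(self):
        t = self.timeline
        t.events.extend(self.outbox)  # push local changes
        self.outbox = []
        if self.cursor < t.baseline_index:
            # The events we missed were compacted away: restart from the baseline.
            self.db = dict(t.baseline)
            self.cursor = t.baseline_index
        # Play every event we have not seen yet, in timeline order.
        apply_events(self.db, t.events[self.cursor - t.baseline_index:])
        self.cursor = t.head()
```

In this sketch the order of events on the timeline decides the outcome of concurrent edits, so it is effectively "last pushed wins"; anything smarter would need a real conflict resolution step.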

Most business operations have critical data on their operations stored in databases. An employee user then creates a "view" of the data that relates specifically to their job function. Thus we "poll" the data to see how we are doing. With a Server based realm we would only have to poll once and then whenever we wanted we could look at the results and see how things are. A dynamic view would be huge.

We could be notified if the database changed in a way that would affect our view, exceed a target and the like. Instead of us looking at the data at our convenience, our apps could watch all the time and get us involved if there is something we need to decide on, like bringing in more help if shipping gets behind, etc.

Hi! Here is my use case: I'm building apps with editorial content and I want my clients, who travel a lot, to be able to read content on the train or plane. Realm seems ideal, on the grounds that once they get a connection, it can always keep the existing content updated and fetch and store new content. I believe there is nothing very original here, nor anything I couldn't do on my own. Just looking for a shortcut.
Integrations with Firebase, Parse, or Meteor would be great.

👍

👍

👍

👍

👍🏼

This might be somewhat orthogonal to the goal of this feature, but perhaps a good first step and useful extension would be something similar to NSIncrementalStore that allows us to fetch results remotely and populate a local realm as needed.

Hi,

I have coded with Meteor, working through tutorials etc., and am really pleased to see that it has been integrated with React Native so I don't have to use Cordova.

May I ask if it is possible to sync/link data from the Meteor MongoDB backend to an offline Realm db, for full functionality in apps when in offline mode? For instance, by adding functionality to, or integrating with, the react-native-meteor npm plugin. This plugin provides very easy-to-use DDP integration, and Meteor in general is a lot easier to work with, though it does not provide a native environment unless combined with React Native.

https://www.npmjs.com/package/react-native-meteor

If so, do you have any suggestions for how this may be accomplished, either now or in future features of realm-react-native?

Looking at https://realm.io/docs/react-native/latest/, this seems to be the way things are going with React Native. Also, the Meteor team are saying that they will be integrating further with React going forward, so it would be a useful step to add the Realm 'connection' into this plugin rather than build a completely new solution. I would appreciate any advice or feedback you may give.

Thanks very much

I use Xamarin for my mobile development, and since I recently saw that Realm supports Xamarin, I started to investigate it. It looks like a great improvement in most use cases; however, just like most of you, we need to support offline sync capabilities for a few of our apps. I see all the familiar players in previous comments, i.e. Couchbase, Parse, Firebase, etc. However, I have not seen any mention of Azure Mobile App (formerly Azure Mobile Services). I can tell you this is what we use and it has been a great fit. What's interesting about Azure Mobile App's solution is that it can use virtually any local store for device-side persistence. It comes with a SQLite implementation, but I believe it could be replaced with a Realm version in a fairly straightforward way (I have limited knowledge of Realm at this point, so I could be mistaken). I mention this as a possible "hybrid" or stopgap solution for offline/sync while still using Realm on the device.

Anyone else have experience with Azure Mobile App using Realm? Any thoughts? Am I crazy?

They actually mention this implementation in their video....see link

https://realm.io/news/altconf-chris-anderson-making-your-mobile-backend-work-for-you/

That video seems to be an overview on Azure Mobile App services which I am well versed in - I was looking for feedback on using Realm with Azure Mobile App specifically...

Hey Everyone, we've made a first attempt at the syncing problem. We have a small library to handle the syncing, packaged in a node module.

Our solution is to wrap the create, update, and delete realm methods to provide additional logic for our syncing algorithm. It adds unique ID's to realm objects that need to sync and puts changes in a schema-less queue that can be synced remotely and resolved when a device needs to sync.

Again, It's just a start, but I think it can be helpful to anyone looking to get syncing to work in their apps. Would love suggestions.

https://github.com/KiteSync/realm-sync-js
https://www.npmjs.com/package/realm-sync-js
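
For readers who want a feel for the "wrap create, update, and delete" idea described above, here is a small sketch of the general pattern. It is not the realm-sync-js API: `SyncingStore`, `apply_changes`, and the record layout are all assumptions made for illustration, and a plain dict stands in for the local Realm.

```python
import uuid


class SyncingStore:
    """Hypothetical wrapper: every write also lands in a schema-less queue."""

    def __init__(self):
        self.objects = {}  # sync id -> properties (stand-in for the local db)
        self.queue = []    # change records waiting to be sent to a remote

    def create(self, type_name, properties):
        sync_id = str(uuid.uuid4())  # unique id so other devices can match it
        self.objects[sync_id] = {"type": type_name, **properties}
        self.queue.append({"op": "create", "id": sync_id,
                           "type": type_name, "body": dict(properties)})
        return sync_id

    def update(self, sync_id, changes):
        self.objects[sync_id].update(changes)
        self.queue.append({"op": "update", "id": sync_id, "body": dict(changes)})

    def delete(self, sync_id):
        del self.objects[sync_id]
        self.queue.append({"op": "delete", "id": sync_id})

    def drain(self):
        """Hand over the pending changes and start a fresh queue."""
        pending, self.queue = self.queue, []
        return pending


def apply_changes(store, changes):
    """Replay another device's changes without queuing them again."""
    for change in changes:
        if change["op"] == "create":
            store.objects[change["id"]] = {"type": change["type"], **change["body"]}
        elif change["op"] == "update":
            store.objects[change["id"]].update(change["body"])
        elif change["op"] == "delete":
            store.objects.pop(change["id"], None)
```

The sketch replays changes in the order they were queued and does no conflict resolution; that is the part where a real library has to make decisions.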

Our one-line use case: we write apps that work in sh*thole internet connectivity zones. We need offline-first applications, and the application should never have to worry about how to get its data back to a server-side component. Our current roll-your-own sync approach took a very long time, and I don't want to repeat the process again for another application. Thank you

I've written several temporal databases in the past and one important thing with general purpose sync feature is conflict resolution strategy. Each use case will have different business rules that define the resolution strategy, which means a general purpose solution has to provide a pluggable API. This way, the developer can implement their business logic and configure the database to use it. What I did with my temporal database is to provide 2 built-in resolution strategies and expose the API. If the built-in works for them, they don't need to write one. If their needs require additional business logic, the developer implements it and configures the database to use it.
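
A pluggable resolution API of the kind described above might look roughly like this. It is a sketch only: the comment does not say which two strategies were built in, so `last_write_wins` and `remote_wins` are my assumptions, and `SyncConfig`, `Version`, and `keep_lower_quantity` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Version:
    value: dict
    timestamp: float


# A resolver takes the local and remote versions and returns the winner.
Resolver = Callable[[Version, Version], Version]


def last_write_wins(local: Version, remote: Version) -> Version:
    """Built-in strategy (assumed): the newer timestamp wins, ties stay local."""
    return local if local.timestamp >= remote.timestamp else remote


def remote_wins(local: Version, remote: Version) -> Version:
    """Built-in strategy (assumed): the incoming version always wins."""
    return remote


class SyncConfig:
    """Hypothetical configuration object that picks a resolver per type."""

    def __init__(self, default: Resolver = last_write_wins):
        self.default = default
        self.per_type: Dict[str, Resolver] = {}

    def register(self, type_name: str, resolver: Resolver) -> None:
        self.per_type[type_name] = resolver

    def resolve(self, type_name: str, local: Version, remote: Version) -> Version:
        return self.per_type.get(type_name, self.default)(local, remote)


def keep_lower_quantity(local: Version, remote: Version) -> Version:
    """Example business rule: never overstate stock, so keep the lower count."""
    quantity = min(local.value["quantity"], remote.value["quantity"])
    return Version({"quantity": quantity}, max(local.timestamp, remote.timestamp))
```

A developer whose needs are covered by the built-ins writes nothing; one with domain rules registers a function such as `keep_lower_quantity` for the affected type.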

I'm against the majority; I think Realm should not have a sync feature and should instead focus on being a standalone database that can be used with any network or database solution.

The synchronization portion should be provided in 3rd party libraries, adapters that sit between the database and network layer in a mobile app. Something like feathers-ios (https://github.com/feathersjs/feathers-ios) that abstracts over both the networking process and local storage. That adapter can then leverage the power of Realm on each individual platform and offer plugins (such as socket support) for varying needs. That way Realm remains smaller and the syncing functionality exists separately.

Does anyone else feel that way? I understand people want Firebase/Parse-esque solutions but Realm isn't really the same since it exists as a standalone component for the platform in question.

I agree with Brendan on his comments. Realm is a wonderful standalone database and should continue being improved as such.

Syncing can be provided in 3rd party libraries or as a separate framework / service from Realm.


👍

Cool

@leotumwattana How would you go about handling syncing with a separate framework? The best solution for syncing is to sync only changed data like CoreData. Currently the only way to sync over iCloud is to sync the entire DB but that can cause issues where items are added on multiple devices.

@anthonycastelli Apple's CloudKit sounds like a possible reference? @mikejonas KiteSync solution looks interesting too.

I agree, we shouldn't try to sync the entire DB file. Too many issues there for corruption, conflicts, waste of network bandwidth and what not.

Correct me if I'm wrong, iCloud supports file syncing, but that's really meant for documents and not really for structured data such as Realm?

Apple tried the whole "let CoreData magically sync your stuff for you" method. It sounds great on paper, but it's such a black box. As @woolfel mentioned, conflict resolution is important and very domain specific.

What I'm thinking is maybe keeping Realm focused on persistence and letting another framework / service provide the transmission, cloud persistence and conflict resolution might be a better thing in the long run?

Cheers

Having sync be a separate library might sound simpler, but it really isn't. It trades one set of problems for another. At its core, sync between copies has the same technical problems as distributed databases that employ eventual consistency. It's actually worse, since the delta between any two copies could be large, storage space is limited, the network is unreliable, and the time between syncs isn't guaranteed. Distributed databases like MongoDB, Riak, Cassandra, HBase, etc. all try to sync regularly to reduce dirty reads and performance bottlenecks and to improve responsiveness.

To do sync properly, there are only two reliable approaches I am aware of. The first is to keep a commit log in which each record has a value hash. The second is to use a temporal database that supports branching. The commit log approach works well for use cases where Last Write Wins is a good fit. For conflict resolution, a temporal database is a better fit. Think about how people fork and merge with Git. Any time a person wants to create a pull request on an active project, they have to rebase and manually merge the conflicts. Instead of files, we have rows that may have conflicts.

Having Realm support it natively means fewer people have to reinvent a commit log, and it makes it easier to repeat. On the modeling side, there could be another base class like SyncObject that has the necessary fields. The persistence layer can easily see if the data needs a commit log entry and handle that. Just some things to consider.
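
As a rough illustration of the first approach (commit log plus a value hash per record, resolved with Last Write Wins), consider the sketch below. Everything in it is an assumption for illustration: `CommitLogStore`, the entry layout, and the use of a Lamport-style counter paired with a replica id as the "write time". The hash is only used to notice cheaply that two replicas hold different values for a key.

```python
import hashlib
import json


def value_hash(value):
    """Stable hash of a JSON-serializable value."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class CommitLogStore:
    """Hypothetical replica: current records plus an append-only commit log."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.clock = 0     # Lamport-style counter, bumped on every local write
        self.records = {}  # key -> winning log entry
        self.log = []      # every entry this replica has accepted

    def put(self, key, value):
        self.clock += 1
        entry = {"key": key, "value": value, "hash": value_hash(value),
                 "stamp": (self.clock, self.replica_id)}
        self.records[key] = entry
        self.log.append(entry)
        return entry

    def merge(self, entries):
        """Apply another replica's log; return keys whose values conflicted."""
        conflicts = []
        for entry in entries:
            self.clock = max(self.clock, entry["stamp"][0])
            current = self.records.get(entry["key"])
            if current is not None and current["hash"] != entry["hash"]:
                conflicts.append(entry["key"])
            # Last Write Wins: the larger (counter, replica id) stamp is kept.
            if current is None or entry["stamp"] > current["stamp"]:
                self.records[entry["key"]] = entry
                self.log.append(entry)
        return conflicts
```

The list returned by `merge` is exactly where this approach stops being enough: it can tell you that a conflict happened, but the losing value is simply discarded unless you keep version history around.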

_It appears that Realm’s developers have been struggling to grasp the requirements of "sync"._ As you can see from this multi-year thread, getting everyone to agree what this might look like is pretty much impossible. If merely defining a sync feature that covers most use scenarios is difficult, then imagine the implementation. This may not be the best path.

I have been in and around the database business for 30 years (as a product provider and on the customer side) and have a few thoughts to share in this regard.

First, a faster SQLite clone with SQLite sync is not the answer. It's not much of an improvement and the majority will continue to use what they know.

Second, some leadership is required to move forward. The obvious source of that leadership should be Realm.

Finally, perhaps what is needed are some design patterns for syncing objects of known source, key and version. Something like a generalized Object subscribe and publish hub comes to mind here. This becomes a simple matter if Realm accepts and emits Objects in a published manner via a well-defined subscribe and publish Interface Definition. That way, projects can spring up that implement the interface. This changes the question from _“How does Realm do sync?”_ to _"How do you sync Realm to your back-end database?"_
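
To sketch what a generalized publish/subscribe hub for objects of known source, key and version could look like: the names `ObjectChange`, `ObjectHub`, `subscribe`, and `publish` below are hypothetical and not part of any Realm interface, and the version check is just one simple way to drop stale or duplicate updates.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectChange:
    source: str    # who produced the change, e.g. a device or a backend
    key: str       # primary key of the object
    version: int   # monotonically increasing per object
    payload: dict  # the properties being published


class ObjectHub:
    """Hypothetical hub: the database publishes, back-end adapters subscribe."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # type name -> callbacks
        self._latest = {}                      # (type name, key) -> version seen

    def subscribe(self, type_name, callback):
        self._subscribers[type_name].append(callback)

    def publish(self, type_name, change):
        """Deliver the change unless an equal or newer version was already seen."""
        seen = self._latest.get((type_name, change.key), -1)
        if change.version <= seen:
            return False
        self._latest[(type_name, change.key)] = change.version
        for callback in self._subscribers[type_name]:
            callback(change)
        return True
```

An adapter for SQL Server, MySQL, MongoDB or anything else would then be a subscriber callback that writes `payload` to its own store, and a publisher for changes flowing the other way.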

As an example, suppose a potential Realm user has a large legacy relational database and wants to use Realm on the client side. They most certainly do not want to sync everything to all devices. They want to filter objects and limit property projections in manners we could never know. Don't give them "Realm for Oracle". Give them a well-defined Interface Definition, supported by Realm and some best practice design patterns. These interfaces can then be implemented and used anywhere, e.g. Realm Server (non-existent but an interesting thought), SQL Server, MySQL, Oracle, Azure Tables, MongoDB and whatever else suits their fancy; maybe even those ugly old mainframe databases.

_In other words, don't solve the world’s problems! Just do what you do and facilitate the integration of Realm into the existing world. Let the market determine what that looks like._

Cheers, Tom Hebert, ObjEx, Inc.

I agree this is a complicated problem. In my opinion, the fundamental problem is one of correlating or mapping the client model to the server model. Realm has done a great job isolating the model on the client. So we need the ability to map the Realm client model to the server model.

We've solved the mapping problem here at Splicer, which enables "syncing" Realm db to structured backend data. This is handy for integrating mobile apps with legacy databases as Tom describes. Our solution also allows you to “sync objects”, “filter objects”, and “limit property projections”. And all of this happens using typesafe objects similar to Realm.

BTW, we are looking for developers to participate in our closed beta. Please email [email protected] if you are interested.

Aaron Evans
Founder of Splicer
http://splicer.io

For my own app (using the Xamarin version of Realm), I have been investigating how to synchronize data between multiple devices.
My requirements:

  • each device (smartphone, tablet, PC) must have its own local (full) copy of the data
  • each device can contain multiple databases, not all databases are synced in the same way. e.g.: database A is synced with device d1 and d2, database B is synced with device d2 and d3. So the sync mechanism must be customizable per-database.
  • 75% add, 10% remove, 15% edit : a data set that mostly grows over time
  • devices are mostly offline, sync is done when the app has network access
  • devices have p2p connections with each other and a connection to a central server

A recently published paper called "A Conflict-Free Replicated JSON Datatype" (https://arxiv.org/pdf/1608.03960v1.pdf) caught my attention, it offers a generic solution focused on the data. The algorithm works client side on JSON data and does not require network ordering guarantees for sync.

In my opinion, an ideal implementation would have sync support in the Realm core and would expose an interface allowing developers to subscribe to modification events (e.g. a send/receive queue for 'operations', as described in the paper). This allows developers to implement their own network layer and keeps the focus of Realm on the data itself, not on the infrastructure to exchange the data between devices. Third-party libraries and services could fill the gap by providing support for p2p, central server, or other kinds of sync.
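
The JSON datatype in the paper is far more involved than anything that fits in a comment, so the sketch below swaps in one of the simplest CRDTs, a two-phase set, just to show the property that matters here: replicas can exchange state in any order, any number of times, and still converge. `TwoPhaseSet` is a textbook structure, not something taken from the paper or from Realm.

```python
class TwoPhaseSet:
    """State-based two-phase set: an item can be added once and removed once."""

    def __init__(self):
        self.added = set()
        self.removed = set()  # tombstones; a removed item can never come back

    def add(self, item):
        self.added.add(item)

    def remove(self, item):
        # Only items this replica has seen can be removed.
        if item in self.added:
            self.removed.add(item)

    def merge(self, other):
        """Set union is commutative, associative and idempotent."""
        self.added |= other.added
        self.removed |= other.removed

    def items(self):
        return self.added - self.removed
```

This fits a data set that mostly grows; edits would need something richer (for example a register per field), which is where the history and conflict resolution questions come back in.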

@bmotmans that approach should work fine for cases where a "commit log" style CRDT works well. For situations where conflict resolution is needed, a commit log style CRDT might not be a good fit. There are lots of examples where CRDTs are used, like Mongo, Cassandra and most NoSQL databases. To do arbitrary conflict resolution, you actually need the history of the versions. I agree Realm should have some of this baked in, so that developers can focus on the network part without worrying about building their own commit log + sync. Building a bulletproof commit log that follows established CRDT theory isn't trivial or easy.

To follow up on my comment from a few months ago:

It seems that a lot of developers opt for the easy choice over the "correct" one, which is understandable considering real-world pressures and deadlines. If a product existed that acted as a mobile database and a sync engine with built-in conflict resolution, that would be incredible, measured by how little work I'd have to do to integrate that functionality into my applications. Without diving into the internals of such a framework, the selling value to developers is quite high: it offers high-parity features that one would normally have to spend a not insignificant amount of time creating.

That being said, libraries and frameworks, especially something as wide-reaching as Realm, should be designed with the simple made easy doctrine in mind. Ideally, Realm would be a database with no fluff or flair and other desirable features such as notifications, syncing, etc would exist as a set of composable plugins to the base system.

Hi Everyone,

Again, thanks for chiming in and waiting earnestly on our development. If any of you are still waiting to hear more about "sync" please feel free to reach out to me personally.

My email is zayn.[email protected]

Cheers.

Hi,

I’m very, very interested. Is there any news on this? 🙏

—
Massimo Biolcati
www.massimobiolcati.com


👍

Hi everyone! In case you missed it, today we announced the Realm Mobile Platform, our official product for synchronizing Realm data between devices.

https://realm.io/news/introducing-realm-mobile-platform/

Thanks again to everyone who expressed their interest in this issue. Please give Realm Mobile Platform a try and tell us what you think! :)
