Slate: Provide a way to verify that two schemas are identical

Created on 10 Oct 2018 · 9 Comments · Source: ianstormtaylor/slate

Do you want to request a _feature_ or report a _bug_?

Request a feature / discuss an idea.

What's the current behavior?

It would be nice to have a way to check that two Schemas are identical, even if they're not in the same JS process. Maybe through some kind of hash / signature?

I ran into this a while ago investigating collaborative editing. Two clients can perform operations that each result in a valid document, but turn the document invalid when those changes happen simultaneously. For example, each client simultaneously deletes a different paragraph in a two-paragraph document -- each delete is OK, but combine them and you end up with a document with no nodes.
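A minimal sketch of that two-paragraph example, using a toy document model (a list of paragraph strings) rather than Slate's real data structures. The `isValid` rule, the `transform` helper, and the `remove_node` operation shape are assumptions made up for illustration:

```typescript
// Toy model: a document is a list of paragraphs and is valid only if it
// contains at least one. This is NOT Slate's data model.
type Doc = string[];
type RemoveNode = { type: "remove_node"; index: number };

const isValid = (doc: Doc): boolean => doc.length >= 1;

function apply(doc: Doc, op: RemoveNode): Doc {
  return doc.filter((_, i) => i !== op.index);
}

// Adjust `op` so it still targets the same paragraph after `applied` has
// already run. Returns null when both operations removed the same node.
function transform(op: RemoveNode, applied: RemoveNode): RemoveNode | null {
  if (op.index === applied.index) return null;
  return op.index > applied.index ? { ...op, index: op.index - 1 } : op;
}

const start: Doc = ["first", "second"];
const opA: RemoveNode = { type: "remove_node", index: 0 }; // client A
const opB: RemoveNode = { type: "remove_node", index: 1 }; // client B

// Each delete on its own leaves one paragraph behind.
const onlyA = apply(start, opA);
const onlyB = apply(start, opB);

// Applying B's delete on top of A's (after transforming it) empties the document.
const opBAfterA = transform(opB, opA);
const merged = opBAfterA ? apply(onlyA, opBAfterA) : onlyA;
```

Here `isValid(onlyA)` and `isValid(onlyB)` both hold, while `isValid(merged)` does not.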

The easiest way I found to solve this problem was to have each client (and the server) silently normalize the document after applying their operations. To keep things consistent, that means that every client / server needs to normalize documents in the exact same way. We can't necessarily _enforce_ this, but if it was possible to detect when a client was out of sync, it could try to figure out a way to handle it -- reloading the page, maybe, or showing an error and blocking edits.

Anyway, that's a weird case, but during the discussion on Controllers, it seemed like it would be more possible for Schemas to get detached from a Value. This might help better keep track of them.

justinweiss

All 9 comments

@justinweiss thanks for bringing this up!

> The easiest way I found to solve this problem was to have each client (and the server) silently normalize the document after applying their operations.

Side note, but why does the server also need to be normalizing? I would have expected that the server was unaware of the schema, and just applied whichever operations came in from the client. (Although maybe that leaves open a "vulnerability" to accepting clients blindly?)

> It would be nice to have a way to check that two Schemas are identical, even if they're not in the same JS process. Maybe through some kind of hash / signature?

With 0.42 we've moved further from schemas actually being "objects" in their own right, and now they're much closer to just configuration that results in plugin middleware. I was thinking of going further in this direction over time, potentially even making slate-schema a plugin itself that is just a convenient way to declare them, but not even in core any more.

I'm not sure we're ever going to be able to get nice hashes out of schemas to compare them.

This might be something where we have to have versioning with version numbers, similar to how an API might be versioned.

ianstormtaylor on 10 Oct 2018

👍1

That makes sense, thanks!

> Side note, but why does the server also need to be normalizing? I would have expected that the server was unaware of the schema, and just applied whichever operations came in from the client. (Although maybe that leaves open a "vulnerability" to accepting clients blindly?)

At some point you probably want to keep a recent snapshot server-side, so you aren't rebuilding a Value from operations from the earliest point in history -- it's easiest if the server can keep a snapshot up-to-date as it receives operations. If the server never runs the operations to normalize a document, it will start receiving operations that refer to paths it doesn't have. For example, in the situation where both paragraphs of a document are removed, it won't be able to refer to any path, until it sees the next insert_node. So if a client inserted text into the new paragraph that was created by normalization, that insert_text operation would refer to a path that the server didn't have.
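To make the missing-path problem concrete, here is a rough sketch with a toy document (a list of paragraph strings). The operation shapes borrow the names `remove_node` and `insert_text` from the discussion, but the `apply` and `normalize` functions are hypothetical stand-ins, not Slate's implementation:

```typescript
type Doc = string[];
type Op =
  | { type: "remove_node"; index: number }
  | { type: "insert_text"; index: number; text: string };

function apply(doc: Doc, op: Op): Doc {
  if (op.index < 0 || op.index >= doc.length) {
    throw new Error(`no node at path [${op.index}]`);
  }
  if (op.type === "remove_node") {
    return doc.filter((_, i) => i !== op.index);
  }
  return doc.map((p, i) => (i === op.index ? p + op.text : p));
}

function applyAll(doc: Doc, ops: Op[]): Doc {
  let result = doc;
  for (const op of ops) result = apply(result, op);
  return result;
}

// Hypothetical schema rule: a document must have at least one paragraph.
function normalize(doc: Doc): Doc {
  return doc.length === 0 ? [""] : doc;
}

const initial: Doc = ["first", "second"];
const removals: Op[] = [
  { type: "remove_node", index: 0 },
  { type: "remove_node", index: 0 },
];
const typing: Op = { type: "insert_text", index: 0, text: "hello" };

// Client: applies both removals, silently normalizes, then types.
const client = apply(normalize(applyAll(initial, removals)), typing);

// Server that never normalizes: its snapshot has no paragraph to type into.
const serverSnapshot = applyAll(initial, removals);
let serverError = "";
try {
  apply(serverSnapshot, typing);
} catch (e) {
  serverError = (e as Error).message;
}
```

The client ends up with a single paragraph containing the typed text, while the non-normalizing server fails on the same `insert_text` because its snapshot is still empty.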

You could get around this by each client, as it normalizes, sending the normalizing operations to the server, eventually reaching each other client. In that case, though, you could end up with clients simultaneously normalizing in the same way, and sending duplicate operations. You could have two clients each delete a paragraph, and end up normalizing to N paragraphs :-)
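A small sketch of how the duplicates arise, again with a toy document model. The `normalizingOps` helper and the `insert_node` shape are assumptions for illustration only:

```typescript
type Doc = string[];
type InsertNode = { type: "insert_node"; index: number; text: string };

function apply(doc: Doc, op: InsertNode): Doc {
  return [...doc.slice(0, op.index), op.text, ...doc.slice(op.index)];
}

// Hypothetical schema rule: at least one paragraph. Returns the operations
// needed to repair `doc`, rather than the repaired document itself.
function normalizingOps(doc: Doc): InsertNode[] {
  return doc.length === 0 ? [{ type: "insert_node", index: 0, text: "" }] : [];
}

// Both clients have applied both deletes and now see an empty document,
// so each one independently produces the same repair.
const emptyDoc: Doc = [];
const fromA = normalizingOps(emptyDoc);
const fromB = normalizingOps(emptyDoc);

// If each repair is broadcast like any other operation, every peer
// eventually applies both of them.
let shared: Doc = emptyDoc;
for (const op of [...fromA, ...fromB]) shared = apply(shared, op);
```

After both repairs are applied, `shared` holds two empty paragraphs instead of one.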

I could absolutely be missing a better option, though!

justinweiss on 10 Oct 2018

Ah interesting. I feel like I’d go with the approach of letting the clients produce the N normalizations.

Do you only normalize server-side? Or do you have some way of identifying and not sending the normalizing operations to the server? Otherwise with latency don’t you have the same issue still but just with one extra client?

The edge case that scares me most is the infinite normalization loop. Does your server-side logic somehow prevent that?

ianstormtaylor on 10 Oct 2018

You can do different things in different phases:

- Before sending a set of operations to the server, you can normalize the Value normally, and send those normalizing operations the way you'd normally expect.
- When the server is finished transforming simultaneous operations and is ready to commit them, it can run normalization on the Value, and append normalizing operations to the saved operation (before distributing that operation + normalizing operations to other clients).
- If a client receives that set of operations without doing anything else, the resulting Value should be valid. But if it had simultaneous operations that the server hadn't seen yet (had to transform client side), it should also silently normalize.
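The server-side commit phase and the idle-client case could look roughly like this. It is only a sketch over a toy document model: `commit`, `normalizingOps`, and the operation shapes are hypothetical, and the operations passed to `commit` are assumed to be already transformed:

```typescript
type Doc = string[];
type Op =
  | { type: "remove_node"; index: number }
  | { type: "insert_node"; index: number; text: string };

function apply(doc: Doc, op: Op): Doc {
  if (op.type === "remove_node") return doc.filter((_, i) => i !== op.index);
  return [...doc.slice(0, op.index), op.text, ...doc.slice(op.index)];
}

function applyAll(doc: Doc, ops: Op[]): Doc {
  let result = doc;
  for (const op of ops) result = apply(result, op);
  return result;
}

// Hypothetical schema rule: at least one paragraph.
function normalizingOps(doc: Doc): Op[] {
  return doc.length === 0 ? [{ type: "insert_node", index: 0, text: "" }] : [];
}

// Server commit: apply the transformed operations, normalize the result,
// and append the normalizing operations to what gets saved and broadcast.
function commit(snapshot: Doc, transformed: Op[]): { snapshot: Doc; broadcast: Op[] } {
  const applied = applyAll(snapshot, transformed);
  const repairs = normalizingOps(applied);
  return {
    snapshot: applyAll(applied, repairs),
    broadcast: [...transformed, ...repairs],
  };
}

const initial: Doc = ["first", "second"];
const committed = commit(initial, [
  { type: "remove_node", index: 0 },
  { type: "remove_node", index: 0 },
]);

// A client with no pending operations just applies the broadcast.
const idleClient = applyAll(initial, committed.broadcast);
```

Because the repair travels with the committed batch, the idle client lands on the same one-paragraph document as the server without normalizing on its own. A client with pending operations would still need the silent normalization step described above.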

The idea I came in with is that if every document is eventually the same on all clients after all transformed operations have been applied, then every document, after normalizing itself, should also be the same on all clients.

Is infinite normalizations still a problem? I don't think I've ever run into it, except for when I would write normalizations and forgot normalize: false. But I also don't tend to use them super frequently.

justinweiss on 11 Oct 2018

> Is infinite normalizations still a problem? I don't think I've ever run into it, except for when I would write normalizations and forgot normalize: false. But I also don't tend to use them super frequently.

The local-only problem is gone, but I think there is a remote normalization edge case, where you have two clients with different versions of the schema present. If one had a schema which required zero nodes in the document, and the other required a single node, they'd normalize back and forth forever until one disconnected.
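A toy simulation of that back-and-forth, assuming two made-up normalizers that stand in for the disagreeing schema versions (neither is a real Slate schema). Each normalizer returns the same array when the document is already valid, so a changed reference means a correction was made:

```typescript
type Doc = string[];
type Normalizer = (doc: Doc) => Doc;

// Hypothetical schema versions that disagree about the document's size.
const requiresEmpty: Normalizer = (doc) => (doc.length === 0 ? doc : []);
const requiresOneNode: Normalizer = (doc) =>
  doc.length === 1 ? doc : doc.length === 0 ? [""] : doc.slice(0, 1);

// Let two clients take turns normalizing the shared document and count
// how many turns actually changed it, up to `maxRounds`.
function countCorrections(a: Normalizer, b: Normalizer, initial: Doc, maxRounds: number): number {
  let doc = initial;
  let corrections = 0;
  for (let round = 0; round < maxRounds; round++) {
    const next = (round % 2 === 0 ? a : b)(doc);
    if (next !== doc) corrections++;
    doc = next;
  }
  return corrections;
}

// Mismatched schemas: every single turn undoes the previous one.
const mismatched = countCorrections(requiresEmpty, requiresOneNode, ["x"], 10);

// Matching schemas: one correction, then the document is stable.
const matched = countCorrections(requiresOneNode, requiresOneNode, [], 10);
```

With mismatched schemas the correction count equals the round limit, which is the signature of a loop that would never settle. With matching schemas it stops after the first repair.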

(The real world edge case is going to be something much more subtle than document size, but it could be anything that happens to result in back-and-forth normalizations across schema versions.)

I think this would need to be solved by diligently versioning the editor, and then erroring out for clients (with some better UX) when they are sending operations on a previous version, requiring a refresh. I think this is how Dropbox Paper works when new versions of its servers are deployed.

ianstormtaylor on 11 Oct 2018

👍1

Closing, since I think versioning will be good enough for the specific case I was thinking about.

justinweiss on 14 Oct 2018

Would it be distasteful to have the server broadcast the schema that it expects connecting clients to use?

CameronAckermanSEL on 6 Nov 2018

@CameronAckermanSEL nope, that sounds like the way to do it. On connecting to the server they get some schema ID (however you want to define it) and send it along with their operations. And if the server updates the schema, it knows that certain clients are sending "old" operations and can tell them to reconnect/refresh or something.
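One possible shape for that exchange, as a sketch only. `createServer`, the message types, and the `"schema-v1"` style IDs are all hypothetical; how the ID is defined is left open, as noted above:

```typescript
type ClientMessage = { schemaId: string; operations: unknown[] };
type ServerReply = { status: "accepted" } | { status: "stale"; expected: string };

function createServer(initialSchemaId: string) {
  let schemaId = initialSchemaId;
  return {
    // Sent to a client when it connects.
    handshake: (): string => schemaId,
    // Called when the server starts expecting a new schema.
    updateSchema: (id: string): void => {
      schemaId = id;
    },
    // Accept operations only from clients on the current schema.
    receive: (message: ClientMessage): ServerReply =>
      message.schemaId === schemaId
        ? { status: "accepted" }
        : { status: "stale", expected: schemaId },
  };
}

const server = createServer("schema-v1");
const clientSchemaId = server.handshake();

const before = server.receive({ schemaId: clientSchemaId, operations: [] });

// The server moves to a new schema while the client is still connected.
server.updateSchema("schema-v2");
const after = server.receive({ schemaId: clientSchemaId, operations: [] });
```

The first message is accepted. The second comes back as `stale`, which is the client's cue to reconnect or refresh.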

ianstormtaylor on 6 Nov 2018

👍1

Yep -- my plan is to broadcast a "compatible version" when the client handshakes with the server, and display an error message / disable the editor if they don't match. Because I was also thinking that there are many other things that could make the client and server incompatible -- possible changes in how Slate applies operations across different versions, for example.

justinweiss on 8 Nov 2018
