Slate: Store paths instead of keys on selection data for `set_selection` operations

Created on 26 Jan 2018  路  15Comments  路  Source: ianstormtaylor/slate

Do you want to request a feature or report a bug?

Feature.

What's the current behavior?

Currently, set_selection operations store some selection details in the selection property, and other details in the properties property. selection is a Range, and only stores keys, not paths.

In order to support transforming selections against other operations, we need a way to store anchorPath and focusPath on Range -- either instead of, or in addition to, the key.

Here's what I'm trying to do with transformations:

  1. Convert operation.selection into an plain JS object
  2. Turn selection.anchorKey, selection.focusKey into anchorPath, focusPath
  3. Merge properties with my plain JS selection object (because properties might only contain the difference from selection)
  4. Transform my selection JS object against the incoming op
  5. Transform properties against the incoming op
  6. Turn the new anchorPath, focusPath on the selection object back into keys
  7. Attach the transformed objects back onto the set_selection operation

The problem happens in Step 6. In Step 2, to turn the original anchor/focusKey into a path, I can use the document on operation.value. But in Step 6, those transformed anchor/focusPaths only apply to the new document.

This usually isn't a problem -- you can get the new state by Operation.applying the incoming operation to the selection op's value. But the keys you get from that value are going to be different than the keys that eventually show up in the document, even though the paths will be the same. For example, if you call

value1 = Operation.apply(value, split_node_op)
value2 = Operation.apply(value, split_node_op)

the new node might have a different key in value1 and value2, even though it's the same node in each document.

The only way I can think of to fix this would be to have Range allow you to use paths, and either allow you to also set key, or have key be a calculated property based on path and value.

What do you think? I'm happy to take a stab at a PR, if we agree on a direction.

What's the expected behavior?

In a collaborative environment, keys can appear and disappear, but paths can be transformed and stay valid. In order to support both collaboration and undo / redo, set_selection needs to store both the old selection and the new selection as paths, instead of keys.

discussion

All 15 comments

Hey! I was just looking to do the same thing for #1402 (trying to serialize the operation history as JSON and not as ImmutableJS objects with memoized cached values). It seems like the selection operation should serialize it's selection property / Range, and when #1408 is implemented, theRange object will be switched to always having paths?

Taking another stab at solving this, I don't think it needs to serialize its Selection in order to be consistent with other operations.

What it _does_ need, though, is a way to store anchorPath and focusPath on Range -- either instead of, or in addition to, the key.

Here's what I'm trying to do with transformations:

  1. Convert operation.selection into an plain JS object
  2. Turn selection.anchorKey, selection.focusKey into anchorPath, focusPath
  3. Merge properties with my plain JS selection object (because properties might only contain the difference from selection)
  4. Transform my selection JS object against the incoming op
  5. Transform properties against the incoming op
  6. Turn the new anchorPath, focusPath on the selection object back into keys
  7. Attach the transformed objects back onto the set_selection operation

The problem happens in Step 6. In Step 2, to turn the original anchor/focusKey into a path, I can use the document on operation.value. But in Step 6, those transformed anchor/focusPaths only apply to the new document.

This usually isn't a problem -- you can get the new state by Operation.applying the incoming operation to the selection op's value. But the keys you get from that value are going to be different than the keys that eventually show up in the document, even though the paths will be the same. For example, if you call

value1 = Operation.apply(value, split_node_op)
value2 = Operation.apply(value, split_node_op)

the new node might have a different key in value1 and value2, even though it's the same node in each document.

The only way I can think of to fix this would be to have Range allow you to use paths, and either allow you to also set key, or have key be a calculated property based on path and value.

What do you think? I'm happy to take a stab at a PR, if we agree on a direction.

But what if the marks in selection are not serializable?

@zhujinxuan That's an interesting point. I think right now I'm fine if selection isn't serializable -- it's not being able to speak in terms of path that's causing a bigger problem. I'll update the first post and the title to make this clearer. Do you have any thoughts on my comment from earlier today?

I am thinking about that, though my solution is a bit awkward:

  1. at first stage, restore both path and key
  2. if getPath find not the right key, get by key, and throw a warning

If everything of undo is good, then the path should be no problem. But I am not confident about undo in other operations... By the warnings, we can know the bugs of other operations and find out whether undo is a good choice.

merge_node stores the position it will be merged into, so we wouldn't have to store text -- for example, if you were merging a text node into the previous text node, and the previous text node had a string of length 20, that merge_node operation's position would be 20. So you can do the calculation on numbers, instead of needing the text.

We would have to adjust the path when applying operations, in a similar way as merge_node is currently adjusting key. The biggest difference would be that if I'm merging a parent of mine into a previous node, I wouldn't have had to adjust key, but I would have to adjust path. This is true for almost all of the node-level operations, we'd have to do more cursor path adjustments, because more things affect the path than affect the key. (I already have a lot of these written, but it would be nice to have them inside Slate!)

@justinweiss not totally sure I follow, but I would absolutely welcome a PR that changed to represent ranges with paths instead of keys, if that's what you mean!

Cool, I'll take a look once I get some time.

What I mean by needing to do more cursor path adjustments:

With keys, if your selection is at path [2, 0], and you're merging path [1], you don't have to change anything. Your key stays the same. But if it's a path, your selection is now pointing at [1, 0]. So, it's more work, but I already have most of that figured out.

https://github.com/ianstormtaylor/slate/pull/1591

I find the path solution is not good. SlateJs memerize the result of getAncestors(...). If we use path, we cannot take advantage of the __cache and memoize in src/utils/memoize.js

@ianstormtaylor I tried to replace all updateNode with updateNodeAtPath. However, without the speed-up of your memoize function, the operations are slowed down. Then, I think the set_selection by path is perhaps not as good as it looks

@zhujinxuan I don't think that experiment alone tells us that using paths isn't a good solution. I think we'd be able to optimize it further, and we'd probably implement it slightly differently. And the "keys vs. paths" debate would affect many more code paths than just updating nodes, so we'd need to factor that in when talking performance. (If you weren't suggesting that, my bad, feel free to ignore!)

@ianstormtaylor It may sounds awkward, but I am thinking about whether the following solution is a good method:

  1. Provide isFast('methodName', key), for example isFast('getClosestBlock', key) This method can tell whether __cache has already get the cached result.
  2. Provide methods like {methodName}ByKeyOrPath, for example updateNodeByKeyOrPath that
  if (isFast(methodName, key)) {
     return this[methodName](key)
  }
  const cacheNode = this.getNodeAtPath(path);
   if (cacheNode && cacheName.key === key) {
       return this[methodName + 'AtPath'](path);
   }
   return this[methodName](key)

@zhujinxuan maybe that could be interesting down the road, but I think the first thing would be switching to paths and seeing if there are any places that get much slower. I still think (pure intuition, no data) that it'll end up being much faster overall. A few places might get slower but I don't think we'll notice them much, and we can handle them one-off.

@justinweiss I think overall I'd rather not try to do more funky stuff with paths sometimes and keys sometimes, and instead just pull the trigger on switching to paths as the primary references. And then keys can remain purely an in-memory thing for convenience while updating things I think. Would make OT stuff a lot easier if we could just completely ignore keys I think? (This is easier said than done though which I realize 馃槃 since it's a decently big change for someone to make.)

Yep, for OT I'm already completely ignoring keys 馃槃

@justinweiss did you make progress with this? This seems to match the issue I ran into related to https://github.com/ianstormtaylor/slate/issues/1873

Kind of. I have an implementation that's really closely tied to how I'm doing collaborative editing, but I haven't done any of the work yet to pull it out into Slate itself.

Was this page helpful?
0 / 5 - 0 ratings