Ckeditor5: Keep all elements in the `document` a.k.a. One API to rule them all

Created on 6 Mar 2017 · 19Comments · Source: ckeditor/ckeditor5

tl;dr: Newly created elements should be immediately added to the special root in the document to handle them the same way as we handle inserted elements.

I was thinking about it before the weekend and I still think it is a good idea, so I'm reporting it.

Problem

The idea arises reading an article about the flux architecture, where model changes only by the one change entry point. For sure, we are far from flux, but the engine has a similar concept: only operations/deltas, change model. Right? Well... not exactly. During the view to model conversion, we build a big part of the model without operations and then insert it into the model using one insert operation.

This is why we have overcomplicated API of inserting, with:

useless for the end user element.insertChildren(),
batch.insertChildren() which should be used if element is in the model,
writer.insertChildren() which should be used in conversion,
and some other API level, like InsertOperation with no real use-case for the end user.

See https://github.com/ckeditor/ckeditor5-engine/issues/678, https://github.com/ckeditor/ckeditor5-engine/issues/738.

Also, recently working with @oskarwrobel on https://github.com/ckeditor/ckeditor5-engine/pull/845 we realized that DocumentFragment (detached from the Document) can't contain markers collection, because makers are based on LiveRanges, LiveRanges needs change event to work and the change event is based on Deltas and Operations, but to modify document fragment you use writer, not batches. It also means that you can not create a LiveRange on the detached document fragment.

Now it's clear, but it was not obvious for me nor for @scofalik. Considering that we created this code, I can only imagine how complex it would be for others. And note that we are not talking about some low-level API, but about inserting children into an element and creating ranges.

Solution

The solution would be to insert everything into the document immediately after creating an element. Even now we have $graveyard where we move removed elements. We could put there, or to the similar root, any element (created, but not yet attached to the real parent).

You would create elements in the very similar way as in DOM, using: document.createElement. Then the batch/deltas API would be the only proper way to insert element. We could hide other methods and tell all users to use one, high-level API for the tree modifications.

It means also that markers and live ranges will work on every document fragment, what is nice.

Disadvantages

There are few.

First, we need 1 more (move) operation every time we add an element to the tree. We could make an exception for text nodes, allowing inserting texts directly into the tree as children, passing a string as a child.

Second, the compression during upload will be most probably less efficient, if we will send every insertion separately, but if it will be a problem we may not sync elements in the special root with new element because they will not be shared between users anyway. It might be tricky, but possible if needed.

Finally, garbage collection of these elements will be more tricky than for not attached elements, but:

it should not be very common that created element will not be later inserted into the document,
we need to work on the garbage collection for the graveyard anyway, so it could be handled together.

Summary

To sum up, I think that, event if the mechanism will be more complicated, it worth to have one mechanism, one API and all tree modifications working the same way, does not matter if an element is in the document or not.

engine

Source

pjasiun

Most helpful comment

One of the things which we need to fix here is that it should be possible to implement most of the simple features without importing anything from the engine. The goal is to unblock implementing features for existing builds.

This will also allow us to import certain build from CDN in tools like JSFiddle. Then we will be able to prototype features and examples to community questions in a convenient way.

szymonkups on 18 Oct 2017

👍3

All 19 comments

How does this solve an issue of the possibility to have LiveRange in DocumentFragment? DocumentFragment would have to be a special element, and might be created by document.createDocumentFragment() (this is not needed really, but may feel better than document.createElement( '$documentFragment' )).

How does this solve an issue of needing to use model writer? All algorithms from the model writer would have to be moved to other place anyway (I wouldn't be surprised if that was the place from which we extracted them in the first place). Not to mention that this does not have an impact on view writer.

How does this solve overcomplicated API? To me it looks like you want to simplify API by merging some methods together. We don't have overcomplicated API as long as we expose only the highest level to ther user. All that user needs to know now is:

Batch API to write features,
writers to write more complicated converters.

I don't see how those changes present a simpler API to the user. We will still have all the things that you mentioned no matter if we do the change that you described.

What I think is that:

creating a special root for newly created elements is a terrible idea, that brings complication just in the other place,
it's twice that many operations. twice that big history,
changes you proposed can be achieved without introducing that special tree.

We don't need a new tree to have document.createElement() and document.createDocumentFragment(). However at that point I don't see a reason to use document.createElement() instead of new Element() but I guess it looks better and we might need it in future. At least it links the element with the document instance and maybe that could be used for something.
We can change a bit LiveRange and DocumentFragment to enable LiveRanges on DocumentFragments. The first step would be to create DocumentFragments using the Document so they are linked together. DocumentFragments are already very similar to roots and we can have Positions in them and TreeWalker works properly in DocumentFragment. In fact, if DocumentFragment is linked with the document it is probably just a normal RootElement at that point. So maybe we should just get rid of DocumentFragments instead?
Synchronisation of DocumentFragments is a broad topic actually, and a few ideas come to my mind - we could either sync them, or not (but then we need some magic when inserting stuff from DocumentFragment to the document).
How are you going to use Batch API after those changes? You will have to move stuff from DocumentFragment to the document. It seems weird and in most cases you will want to insert the whole document fragment anyway. And in fact, moving from such document fragment would have to be treated as inserting, probably batch.move() would have to check it and create ReinsertDelta instead of MoveDelta. Keep in mind this is same no matter if we use the new root or not.
API - as I already explained, everything that is not important for users can be hidden from them. We just have to make good tutorials and documentation.

So this is my take on the issue.

Should we do it?
I think not.

What I see here is that @oskarwrobel had a problem with "view stamps" to markers conversion and you really wanted a possibility to make it easier to create LiveRanges when creating a view. This is the only problem that all of this solves right now. I am sure that it will be less time to create a nice solution for this problem, instead of spending time on big refactoring and probably introduce some problems that we don't see now.

scofalik on 6 Mar 2017

I will sum up my post again because I was changing my mind a bit when writing a post and now after talking with @pjasiun.

I agree with (real) problems stated in the original post:

Having two APIs to create model (Batch API, and model.writer).
No way to create LiveRange in DocumentFragment.

I just think that:

we should do it with as less messing in code as possible,
we should think whether we need it now (what current problems does it solve).

scofalik on 6 Mar 2017

When we will have just one API we could go further with changes:

rename methods like element.insertChildren to element._insertChildren and make them protected,
make model.wirter private tool for operations and transformation and rename it to model.operation.utils.

Then we will have only one public API. We could make it nicer. I think that we could even inject it to object so, instead of batch.insertChildren we will have element.insertChilden, to keep them where the user would look for it. Such method will create delta and do whatever is needed to create a proper change. Note that to do it we need to inject methods because they need to know about ranges and positions and element file can not include position/ranges because of the circular dependencies. But that's fine for me, even from the architectural point of view it makes sense to keep all API methods which creates deltas in one place (these methods will not change elements directly, but make such feeling).

But hey! We can do more. Usually, you add changes the recent batch. You may want to create a new batch, but it's an edge case that you will want to apply something to the historical batch (upload is the only case I can imagine right now). So instead of handling batch manually we could have document.currentBatch which will be used by these methods. If you need to apply changes to any other batch you need to change document.currentBatch. This way we will make common changes much simpler.

So, with all of these change, everything what developer will need to do will be:

document.newBatch(); // to start a new undo step

const pElement = document.createElement( 'p' );
divElement.appendChildren( pElement );
pElement.setAttribute( 'foo', true );

To improve performance document.createElement could accept parent as a second parameter so we will have an element create without additional move operation.

pjasiun on 7 Mar 2017

Don't get me wrong. I really like to current API, it's very nice... for us. It does exactly what methods say and there is the 1-1 relation between architecture and the API, no magic.

But then, I imagine that someone asks on the StackOverflow: "How I can insert a child element in the CKEditor5?"

And what we need to say now is: "It depends. Do you want to modify an element which is added to the document or not? If it's added you need to use Batch API, if not you need to use model writer. But remember that in the second case you can not use markers, LivePositions and liveRanges."
I can imagine that user may have no idea which API he should use in his case.

After proposed changes, we will be able to simply say: "Use document.createElement and element.appendChildren. And note that if you want to keep these changes in the separate undo step use document.newBatch(); before changes". It sounds much simpler.

pjasiun on 7 Mar 2017

I just got the question from @oskarwrobel:

Is it possible to use batch.insert on the detached document fragment?

If the experienced CKEditor 5 core developer needs to ask such question I can only imagine how confusion it will be for the community.

pjasiun on 9 Mar 2017

One more advantage of such change would be a separate change:insert event for each inserted element. Now, when I want to check if image element was inserted I need to use walker or dispatcher because a single change:insert may contain multiple elements.

pjasiun on 28 Mar 2017

Interesting. This would help with https://github.com/ckeditor/ckeditor5-list/issues/54 I presume.

Reinmar on 28 Mar 2017

Once again we came to the conclusion that this refactoring is needed. By hiding methods like insertChildren() we'll hide the need to use indexes. Instead, in 99% of the API the developer will work with offsets which will avoid the confusion.

Reinmar on 10 Jul 2017

👍1

One of the things which we need to fix here is that it should be possible to implement most of the simple features without importing anything from the engine. The goal is to unblock implementing features for existing builds.

See e.g.: https://github.com/ckeditor/ckeditor5/issues/555#issuecomment-336979305

So, we should have create*() methods for all the things in the model and check what else might be a problem.

Reinmar on 16 Oct 2017

One of the things which we need to fix here is that it should be possible to implement most of the simple features without importing anything from the engine. The goal is to unblock implementing features for existing builds.

This will also allow us to import certain build from CDN in tools like JSFiddle. Then we will be able to prototype features and examples to community questions in a convenient way.

szymonkups on 18 Oct 2017

👍3

I've been again thinking today about the problems with creating instances of various classes and our use of the instanceof operator... We either need to solve this situation using the create*() methods or get rid of the instanceof. However, the former solution is much nicer because it will allow us to reduce the number of imports a lot.

I see this topic as one of the most important issues with the engine (and the whole architecture) right now. It repeats in things like commands and UI, but is most serious here.

Reinmar on 19 Oct 2017

Notes from today's meeting:

Batch vs node's API – methods for changing the model should end up as batch API. Otherwise, we'd have serious problems to implement "applying attr to a range". Also, if we had all the methods in nodes then you'd have to pass the batch instance there anyway. This reduces the mental connection with the fact that changes are grouped in batches.
We were discussing the problem that if all created nodes will be automatically retained by the document, there may be memory leaks. Some algorithms are, for instance, creating intermediary nodes while performing some operations. Those algorithms need to be detached when needed. We can't make GC clearing them up automatically because we don't know when they are not needed anymore. Instead, we should expose detach() methods. We understand that it will be used only by developers aware of potential problems. However, potential memory leaks should be rather small so it's not a big deal.
We also need to clarify this situation in the view. There's the writer and node's API. The node's API needs to be hidden to match changes in the model.

Reinmar on 20 Oct 2017

After reviewing methods we have now on all API levels, I believe that this should be the set of methods we should expose:

batch.createText( text, attributes );
batch.createElement( name, attributes );
batch.createDocumentFragment();

batch.insert( item, itemOrPosition, offset );
batch.insertText( text, attributes, itemOrPosition, offset );
batch.insertElement( name, attributes, itemOrPosition, offset );

batch.move( range, targetPosition, offset );

batch.append( item, parent );
batch.appendText( text, attributes, parent );
batch.appendElement( name, attributes, parent );

batch.remove( itemOrRange );

batch.rename( element, newName );

batch.setAttribute( itemOrRange, key, value );
batch.setAttributes( itemOrRange, attributes );
batch.removeAttribute( itemOrRange, key );
batch.clearAttributes( itemOrRange );

batch.setMarker( markerOrName, itemOrRange );
batch.removeMarker( markerOrName );

batch.split( position );
batch.merge( position );
batch.wrap( range, elementOrString );
batch.unwrap( element );

Pair itemOrPosition + offset should works like in position:
https://github.com/ckeditor/ckeditor5-engine/blob/2924d529657241d926804aff9306f8a01e4ef0af/src/view/position.js#L315-L317

We should remove batch.register, making them batch methods.

Batch methods should be smart and useful, instead of being deltas contractors like they are now. The good example is current batch.weakInsert. It should be used always when text is inserted, but now both autoformat and link plugins use batch.Insert, what is incorrect. Instead of letting developers do such mistakes we should have one batch.Insert method which uses WeakInsert delta when text node is passed.

Note that in the new approach:

create method will really insert element in the $incubator root (or in the document fragment in $incubator if it is a text to prevent texts merging),
insert method will move the node to the target position.

Note that it is more efficient to use insertElement or insertText to insert node directly in the target position.

Document fragment will be now inherited from the element. It will be a special type of an element with no attribute, $incubator as a parent and $documentFragment as a name.

insert document fragment to any other parent than $incubator will move its children and remove document fragment.

Methods to remove from the public API (make them protected mostly):

node.setAttribute
node.setAttributesTo
node.removeAttribute
node.clearAttributes
element.clone
element.appendChildren
element.insertChildren
element.removeChildren
text.clone
text.data (setter)

pjasiun on 16 Nov 2017

create method will really insert element in the $incubator root (or in the document fragment in $incubator if it is a text to prevent texts merging),

In fact, we should change the insert operation not to merge text nodes if they are inserted in the $incubator. It will prevent let us avoid creating additional document fragments.

pjasiun on 16 Nov 2017

We realized that assumption that all created element will land in the document finally and we can already create them in the document (in $incubator), was wrong. When one call getData part of the model is copied and transformed. Such changes should not be part of the model, increase document version nor land in the history.

At the same time, we agreed that operations can be used to modify any model structure, it does not need to be a part of the document. This is why:

we keep the idea of the single API but give up the concept of $incubator, letting operations modify detached elements or document fragments,
document fragments should not inherit from elements anymore,
changes on elements which are not part of the document should not update document version nor be saved in the history.

pjasiun on 20 Nov 2017

If we want to keep document read-only, we should also make document.createRoot protected and introduce batch.createDocumentRoot( doc, [ elementName ], [ rootName ] ).

pjasiun on 20 Nov 2017

Also, all operations need to have root property, which should be checked here: https://github.com/ckeditor/ckeditor5-engine/blob/fc7da8035cdc9533f66ad5de66711e25cd2b385e/src/model/document.js#L166-L168

Changes on detached roots which are not document roots (graveyard or editing roots) should not increase version nor be added to the history.

pjasiun on 20 Nov 2017

Also, since create* methods do not need batch to be defined, they can be moved to document, for now:

document.createText( text, attributes );
document.createElement( name, attributes );
document.createDocumentFragment();
document.createRoot( [ elementName ], [ rootName ] ); //as it is now

Later, they will most probably land in the model.writer, see https://github.com/ckeditor/ckeditor5-engine/issues/1186.

pjasiun on 20 Nov 2017

Close via https://github.com/ckeditor/ckeditor5-engine/pull/1203. Details will be handled in follow-ups.

pjasiun on 19 Dec 2017

Was this page helpful?

0 / 5 - 0 ratings