Ckeditor5: Using is( elementName ) considered harmful

Created on 12 Jul 2020 · 13Comments · Source: ckeditor/ckeditor5

📝 Provide detailed reproduction steps (if any)

I've been recently trying to pass this SVG through the DOM \<-> view conversion:

<svg xmlns="http://www.w3.org/2000/svg" xmlns:wrs="http://www.wiris.com/xml/cvs-extension" height="21" width="17" wrs:baseline="16"><defs><style type="text/css"></style></defs><text font-family="NexusSerifWebPro" font-size="16" font-style="italic" text-anchor="middle" x="4.5" y="16">x</text><text font-family="NexusSerifWebPro" font-size="12" text-anchor="middle" x="12.5" y="9">2</text></svg>

Unfortunately, it throws at one point with:

foo.js:775 TypeError: Cannot read property 'charAt' of undefined
    at DomConverter._processDataFromViewText (boxes.js:70783)
    at DomConverter.viewToDom (boxes.js:70026)
    at DomConverter.viewChildrenToDom (boxes.js:70099)
    at viewChildrenToDom.next (<anonymous>)
    at DomConverter.viewToDom (boxes.js:70071)
    at DomConverter.viewChildrenToDom (boxes.js:70099)
    at viewChildrenToDom.next (<anonymous>)
    at DomConverter.viewToDom (boxes.js:70071)
    at HtmlDataProcessor.toData (boxes.js:47402)
    at stringifyViewItem (boxes.js:570)

The error comes from https://github.com/ckeditor/ckeditor5/blob/2f1568d/packages/ckeditor5-engine/src/view/domconverter.js#L960.

The issue is that the SVG uses a <text> element and that falls into this case:

https://github.com/ckeditor/ckeditor5/blob/2f1568d/packages/ckeditor5-engine/src/view/domconverter.js#L204-L209

Unfortunately, we made a very bad decision to allow checking element names with is( elementName ) instead of forcing a is( 'element', elementName ) notation.

Proposal

Unfortunately, I can't see an option to avoid introducing a breaking change.

Even if we'd replace all our is( elementName ) usage with is( 'element', elementName ) and add warning to is( elementName ) for cases where elementName != NODE_TYPE_ENUM to notify people about using a deprecated and unsafe API, the is( 'text' ) check that we would still have in our code (cause we're looking for a text node) would still make the SVG case fail.

To avoid a BC we'd need to introduce a new API for checking element type but that makes no sense as is() was created for primarily this reason.

Therefore, I'm for:

Completely removing support for is( elementName ).
Replacing all is( elementName ) usage with is( 'element', elementName ).
Optionally: Adding a console warning/error in is() methods when they are used with first param from beyond the NODE_TYPE_ENUM to help with migration. However, since it's fairly simple to find all is() usage, this would not be a big help.

If you'd like to see this fixed sooner, add a 👍 reaction to this post.

major engine 3 bug refactor

Source

Reinmar

Most helpful comment

@niegowski & @Reinmar I still find the RFC messy, so I'd try to summarize it in ADRish form (I've used a compilation from example templates). I've also compiled the solution and it sill might need some improvements (you're welcome to update below ADR)

Checking model/view element type and name

Context

The model/view nodes check for type and name:

text.is( 'text' );
text.is( 'model:text' );
element.is( 'tableCell' );
element.is( 'element', 'tableCell' );
element.is( 'model:element', 'tableCell' );

cause troubles if some view element has name that happens to be one of our element type (text, attributeElement, element, etc). The problem touches mostly the view, however nothing is preventing AFAIK defining 'element' in model's schema and conversion.

Decision drivers

performance - we already discovered performance gain while refactoring is() check - the new solution should not impact the performance
solid API design - we should spend more time discovering potential problems to not refactor this every now and then
DX - keep developer experience

Considered options

1. Special notation of text nodes in check :x:

Keep is check as now but change text nodes check:

text.is( '$text' )

for text nodes and keep current consistent form (already ditched as other elements types would also cause troubles).

2. A three part string for `is()` check

Special notation of is() argument: (model|view:)NODE_TYPE(:ELEMENT_NAME)

text.is( 'text' )
text.is( 'model:text' );
// no form for only "name check" element.is( 'tableCell' );
element.is( 'element:tableCell' );
element.is( 'model:element:tableCell' );

3. Two arguments `is()` check

Always require first argument as a type and optional second for name. Type argument can be prefixed with model: or view:.

text.is( 'text' )
text.is( 'model:text' );
element.is( 'element' );
element.is( 'element', 'tableCell' );
element.is( 'model:element', 'tableCell' );

Cons & Pros of the Options

1. Special notation

Pros:

Less refactoring.
In line with model schema declaration, differ, etc.

Cons:

Does only work for text nodes, doesn't solve element, attributeElement cases.
Minor breaking change.

2. Three part string for `is()` check

Pros:

By providing expected type (either by prefix or using 2 arguments) we avoid conflicts with text and maybe some other elements in future.

Cons:

It's breaking change but we don't see any better generic solution.
Might impact performance (depending on implementation)
Hard to explain in jsdoc (you'd need to read method docs to know how to construct string for this check).
Breaking change.

3. Two arguments `is()` check

Pros:

Clear definition of what the method check (no ambiguous typeOrName parameter).
No need to read jsdoc - only parameter names in most cases. Check for mode/view would still require to learn about special notation.

Cons:

More typing for element.is( 'tableCell' ).
Breaking change.

Recommendation

_(to be filled after discussion)_

jodator on 16 Jul 2020

❤2

All 13 comments

Therefore, I'm for:

Completely removing support for `is( elementName )`.

Replacing all `is( elementName )` usage with `is( 'element', elementName )`.

Optionally: Adding a console warning/error in `is()` methods when they are used with first param from beyond the `NODE_TYPE_ENUM` to help with migration. However, since it's fairly simple to find all `is()` usage, this would not be a big help.

My few loose notes on this, for now I don't have strong opinion on where to head with this:

What about model's API? We also have element.is() and from most of the usages I remember (outside engine) we check for element name, like element.is( 'image' ). (Con of removing this is a PITA in if( element.is( 'element', 'codeBlock ) ) .... I'm writing this because we have a pretty model/view symmetrical API and we tend to keep this in other places.

The other thing to consider is - why is() anyway - we already have .name property and it is not obvious which to use. (Pro for removing elementName support by completely removing this. However this would also means that .name cannot be used for "type" thus if BC is an option - why not introduce _type property just for node types.

I'd test the change with logging to the console in this part of API - I worry for a performance hit.

jodator on 13 Jul 2020

I was wondering if we could make different breaking change (possibly with smaller impact).

Currently in source files we have:

all is() calls: 339
is( 'text' ) calls: 63
is( 'textProxy' ) calls: 20

Also we use $text notation in:

Schema
Conversion ('insert:$text')
Differ entries
dev utils stringify() of model data

So maybe it would be more consistent, less breaking and still providing clean API if we replace is( 'text' ) with is( '$text' ) and also maybe is( 'textProxy' ) with is( '$textProxy' )?

niegowski on 13 Jul 2020

👍1

Unfortunately, I've got second doubts about is( '$text' ). Since this change will be breaking, it'd be perfect if it's THE_FINAL solution. We can't have THE_FINAL_2 in a year or week.

The problem is with is( 'attributeElement' ) or is( 'uiElement' ). I started thinking about these cases when reviewing https://github.com/ckeditor/ckeditor5/pull/7621/files.

We can't say for sure that no one uses <attributeElement> or <rawElement> in some specific cases. Actually, AFAIR, currently DomConverter does not maintain the letter case so you'd get <attributeelement> which would work, but there's a big chance that we'll have to fix that one day. And then, crashing at <attributeElement> would become a real threat.

What's more. There are more cases. <model:element> and it blows. <view:$text> perhaps as well. And more and more.

So, I'm again thinking about is( 'element', elementName ). It's just absolutely safe and future proof. Something, that we'll never have to revisit.

I know that your concern was that typing is( 'element', 'p' ) will be tiring, but to me it's an over exaggeration. Like with semicolons in JS. I'm for robustness rather than illusory usefulness. Also,

\.is\( '(?!element|text|selection|documentFragment|node|model|view|position|range|\w+Element|CKEditor)

gives me 150 results in src/ compared to 392 of the is( type scenario. So, we check the type more often anyway.

Reinmar on 15 Jul 2020

As discussed F2F we came to a solution that would use existing namespacing that we use already (i.e., is( 'model:element' ), is( 'model:node' )) so for elements we would allow (and require) notation: element:paragraph or model:element:paragraph.

But since the above would resolve the issue with <text> elements because it would require strict check for is( 'element:text') then there would not be original issue, but since we use $text notation in those places:

Schema
Differ
DowncastDispatcher - text attribute and insert:$text
dev-utils/model - setData and getData

Maybe we should also change Text#is( 'text' ) to Text#is( '$text' ) for consistency with mentioned places. We could also allow both notations (current one and new one).

niegowski on 15 Jul 2020

👍1

tlldr: To rephrase Kuba's comment, the notation would look like this: (model|view:)NODE_TYPE(:ELEMENT_NAME) and would be similar to event namespaces.

Reinmar on 15 Jul 2020

tlldr: To rephrase Kuba's comment, the notation would look like this: (model|view:)NODE_TYPE(:ELEMENT_NAME) and would be similar to event namespaces.

But also $text notation is a case to discuss.

niegowski on 15 Jul 2020

_Trying to put this into some more readable form._

Case 1

node.is( 'text' ) could return true for both Element with name 'text' and Text node. This is causing problems with SVG <text> nodes. Also the same conflict could happen for any other editor objects.

Currently in code we use is() in two scenarios:

Testing if object is an instance of one of editor classes (eg. element, selection, documentFragment, node, position, range...)
Testing name of Element (eg. paragraph, tableCell, p, td...)

Currently is() in some cases accepts prefixed argument like model:element, model:node, view:element. As Elements with specific name could be considered as virtual extension of generic Element so current implementation also accepts tableCell and model:tableCell.

Proposal

We could change implementation of Element#is() to always require argument if following form: (model|view:)NODE_TYPE(:ELEMENT_NAME) so NODE_TYPE would always be required and calls like element.is( 'tableCell' ) would become element.is( 'element:tableCell' ) (i guess current alternative notation element.is( 'element', 'tableCell' ) would be not changed. Only first argument would require type prefix if we want to check for specific element name.

Pros

By providing expected type (either by prefix or using 2 arguments) we avoid conflicts with text and maybe some other elements in future.

Cons

It's breaking change but we don't see any better generic solution.

This is RFC so maybe anyone could see come cons?

Case 2

At first it was prepared as changing Text#is() behavior to expect $text instead of text, this is consistent with mentioned above other places in editor (Schema, Differ, Conversion, model serialization). This would also avoid SVG text node issue. But this not solves problem of possible name conflict with any of the other build-in classes.

Pros

Code consistency with Schema, Differ, Conversion, model parser and serializer

Cons

Solves only case of text and there could be any other conflicts in the future.

Since this is considered "not enough" and SVG problem would be solved by case 1 then this change might be not needed. OTOH this would give more consistency in code. The question is if you see any cons for this change if we would push both cases at once. Both cases require breaking change so it would be better to push it once than in multiple iterations.

niegowski on 15 Jul 2020

Checking model/view element type and name

Context

The model/view nodes check for type and name:

text.is( 'text' );
text.is( 'model:text' );
element.is( 'tableCell' );
element.is( 'element', 'tableCell' );
element.is( 'model:element', 'tableCell' );

Decision drivers

performance - we already discovered performance gain while refactoring is() check - the new solution should not impact the performance
solid API design - we should spend more time discovering potential problems to not refactor this every now and then
DX - keep developer experience

Considered options

1. Special notation of text nodes in check :x:

Keep is check as now but change text nodes check:

text.is( '$text' )

for text nodes and keep current consistent form (already ditched as other elements types would also cause troubles).

2. A three part string for `is()` check

Special notation of is() argument: (model|view:)NODE_TYPE(:ELEMENT_NAME)

text.is( 'text' )
text.is( 'model:text' );
// no form for only "name check" element.is( 'tableCell' );
element.is( 'element:tableCell' );
element.is( 'model:element:tableCell' );

3. Two arguments `is()` check

Always require first argument as a type and optional second for name. Type argument can be prefixed with model: or view:.

text.is( 'text' )
text.is( 'model:text' );
element.is( 'element' );
element.is( 'element', 'tableCell' );
element.is( 'model:element', 'tableCell' );

Cons & Pros of the Options

1. Special notation

Pros:

Less refactoring.
In line with model schema declaration, differ, etc.

Cons:

Does only work for text nodes, doesn't solve element, attributeElement cases.
Minor breaking change.

2. Three part string for `is()` check

Pros:

By providing expected type (either by prefix or using 2 arguments) we avoid conflicts with text and maybe some other elements in future.

Cons:

It's breaking change but we don't see any better generic solution.
Might impact performance (depending on implementation)
Hard to explain in jsdoc (you'd need to read method docs to know how to construct string for this check).
Breaking change.

3. Two arguments `is()` check

Pros:

Clear definition of what the method check (no ambiguous typeOrName parameter).
No need to read jsdoc - only parameter names in most cases. Check for mode/view would still require to learn about special notation.

Cons:

More typing for element.is( 'tableCell' ).
Breaking change.

Recommendation

_(to be filled after discussion)_

jodator on 16 Jul 2020

❤2

After compiling the above ADR, my thoughts on this:

I worry that the 2 option is hard to explain in the jsdoc (you'd have to read the docs)
The 2 option must be checked for performance.
I'm for option 3 even though I was kinda against it before but taking attribureElement into consideration I think that it might be the best from the proposed.

jodator on 16 Jul 2020

❤1

Thank you, @jodator for writing it this way.

I'm ok with option 2, but I completely agree with you about its cons. Therefore, I also find 3, so the initial idea, a better option.

Reinmar on 16 Jul 2020

@jodator I like this ADR approach. I think this should become a standard in our project.

3 sounds good (lesser evil) to me.

oleq on 16 Jul 2020

@Reinmar & @niegowski I've found one place confusing a bit in the current/PR code.

It is about checking for Element being Node. The original is() check could check for is( 'node', 'div' ) AFAICS. The check for element.is( 'node' ) was removed by this comment: https://github.com/ckeditor/ckeditor5/issues/3979#issuecomment-280364928.

However, it looks like the check was later on brought back (ckeditor/ckeditor5-engine#1368) - probably due to foo.is( 'node' ) vs foo.is( 'position' ) checks (I didn't dig into it more).

So the current implementation (not the original one) https://github.com/ckeditor/ckeditor5/blob/master/packages/ckeditor5-engine/src/model/element.js#L118-L127:

will check:

element.is( 'node' ); // true
element.is( 'element' ); // true
element.is( 'element', 'name' ); // true
element.is( 'node', 'name' ); // false

Now, I'd like to have a confirmation that element.is( 'node', 'name' ) returning false is OK (nodes do not have name but OTOH if I would do node.is( 'node', 'name' ) would return true because the second parameter would be omitted by Node.is() check.

ps.: This topic Element vs Node comes back again - for me those are the same (AFAIR the Node uses code from Element).

jodator on 21 Jul 2020

To sum up, this change modifies arguments of is() methods of some model (Element, RootElement, Text, TextProxy) and view (Element, AttributeElement, ContainerElement, EditableElement, EmptyElement, UIElement, Text, TextProxy) classes.

Element name parameter

Before this change is() method allowed to omit element type argument and use only element name. This notation is no longer supported because of name vs type conflicts.

Model

Before:

element.is( 'image' );
rootElement.is( '$root' );

After:

element.is( 'element', 'image' );
rootElement.is( 'element', '$root' );
rootElement.is( 'rootElement', '$root' );

View

Before:

element.is( 'img' );
containerElement.is( 'div' );
emptyElement.is( 'img' );
attributeElement.is( 'b' );
editableElement.is( 'div' );
rootEditableElement.is( 'div' );
uiElement.is( 'span' );

After:

element.is( 'element', 'img' );
containerElement.is( 'element', 'div' );
emptyElement.is( 'element', 'img' );
attributeElement.is( 'element', 'b' ); 
editableElement.is( 'element', 'div' );
rootEditableElement.is( 'element', 'div' );
uiElement.is( 'element', 'span' );

Text and TextProxy type parameter

Before this change is() method expected 'text' / 'textProxy' as an argument but now it is changed to '$text' / '$textProxy' to be more consistent with Schema, Differ, etc. Note that old notation is still accepted for backward compatibility.

Before:

text.is( 'text' );
textProxy.is( 'textProxy' );

After:

text.is( '$text' );
textProxy.is( '$textProxy' );

niegowski on 29 Jul 2020

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Timeline to 1.0?

benjismith · 3Comments

Gulp watch tasks fails on temporary files created by IDE

jodator · 3Comments

The "Bold" icon is not centered

Reinmar · 3Comments

I want to set the default input font color. What should I do with this change?

pandora-iuz · 3Comments

Online builder for CKEditor 5?

wwalc · 3Comments

Ckeditor5: Using is( elementName ) considered harmful

📝 Provide detailed reproduction steps (if any)

Proposal

Most helpful comment

Checking model/view element type and name

Context

Decision drivers

Considered options

1. Special notation of text nodes in check :x:

2. A three part string for is() check

3. Two arguments is() check

Cons & Pros of the Options

1. Special notation

2. Three part string for is() check

3. Two arguments is() check

Recommendation

All 13 comments

Case 1

Proposal

Pros

Cons

Case 2

Pros

Cons

Checking model/view element type and name

Context

Decision drivers

Considered options

1. Special notation of text nodes in check :x:

2. A three part string for is() check

3. Two arguments is() check

Cons & Pros of the Options

1. Special notation

2. Three part string for is() check

3. Two arguments is() check

Recommendation

Element name parameter

Model

View

Text and TextProxy type parameter

Related issues

2. A three part string for `is()` check

3. Two arguments `is()` check

2. Three part string for `is()` check

3. Two arguments `is()` check

2. A three part string for `is()` check

3. Two arguments `is()` check

2. Three part string for `is()` check

3. Two arguments `is()` check