editor.getData() and copy the string.htmlEmbed.showPreviews to false.The editor crashes permanently.
Error from Chrome:
snippet.js:4 Uncaught (in promise) Error: Failed to execute 'setAttribute' on 'Element': 'picture-in-picture\"' is not a valid attribute name.
at b.viewToDom (snippet.js:4)
at b.viewChildrenToDom (snippet.js:4)
at viewChildrenToDom.next (<anonymous>)
at b.viewToDom (snippet.js:4)
at b.viewChildrenToDom (snippet.js:4)
at viewChildrenToDom.next (<anonymous>)
at b.viewToDom (snippet.js:4)
at o.toData (snippet.js:4)
at model (snippet.js:4)
at snippet.js:4
Error from Safari:
Unhandled Promise Rejection: CKEditorError: The string contains invalid characters.
Read more: https://ckeditor.com/docs/ckeditor5/latest/framework/guides/support/error-codes.html#error-The string contains invalid characters.
I was able to reproduce it only in docs, the manual test works fine for me.
If you'd like to see this fixed sooner, add a ๐ reaction to this post.
Example HTML:
<h2>CKEditor 5 on socials</h2><div class=\"raw-html-embed\"><blockquote class=\"twitter-tweet\"><p dir=\"ltr\" lang=\"en\">Keeping CKEditor 5 integrations tidy and free of bugs ๐ The new release for the official <a href=\"https://twitter.com/hashtag/Angular?src=hash&ref_src=twsrc%5Etfw\">#Angular</a> WYSIWYG editor component works with Angular 9.1 and above. <a href=\"https://t.co/Umag7F9dq0\">https://t.co/Umag7F9dq0</a> <a href=\"https://t.co/cIPy5fFSDn\">pic.twitter.com/cIPy5fFSDn</a></p>โ CKEditor Ecosystem (@ckeditor) <a href=\"https://twitter.com/ckeditor/status/1318575258153160707?ref_src=twsrc%5Etfw\">October 20, 2020</a></blockquote><script charset=\"utf-8\" src=\"https://platform.twitter.com/widgets.js\" async=\"\"></script></div><div class=\"raw-html-embed\"><iframe allowfullscreen=\"\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture\" frameborder=\"0\" src=\"https://www.youtube-nocookie.com/embed/DgM5DfAPQTI\" height=\"315\" width=\"560\"></iframe></div><p>Psst, we're tracking ya!</p><div class=\"raw-html-embed\"><script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) })(window,document,'script','https://www.google-analytics.com/analytics.js','ga'); ga('create', 'UA-XXXXX-Y', 'auto'); ga('send', 'pageview');ย </script></div>
It fails when we try to get raw HTML from the view element.
The HTML used to test is invalid. Our tools don't parse it correctly:

Everything works fine with clean structure:
<h2>CKEditor 5 on socials</h2>
<div class="raw-html-embed">
<blockquote class="twitter-tweet"><p dir="ltr" lang="en">Keeping CKEditor 5 integrations tidy and free of bugs ๐ The new release
for the official <a href="https://twitter.com/hashtag/Angular?src=hash&ref_src=twsrc%5Etfw">#Angular</a> WYSIWYG editor
component works with Angular 9.1 and above. <a href="https://t.co/Umag7F9dq0">https://t.co/Umag7F9dq0</a> <a
href="https://t.co/cIPy5fFSDn">pic.twitter.com/cIPy5fFSDn</a></p>โ CKEditor Ecosystem (@ckeditor) <a
href="https://twitter.com/ckeditor/status/1318575258153160707?ref_src=twsrc%5Etfw">October 20, 2020</a></blockquote>
<script charset="utf-8" src="https://platform.twitter.com/widgets.js" async=""></script>
</div>
<div class="raw-html-embed">
<iframe allowfullscreen="" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" frameborder="0" src="https://www.youtube-nocookie.com/embed/DgM5DfAPQTI" height="315" width="560"></iframe></div><p>Psst,
we're tracking ya!</p>
<div class="raw-html-embed">
<script>( function ( i, s, o, g, r, a, m ) {
i[ 'GoogleAnalyticsObject' ] = r;
i[ r ] = i[ r ] || function () {
( i[ r ].q = i[ r ].q || [] ).push( arguments )
}, i[ r ].l = 1 * new Date();
a = s.createElement( o ), m = s.getElementsByTagName( o )[ 0 ];
a.async = 1;
a.src = g;
m.parentNode.insertBefore( a, m )
} )( window, document, 'script', 'https://www.google-analytics.com/analytics.js', 'ga' );
ga( 'create', 'UA-XXXXX-Y', 'auto' );
ga( 'send', 'pageview' ); </script>
</div>
I'm wondering whether we should try to render invalid structures. Couldn't we display a container with an error message if something went wrong during parsing the input HTML?
I'm wondering whether we should try to render invalid structures. Couldn't we display a container with an error message if something went wrong during parsing the input HTML?
Yeah, our parser and HTML->View->HTML processing is a bit limited, so it's actually a good idea.
Also, which is a bigger problem, I've just realized that ideally, we should take the DOM structure directly from HtmlDataProcessor and avoid processing it to a view structure as it's going to:
I think it could be doable to have an instance of the DOM converter in upcast converters. It'd require some changes in DataController#parse() and UpcastDispatcher#convert().
This might also help with tricky cases like what MathType guys do. Moreover, with some more work, it could lead to a nice optimization โ we could configure the DomConverter to stop processing certain structures (e.g. inside Raw HTML embed and inside MathML).
cc @jodator
I think it could be doable to have an instance of the DOM converter in upcast converters. It'd require some changes in
DataController#parse()andUpcastDispatcher#convert().This might also help with tricky cases like what MathType guys do. Moreover, with some more work, it could lead to a nice optimization โ we could configure the
DomConverterto stop processing certain structures (e.g. inside Raw HTML embed and inside MathML).
The HtmlDataProcessor is already using DomConverter to convert a DOM tree to the view DocumentFragment.
In branch https://github.com/ckeditor/ckeditor5/compare/i/8323 I prepared a WIP of the approach, where we are able to register some elements that should be treated by the DomConverter (while converting from DOM to view) as their content would be a CDATA and convert it to a single view Text node. This solves the original issue and also allows us to keep code comments and original HTML structure (at least DomConverter is not affecting it, it still can get some changes while creating a view Text node from data generated by domNode.innerHTML).
The idea behind it is nice IMO. It is easy to extend AFAICS. As the POC looks OK, I'd expose this API on editor.data.processor probably (as in your comment).
For the solution:
_cdataElementMatcher name to something more verbose and a bit separated from the actual <![CDATA[. My worry here is that is a bit too cryptic.I like the idea very much :+1:
- I would change
_cdataElementMatchername to something more verbose and a bit separated from the actual<![CDATA[. My worry here is that is a bit too cryptic.
I'm also unsure whether using the cdata concept will convey the message to developers.
My other worry is โ where can we expose this API? If on the data processor, then only on XML/HTML ones? Or on the MD data processor too? How will a feature know whether it can use this API? By duck typing? Or will this be an abstract method of the base data processor class? Or should we move this API elsewhere?
My next worry is โ what about performance? If every node that's being converted from DOM to View is being checked with our Matcher it may have a significant impact on performance, even if this matcher has no rules. Could you check it? Can we secure us from this? Or maybe we should not use Matcher but a simple callback check?
3. A case for the MathML example would be nice later on.
:+1: cc @xaviripo
My other worry is โ where can we expose this API? If on the data processor, then only on XML/HTML ones? Or on the MD data processor too? How will a feature know whether it can use this API? By duck typing? Or will this be an abstract method of the base data processor class? Or should we move this API elsewhere?
I think that every processor should implement it. For example code blocks in markdown could be processed in this way (this could reduce code in some other places).
My next worry is โ what about performance? If every node that's being converted from DOM to View is being checked with our Matcher it may have a significant impact on performance, even if this matcher has no rules. Could you check it? Can we secure us from this? Or maybe we should not use Matcher but a simple callback check?
I'll check that, but by looking at the Matcher code I believe that it shouldn't affect performance (at least not to much).
- A case for the MathML example would be nice later on.
What do you need from our side?
My next worry is โ what about performance? If every node that's being converted from DOM to View is being checked with our Matcher it may have a significant impact on performance, even if this matcher has no rules. Could you check it? Can we secure us from this? Or maybe we should not use Matcher but a simple callback check?
I checked the performance in /tests/manual/performance/setdata.html and I don't see any change in the performance of DomConverter#domToView().
- A case for the MathML example would be nice later on.
What do you need from our side?
Nothing :) Just wanted to let you know that we'll work on such a PoC. Perhaps you'll be able to update your implementation accordingly.
Oh, my bad.
Yeah, this looks promising. It's been a while since I've taken a look at our plugin's code but IIRC we had some possible improvements that depend on this issue.
If Wiris can help somehow, just let me know.
Most helpful comment
I checked the performance in
/tests/manual/performance/setdata.htmland I don't see any change in the performance ofDomConverter#domToView().