Let me first thank you for the highly anticipated code block feature! Apart from this issue it seems to work great and covers all basics already.
Create empty code block:

Call textEditor.getData() and get <pre><code class="language-text-plain"> </code></pre><p>text</p>
Create new instance of CKEditor and call textEditor.getData('<pre><code class="language-text-plain"> </code></pre><p>text</p>') - i.e. with what .getData() previously returned.


Similar problem happens when code block contains "&", after which is returned from .getData() as & but then displayed literally from .setData().
If you'd like to see this fixed sooner, add a 👍 reaction to this post.
Hi! Great to hear that you like this feature :)
I'm not sure if we can consider it as a bug - it's a code block, so you want to display the code without executing it and HTML entities might be a part of the code.
However, the current inconsistent behaviour between getData and setData might be confusing. WDYT? @Reinmar @mlewand
Imagine I'm writing a blog post about programming so it has both text and code blocks. I save a draft to the database (so call .getData()) and another day I want to continue editing the draft (so it's loaded from DB and set with .setData()), but it's not what I saw/edited yesterday with those mangled entities.
Unless I'm missing something I can't fix this behavior in the host code in any reasonable way which makes me think this inconsistency is not just confusing, but a bug.
Doh, try this:
editor.setData('<pre><code class="language-text-plain"></code></pre><p>text</p>');
And then:
editor.setData(editor.getData())
And the second snippet will change the actual editor data (the thing that you can see in the editor).
So this is definitely a bug.
I would like to note that the impact of this bug is quite large for some use cases - some programming languages (e.g. Java, JavaScript, C++, PHP) use ampersand quite heavily and this bug makes it impossible to use code blocks for these ...
Just to second @zadam, as I've mentioned in my report @ https://github.com/zadam/trilium/issues/917, each save increases the text over time. So right now, in a particular bash snippet that started out containing &&, it now has morphed into &amp;amp;amp;amp;amp;amp;&amp;amp;amp;amp;amp;amp;!!
Have you a change to gives us an approximate date when this issue will be fixed?
Or givs a change wor a hotfix? I have many entries wir code snippeds and all thise are broken now
Here is a partial fix: https://github.com/neongreen/ckeditor5-code-block/commit/53d05418d00aff17468e682626cc59ac0697777a
Can be applied to and other entities too. Of course, a proper fix should replace all entities, not just three–four specific ones.
Hm... extractDataFromCodeElement() is odd. I don't know why we don't work on the view representation of the <code> element there. Such issues would not happen if we did.
cc @oleq, do you remember why you went this way? Was there a problem with the view structure or you just went for what was easier to code?
cc @oleq, do you remember why you went this way? Was there a problem with the view structure or you just went for what was easier to code?
It's been a while so it's hard to tell. It's probably related to the data processor we went with around here https://github.com/ckeditor/ckeditor5/blob/ec1de8c66dac28ace93930718f688c178c218090/packages/ckeditor5-code-block/src/converters.js#L186-L188.
It looks like extractDataFromCodeElement does the clean-up after the DP. Maybe we should use a different DP then?
I think we could remove the entire stringification part and then we wouldn't need the DP at all.
Some notes from my talk with @oleq:
<pre><code> element, when it's empty in the editor, must be outputted with either or \n to the data. The goal here is the same as for other blocks – to make sure that this element has height when rendered (so it matches what the user sees in the editor).toHtml( viewCode ) transformation, then extract the content of the <code> element and take the HTML that was inside (because it's HTML at this stage, not a plain text) and load it into the model. HTML is not plain text – in HTML entities are encoded (<, >, &) and they need to be de-encoded before being treated as a plain text. In the model, the content of the <code> element is a combination of text nodes and <softBreak> elements.So, the quick fix is as proposed before – to de-encode & to &.
The real solution is to not use the data processor here, but instead iterate over the view <code> element's content, look for text nodes and concatenate all them together.
I checked and this should work:
diff --git a/packages/ckeditor5-code-block/src/codeblockediting.js b/packages/ckeditor5-code-block/src/codeblockediting.js
index e819e08933..7e769d8824 100644
--- a/packages/ckeditor5-code-block/src/codeblockediting.js
+++ b/packages/ckeditor5-code-block/src/codeblockediting.js
@@ -128,7 +128,7 @@ export default class CodeBlockEditing extends Plugin {
editor.editing.downcastDispatcher.on( 'insert:codeBlock', modelToViewCodeBlockInsertion( model, normalizedLanguagesDefs, true ) );
editor.data.downcastDispatcher.on( 'insert:codeBlock', modelToViewCodeBlockInsertion( model, normalizedLanguagesDefs ) );
editor.data.downcastDispatcher.on( 'insert:softBreak', modelToDataViewSoftBreakInsertion( model ), { priority: 'high' } );
- editor.data.upcastDispatcher.on( 'element:pre', dataViewToModelCodeBlockInsertion( editor.data, normalizedLanguagesDefs ) );
+ editor.data.upcastDispatcher.on( 'element:pre', dataViewToModelCodeBlockInsertion( editor.data, editor.editing.view, normalizedLanguagesDefs ) );
// Intercept the clipboard input (paste) when the selection is anchored in the code block and force the clipboard
// data to be pasted as a single plain text. Otherwise, the code lines will split the code block and
diff --git a/packages/ckeditor5-code-block/src/converters.js b/packages/ckeditor5-code-block/src/converters.js
index 7ee4a6324f..2fdd5b7845 100644
--- a/packages/ckeditor5-code-block/src/converters.js
+++ b/packages/ckeditor5-code-block/src/converters.js
@@ -131,7 +131,7 @@ export function modelToDataViewSoftBreakInsertion( model ) {
* configuration passed to the feature.
* @returns {Function} Returns a conversion callback.
*/
-export function dataViewToModelCodeBlockInsertion( dataController, languageDefs ) {
+export function dataViewToModelCodeBlockInsertion( dataController, editingView, languageDefs ) {
// Language names associated with CSS classes:
//
// {
@@ -183,8 +183,11 @@ export function dataViewToModelCodeBlockInsertion( dataController, languageDefs
writer.setAttribute( 'language', defaultLanguageName, codeBlock );
}
- const stringifiedElement = dataController.processor.toData( viewChild );
- const textData = extractDataFromCodeElement( stringifiedElement );
+ const textData = [ ...editingView.createRangeIn( viewChild ) ]
+ .filter( current => current.type === 'text' )
+ .map( ( { item } ) => item.data )
+ .join( '' );
+
const fragment = rawSnippetTextToModelDocumentFragment( writer, textData );
writer.append( fragment, codeBlock );
@@ -234,14 +237,3 @@ export function dataViewToModelCodeBlockInsertion( dataController, languageDefs
}
};
}
-
-// Returns content of `<pre></pre>` with unescaped html inside.
-//
-// @param {String} stringifiedElement
-function extractDataFromCodeElement( stringifiedElement ) {
- const data = new RegExp( /^<code[^>]*>([\S\s]*)<\/code>$/ ).exec( stringifiedElement )[ 1 ];
-
- return data
- .replace( /</g, '<' )
- .replace( />/g, '>' );
-}
The change causes two failing tests, though.
But I think that they were wrong in the first place and that's the reason why they fail. If anyone loads editor.setData( '<pre><code><p>Foo</p></code></pre>' ); it's an invalid HTML.
CodeBlockEditing
data pipeline v -> m conversion
✖ should convert pre > code with HTML inside
Chrome 81.0.4044 (Mac OS X 10.15.4)
AssertionError: expected '<codeBlock language="plaintext">[]Foo<softBreak></softBreak>Bar</codeBlock>' to equal '<codeBlock language="plaintext">[]<p>Foo</p><softBreak></softBreak><p>Bar</p></codeBlock>'
+ expected - actual
-<codeBlock language="plaintext">[]Foo<softBreak></softBreak>Bar</codeBlock>
+<codeBlock language="plaintext">[]<p>Foo</p><softBreak></softBreak><p>Bar</p></codeBlock>
at Context.eval (webpack:///./packages/ckeditor5-code-block/tests/codeblockediting.js?:913:122)
✖ should convert pre > code tag with HTML and nested pre > code tag
Chrome 81.0.4044 (Mac OS X 10.15.4)
AssertionError: expected '<codeBlock language="plaintext">[]FooBarBiz</codeBlock>' to equal '<codeBlock language="plaintext">[]<p>Foo</p><pre>Bar</pre><p>Biz</p></codeBlock>'
+ expected - actual
-<codeBlock language="plaintext">[]FooBarBiz</codeBlock>
+<codeBlock language="plaintext">[]<p>Foo</p><pre>Bar</pre><p>Biz</p></codeBlock>
at Context.eval (webpack:///./packages/ckeditor5-code-block/tests/codeblockediting.js?:925:122)
FYI, I'm about to create a PR with changes proposed by @oleq with additional tests and docs.
PR that fixes OP is drafted at https://github.com/ckeditor/ckeditor5/pull/7240.
It still has one issue with dom-to-view conversion (details in the PR)
Closing as done, with smaller followup issue https://github.com/ckeditor/ckeditor5/issues/7259
Now, totally empty (with no & inside) code block gets rendered with one non-breaking space.
Most helpful comment
Have you a change to gives us an approximate date when this issue will be fixed?
Or givs a change wor a hotfix? I have many entries wir code snippeds and all thise are broken now