The new editor does not respect new lines in shortcode content. It's impossible to use the new editor to edit posts which depend on this. For me this is a blocker; it breaks my sites.
I have a number of shortcodes where each line in the content is expected to be processed separately. With Gutenberg v2.0.0 it's impossible to safely edit a post containing these shortcodes without the content being broken. Take for example the [bw_csv] shortcode which converts CSV content into a table.
[bw_csv]One,Two,Three
A,B,C
D,E,F
[/bw_csv]
During shortcode expansion the first line becomes the table heading.
Subsequent lines are table rows. So the expected output is like this.
One | Two | Three
--- | ---- | ----
A | B | C
D | E | F
When the new lines are removed by the editor the shortcode does not expand correctly. Everything is treated as the table heading.
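To make the failure concrete, here is a minimal JavaScript sketch of a line-oriented expansion. It is not the real [bw_csv] code: `csvToTable` is a hypothetical stand-in, and it assumes, purely for illustration, that the editor replaces each newline with a space.

```javascript
// Hypothetical stand-in for a line-oriented shortcode such as [bw_csv].
// The first non-empty line is the heading; every later line is a table row.
function csvToTable( content ) {
	const lines = content
		.split( '\n' )
		.map( ( line ) => line.trim() )
		.filter( ( line ) => line.length > 0 );
	const [ heading = '', ...rows ] = lines;
	return {
		heading: heading.split( ',' ),
		rows: rows.map( ( row ) => row.split( ',' ) ),
	};
}

const asTyped = 'One,Two,Three\nA,B,C\nD,E,F\n';
// Assumption: the editor replaces each newline with a single space.
const afterEdit = asTyped.replace( /\n/g, ' ' );

const intact = csvToTable( asTyped );
const broken = csvToTable( afterEdit );

console.log( intact.rows.length ); // 2
console.log( broken.rows.length ); // 0
console.log( broken.heading.length ); // 7
```

With the newlines gone there is only one line left to split, so every cell lands in the heading and no rows are produced.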
The basic problem can be demonstrated even without the shortcode being active.
In the current editor you can quite happily switch between Visual and Text and the site is not broken.
Once the content is converted the new lines in the original source are lost.
Repeat the above steps for another new post
I've also tried using different block types: Shortcode, Custom HTML.
I've been unable to find any existing block that respects my new lines.
Shortcode block should respect the content as typed.



During a direct discussion with @aduth regarding finding a solution to respect new lines in the content for my CSS block, Andrew commented
Hmm, it seems like the beautifier can be a bit aggressive. Maybe need some metrics to tell it to not perform (maybe if there's no tags? if a block opts out?)
I'm going to look into the options.
But first, I'd like to know why the function is needed at all?
For the shortcode block, replacing
return beautifyHtml( content, {
	indent_inner_html: true,
	wrap_line_length: 0,
} );
with
return content;
appeared, at first glance, to resolve my issues nicely.
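The two ideas floated above (skip beautification when a block opts out, or when the content contains no tags) could be combined roughly as follows. This is a sketch only: `serializeContent`, the `skipBeautify` flag and the injected `beautify` callback are hypothetical names, not Gutenberg APIs. The callback stands in for `beautifyHtml` so that the example runs without js-beautify.

```javascript
// Hypothetical helper: decide whether serialized block content should be
// passed through a beautifier or returned exactly as typed.
function serializeContent( content, { skipBeautify = false, beautify } = {} ) {
	// Heuristic (assumption): no "<tag>" or "</tag>" means this is not HTML.
	const hasTags = /<\/?[a-z][^>]*>/i.test( content );
	if ( skipBeautify || ! hasTags || typeof beautify !== 'function' ) {
		return content;
	}
	return beautify( content );
}

// Stand-in beautifier that collapses whitespace, to mimic the kind of damage
// described in this issue. Real code would call beautifyHtml here instead.
const collapsing = ( html ) => html.replace( /\s+/g, ' ' ).trim();

const shortcode = '[bw_csv]One,Two,Three\nA,B,C\nD,E,F\n[/bw_csv]';
const html = '<p>Hello</p>\n\n<p>World</p>';

const keptShortcode = serializeContent( shortcode, { beautify: collapsing } );
const tidiedHtml = serializeContent( html, { beautify: collapsing } );
const optedOut = serializeContent( html, { beautify: collapsing, skipBeautify: true } );

console.log( keptShortcode === shortcode ); // true
console.log( tidiedHtml ); // <p>Hello</p> <p>World</p>
console.log( optedOut === html ); // true
```

The tag check is deliberately crude; a shortcode whose content itself contains HTML would still be beautified unless the block opts out.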
Why do other blocks need it?
Related: #633
With work toward a custom serializer in #5897, I'm wondering if we ought to just build some basic beautification into the serializer, rather than a step which occurs after block output has been generated. This way we can have more awareness of what should and shouldn't have changes applied.
During WordCamp London contributor day I discussed this issue with @dmsnell. He suggested revisiting the reason that the HTML beautifier was implemented in #633.
So the original question stands... why do we need to invoke beautifyHtml at all?
I'll drop by here as well and share a brief thought on beautification, which actually I like. We have a fundamental problem with post_content as soon as we have shortcodes because they immediately break anything that operates on HTML. Something like jsbeautify is probably guaranteed to break in random unexpected ways because it thinks it has HTML when actually it doesn't - the same kinds of problems we have had with WordPress posts for years.
That is, if we want to beautify we probably need to build a custom beautifier and _attempt_ to preserve shortcodes. Frankly, this is probably also a losing venture since we cannot preserve shortcodes in the general sense because they are unspecified. However, we might be able to preserve shortcodes within a block and make a hard cut-off saying "no shortcodes may span block boundaries."
Actually this problem makes me want to cry because basically I can't see any practical way to beautify _and_ not destroy shortcodes. We could build another layer in the parser to detect _some_ shortcodes which wrap content (but we could not get them all) and then wrap those in something before beautifying to try to preserve them, but that also would add considerable computational complexity.
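For illustration, the "wrap those in something before beautifying" idea might look like the sketch below. It only recognises shortcodes of the form [name ...]...[/name] with a matching closing tag, so self-closing shortcodes and nested shortcodes of the same name are not handled, which is the "could not get them all" limitation in practice. `protectShortcodes`, `restoreShortcodes` and the placeholder format are made up for this example, and `collapse` is a stand-in for a real beautifier.

```javascript
// Matches [name attrs]...[/name]; the backreference \1 requires the same name.
const WRAPPING_SHORTCODE = /\[([A-Za-z][\w-]*)(?:\s[^\]]*)?\][\s\S]*?\[\/\1\]/g;

// Swap each wrapping shortcode for a placeholder and remember the original.
function protectShortcodes( content ) {
	const saved = [];
	const text = content.replace( WRAPPING_SHORTCODE, ( match ) => {
		saved.push( match );
		// Assumption: the beautifier leaves HTML comments untouched.
		return `<!--shortcode:${ saved.length - 1 }-->`;
	} );
	return { text, saved };
}

// Put the original shortcode text back, byte for byte.
function restoreShortcodes( text, saved ) {
	return text.replace(
		/<!--shortcode:(\d+)-->/g,
		( placeholder, index ) => saved[ Number( index ) ]
	);
}

// Stand-in beautifier (assumption) that collapses all whitespace.
const collapse = ( html ) => html.replace( /\s+/g, ' ' ).trim();

const post = '<p>Intro</p>\n\n[bw_csv]One,Two,Three\nA,B,C\n[/bw_csv]\n\n<p>Outro</p>';
const { text, saved } = protectShortcodes( post );
const output = restoreShortcodes( collapse( text ), saved );

console.log( output );
```

The extra pass over the content is the added cost mentioned above, and any shortcode the pattern misses is still exposed to the beautifier.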
I have a local branch which contains most of the work toward building beautification into the serializer, in a way which is not particularly opinionated (e.g. if newlines existed in a shortcode, the serializer would output them verbatim). The remaining work is largely edge cases, but the recurring pain point in working through them is an understanding of what we mean by "beautification".
As I understand it, the original issue #633 was targeted at what was at the time one large continuous string of text for the entirety of post content. The simplest solution may have been to add newlines between block demarcations. That is still not great for more complex markup; traditionally one would expect some indented hierarchy of elements to be most readable. But what are the rules for said indentation?
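As a rough sketch of that simplest option, the function below inserts a blank line between one block's closing comment delimiter and the next block's opening delimiter, and touches nothing inside a block. `separateBlocks` is a hypothetical name, and the pattern assumes delimiters of the form `<!-- wp:name -->` and `<!-- /wp:name -->`.

```javascript
// Hypothetical helper: put a blank line between adjacent blocks without
// altering anything inside a block's own content.
function separateBlocks( postContent ) {
	return postContent.replace(
		// A closing delimiter, optional whitespace, then an opening delimiter.
		/(<!-- \/wp:[a-z][a-z0-9/-]*\s+-->)\s*(?=<!-- wp:)/g,
		'$1\n\n'
	);
}

const continuous =
	'<!-- wp:paragraph --><p>Intro</p><!-- /wp:paragraph -->' +
	'<!-- wp:shortcode -->[bw_csv]One,Two\nA,B\n[/bw_csv]<!-- /wp:shortcode -->';

const separated = separateBlocks( continuous );
console.log( separated );
```

Because only the gap between delimiters is rewritten, the newlines inside the shortcode block survive as typed.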
js-beautify considers some objectives like ideal line length, which may be overkill for what we're trying to accomplish.
In my branch, the working naive implementation I adopted was "place on own line and indent non-inline descendant nodes", producing something like (from my own test cases):
const result = renderElement(
<section>
{ '\n\tString' }
<h2>Hello <em>World!</em></h2>
<span>On previous line</span>
<p>This is content.</p>
<div>
<span>Nesting:</span>
<p>Can get out of control.</p>
</div><pre><code>{ 'foo\nbar\t' }</code></pre>
</section>
);
<section>
String
<h2>Hello <em>World!</em></h2><span>On previous line</span>
<p>This is content.</p>
<div><span>Nesting:</span>
<p>Can get out of control.</p>
</div>
<pre><code>foo
bar </code></pre>
</section>
FWIW, another example of this is the [sourcecode] shortcode from the SyntaxHighlighter plugin, where you end up with many lines of code squished onto a single line.