Element-web: Can't escape > at the start of message with \ in the new composer

Created on 25 Oct 2019 · 16Comments · Source: vector-im/element-web

Description

The old composer sent \>hello as >hello, but now it sends \>hello.

Version information

Platform: web

For the web app:

Browser: Firefox
URL: self-hosted 1.5.0

Community PR bug defect composer

Source

lnicola

👍6

Most helpful comment

While having /plain as an option is fine when you plan on avoiding markdown in multiple places, having \ as an option to escape a single instance would be more efficient, and is used in other applications for that reason

BurnyBoi on 27 Oct 2019

👍7

All 16 comments

Use the /plain command to send plaintext

t3chguy on 25 Oct 2019

Thanks. Is this by design?

lnicola on 25 Oct 2019

BurnyBoi on 27 Oct 2019

👍7

backslash escaping should work fine.

ara4n on 27 Oct 2019

Noting that this seems to apply to any formatting characters at the beginning of a message

k80w on 28 Oct 2019

@dnaf Unless I'm missing something, it seems to apply to any formatting characters everywhere in a message.

lnicola on 28 Oct 2019

Markdown parsing seems quite odd/inconsistent now. Looks like backslash escapes only work if unescaped markdown is used elsewhere in the same message?

This makes it different from every other markdown/commonmark implementation I've ever seen, and having to use /plain specifically for the case where no other markdown is used, rather than the standard escape sequences that should work everywhere, feels very inconsistent to me.

The *quick* \*brown\* fox jumps over the lazy dog.

renders

The quick *brown* fox jumps over the lazy dog.

Which is fine.

The quick \*brown\* fox jumps over the lazy dog.

renders

The quick \*brown\* fox jumps over the lazy dog.

which is weird.

The quick* \*brown\* fox jumps over the lazy dog.

renders

The quick* \*brown\* fox jumps over the lazy dog.

which is weird.

BobVul on 21 Nov 2019

❤4

For the time being, switching to the old composer works for me.

If for some reason this is actually desired behaviour (??), maybe some form of guessing if the user intended to apply markdown, could it be made into a separate preference?

BobVul on 21 Nov 2019

This may be a separate issue but there all these do not work for escaping @room. Is this intentional? Should I create a new issue for that?

l-e-e-o on 3 Dec 2019

Escaping @room is not possible - it's a spec limitation.

turt2live on 3 Dec 2019

Since the old markdown parser has been removed completely, there is no longer any straightforward workaround. As before, requiring /plain in these cases is weird.

So there's a few things:

Should we rename this issue or create a new issue? It's clear that the issue is far wider-reaching than simply escaping a block quote. All completely-escaped messages are interpreted as non-markdown, thus showing the backslashes where they shouldn't.
Is there any desire to fix this? It's clearly a bug: part of the code thinks a fully-escaped message is markdown, while another part thinks it's not.
I'm happy to create a pull request, but I'm not sure what the scope of the change should be. I've already tracked down why this occurs, and a fairly sledgehammer-y approach at fixing it, but I don't know the full implications. I'd assume someone more familiar with the codebase would have a better idea.

It boils down to the serialisation not performing a Markdown-to-HTML conversion if it thinks the message is plaintext: https://github.com/matrix-org/matrix-react-sdk/blob/6f6d6b096a0e752f0ad649d70cd541008334c634/src/editor/serialize.js#L38-L44

Unfortunately, .isPlainText() relies on the CommonMark node type of text as seen in: https://github.com/matrix-org/matrix-react-sdk/blob/6f6d6b096a0e752f0ad649d70cd541008334c634/src/Markdown.js#L71-L92

The problem is CommonMark describes nodes as text after parsing escapes. It thinks \*test\* is *test* and a text node. Which is correct: the post-parse node is unformatted text. Unfortunately, Riot then takes that to mean the pre-parse input should be displayed as-is, which is incorrect.

I think this looks like just an optimisation, and the easy fix would be to disable the optimisation by not using .isPlainText() at all. Anything that goes through the Markdown path should always use .toHTML() or similar to get the post-parse output. This would not affect /plain as it does not go down this code path at all.

BobVul on 31 Jan 2020

👍1

I see now there's also a .toPlaintext() renderer that seems designed for this purpose exactly, and was used to fix a previous occurrence of the exact same issue in #2870. Is there any reason the same thing cannot be done here?

I'll make a PR...

BobVul on 31 Jan 2020

Looking more at the previous fix, it's a bit more complex than I expected... rather than providing post-parse text to the formatted body, it provides post-parse text to the unformatted body? Which means perhaps the new equivalent should be in either createMessageContent directly or in textSerialize. I'm not sure what the implications of changing textSerialize might be, though, and doing it this way means either the Markdown parsing needs to be lifted to a higher level or it needs to be run twice (with the associated performance impact, however minor). But lifting it to a higher level also means both the Send and Edit message composers need to be changed.

I guess I'll wait until someone more familiar with the composer can comment?

My main questions are:

Can we just have escaped-only markdown (currently interpreted as plaintext and leaking the backslashes through) send as HTML (formatted_body) as any other markdown does? Easy way out and people can still /plain to force an actual non-formatted one if that's needed for some reason. This would turn the fix to the issue into a one-line change.
- This makes sense to me, even though it might differ a bit from the previous fix's behaviour. Unformatted body can and should include the original source text, backslashes and all.
If not, then where would be the best place to make this change? Looks like it could be reasonably easy to change textSerialise at the cost of double-parsing the markdown. The toPlaintext() function is also documented as producing HTML if the input is actually formatted so it can't just be run in all cases.

https://github.com/matrix-org/matrix-react-sdk/blob/6f6d6b096a0e752f0ad649d70cd541008334c634/src/components/views/rooms/SendMessageComposer.js#L73-L82

https://github.com/matrix-org/matrix-react-sdk/blob/6f6d6b096a0e752f0ad649d70cd541008334c634/src/editor/serialize.js#L46-L61

BobVul on 31 Jan 2020

Had to revert https://github.com/matrix-org/matrix-react-sdk/pull/4008, so reopening the ticket.