Element-web: Nested mx-reply tags aren't handled properly

Created on 18 Jun 2020 · 16Comments · Source: vector-im/element-web

Description

Nested mx-reply tags aren't rendered properly - text from inside one of the mx-reply tags is rendered normally as if the person writing the message sent them

Steps to reproduce

Reply to a replied message with e.g. riot x which doesn't strip existing mx-reply tags from the fallback

Screenshot_2020-06-18_14-30-38

The person here only replied with "Blub?", yet it also puts "test2" in the body as if they said that, while they didn't.

{
  "content": {
    "body": "> <@sorunome:sorunome.de> > <@sorunome:sorunome.de> > <@sorunome:sorunome.de> könnte aber sein, dass wird en selben fail haben\n> > \n> > test1\n> \n> test2\n\nBlub? ",
    "format": "org.matrix.custom.html",
    "formatted_body": "<mx-reply><blockquote><a href=\"https://matrix.to/#/!wqRGMTABVWaWgpKFkY:famedly.de/$UNIEux2gtiFl-KE1Q9LpGD9ZtdwJeNYZqKp8xdsqUSc\">Als Antwort auf</a><a href=\"https://matrix.to/#/@sorunome:sorunome.de\">@sorunome:sorunome.de</a><br /><mx-reply><blockquote><a href=\"https://matrix.to/#/!wqRGMTABVWaWgpKFkY:famedly.de/$Q35oiMeJMHexsdIthrL7ig2G15QktZLeV41JEwiMadc\">In reply to</a> <a href=\"https://matrix.to/#/@sorunome:sorunome.de\">@sorunome:sorunome.de</a><br>> <@sorunome:sorunome.de> könnte aber sein, dass wird en selben fail haben\n\ntest1</blockquote></mx-reply>test2</blockquote></mx-reply>Blub? ",
    "msgtype": "m.text"
  },
  "room_id": "!wqRGMTABVWaWgpKFkY:famedly.de",
  "type": "m.room.message"
}

The reason for the m.relates_to missing in that json dump is that that example was in an e2ee room and thus m.relates_to is in the encrypted content, not the decrypted one.

While the spec (very non-clear, would need to be way clearer on that) doesn't allow nested mx-reply tags, this is handling invalid data improperly and could, at best, confuse someone and, at worse, make someone think that someone said something that they didn't

Version information

Platform: electron
OS: Arch Linuc
Version: 1.6.4

bug

Source

Sorunome

🎉2 👍2 👀1

Most helpful comment

Currently it treats it like a valid event and thus making things even weirder, though.

Sorunome on 18 Jun 2020

👍2

All 16 comments

As per the spec;

A special tag, mx-reply, may appear on rich replies (described below) and should be allowed if, and only if, the tag appears as the very first tag in the formatted_body. The tag cannot be nested and cannot be located after another tag in the tree

Read: cannot be nested - https://matrix.org/docs/spec/client_server/r0.6.1#m-room-message-msgtypes

If that is unclear, you may wish to open a matrix-doc spec clarification isssue

t3chguy on 18 Jun 2020

While the spec (very non-clear, would need to be way clearer on that) doesn't allow nested mx-reply tags, this is handling invalid data improperly and could, at best, confuse someone and, at worse, make someone think that someone said something that they didn't

Riot-web should still handle invalid data in a way that, at best, doesn't confuse people and, at worse, tricks people into thinking they sent stuff that they didn't. Heck, this could even be considered a security bug.

Sorunome on 18 Jun 2020

👍2

If it were to handle it, it'd refuse to render the reply entirely. Parsing these things is intentionally minimalistic for security reasons.

turt2live on 18 Jun 2020

That is what I just said in riot-web :D

t3chguy on 18 Jun 2020

Except counting won't work because people can always type the word "mx-reply". I'd be inclined to yell at clients which are sending invalid data as they aren't spec compliant.

turt2live on 18 Jun 2020

I guess also taking the spec at its word, Riot doesn't follow it exactly:

To strip the fallback on the formatted_body, the client should remove the entirety of the mx-reply tag.

(which would include the nested one)

t3chguy on 18 Jun 2020

I tried telling the RiotX team but then on the spot neither they or I could reproduce

t3chguy on 18 Jun 2020

I guess also taking the spec at its word, Riot doesn't follow it exactly:

To strip the fallback on the formatted_body, the client should remove the entirety of the mx-reply tag.

(which would include the nested one)

The spec assumes you have valid data.

turt2live on 18 Jun 2020

The fix would be to change the regex here https://github.com/matrix-org/matrix-react-sdk/blob/1f1f61377775014dfbe8303eb7d89d3ed6f5c520/src/components/views/elements/ReplyThread.js#L95 to be return html.replace(/^<mx-reply>[\s\S]+<\/mx-reply>/, '');, just saying.

Sorunome on 18 Jun 2020

That wouldn't work. Doing such without an xml parser to find the matching closing tag would be a PITA as there is nothing to stop someone having a stray </mx-reply> at the end of their message and now their entire message is hidden wrongly.

"<mx-reply>foo<mx-reply>bar</mx-reply></mx-reply> Their reply, what's an </mx-reply> anyway?".replace(/^<mx-reply>[\s\S]+<\/mx-reply>/, ""); yields ""

t3chguy on 18 Jun 2020

It's better a fix than having it render text as if someone sent it that they didn't send.

Sorunome on 18 Jun 2020

Its better to treat invalid events as invalid events :)

t3chguy on 18 Jun 2020

👎1

SO just not show the events at all? sure

Sorunome on 18 Jun 2020

👍2

Currently it treats it like a valid event and thus making things even weirder, though.

Sorunome on 18 Jun 2020

👍2

That wouldn't work. Doing such without an xml parser to find the matching closing tag would be a PITA as there is nothing to stop someone having a stray </mx-reply> at the end of their message and now their entire message is hidden wrongly.

"<mx-reply>foo<mx-reply>bar</mx-reply></mx-reply> Their reply, what's an </mx-reply> anyway?".replace(/^<mx-reply>[\s\S]+<\/mx-reply>/, ""); yields ""

In the formatted body such the trailing text would be typically html escaped and be </mx-reply>. Sure, some clients might not properly html escape, but that leads to even more issues on their end.

Sorunome on 18 Jun 2020

fixed by https://github.com/matrix-org/matrix-react-sdk/pull/5006