Mdx: A `|` in a code in a table breaks the table

Created on 1 Oct 2020  ·  13Comments  ·  Source: mdx-js/mdx

Subject of the issue

SSIA

Your environment

  • OS: macOS
  • Packages: @mdx-js/runtime v1.6.18
  • Env: node 14.2.0, yarn 1.22.4

Steps to reproduce

Sample:
https://codesandbox.io/s/brave-goldberg-qe2uv?file=/src/Content.mdx

Read the markdown below with mdx.

| title1 | title2               | title3                         |
|--------|----------------------|--------------------------------|
| foo    | `string | undefined` | this message should be visible |

Expected behaviour

The string | undefined is recognized as a code.

スクリーンショット 2020-10-01 23 41 20
(This capture is Visual Studio Code behavior)

Actual behaviour

The | in the code (`) is recognized as a table separator.

スクリーンショット 2020-10-01 23 43 37

https://codesandbox.io/s/brave-goldberg-qe2uv?file=/src/Content.mdx

⛵️ statureleased 🐛 typbug 👶 semvepatch 🗄 areinterface

Most helpful comment

GitHub doesn’t allow a pipe in code (text) in a table cell. You can escape it here on GitHub (note: escaping stuff in code normally doesn’t work, but GFM made an exception for pipes).

So, it seems we’re following GFM here. Could you check if escaping works?

All 13 comments

Appendix: GitHub Behavior

| title1 | title2 | title3 |
|--------|----------------------|--------------------------------|
| foo | string | undefined | this message should be visible |

oh...

GitHub doesn’t allow a pipe in code (text) in a table cell. You can escape it here on GitHub (note: escaping stuff in code normally doesn’t work, but GFM made an exception for pipes).

So, it seems we’re following GFM here. Could you check if escaping works?

@wooorm thanks for advising!

GFM Behavior

| title1 | title2 | title3 |
|--------|-----------------------|--------------------------------|
| foo | string \| undefined | this message should be visible |

MDX Behavior

https://codesandbox.io/s/sad-meadow-5iwcw?file=/src/Content.mdx

スクリーンショット 2020-10-02 0 07 26

escaping works, however \ is remaining.

Hmm. That’s annoying. Well, that’s indeed a bug!

Something similar was fixed in GFM plugin for Remark 3, how can this fix be "backported" to MDX? Any clue? https://github.com/syntax-tree/mdast-util-gfm-table/commit/9e0e488252e5d02dc49e3475e62fde1f2cb22808

That is not trivial, because MDX is made on the previous parser in remark, and this is for the new parser. Switching parsers is a serious breaking change.

There is a quick solution for this, as I understand it, it can be fixed in the mdxAstToMdxHast function, something like this:

      inlineCode(h, node) {
        return Object.assign({}, node, {
          type: 'element',
          tagName: 'inlineCode',
          properties: {},
          children: [
            {
              type: 'text',
-              value: node.value
+              value: node.value.replace(/\\(?=\|)/, '')
            }
          ]
        })
      },

You can test this regular expression on Regex 101 https://regex101.com/r/M8izhu/1

How acceptable/properly is this fix?

Btw, if I remember correctly, this was already solved in earlier remark. So if this is occurs here, it’s a bug here.

Your regex could be simplified by using \\| -> |.

It might be the fix we’re looking for, but it needs tests to confirm whether it is

I am tested on MDX Playground and this bug occurs there:

image

Your regex could be simplified by using \| -> |.

What do you mean, could you please send link to regex101?

It might be the fix we’re looking for, but it needs tests to confirm whether it is

This is what worries me the most, because it is not clear what to test (the current use case or also other different options for using inline code in table cell eg).

I meant:

-              value: node.value
+              value: node.value.replace(/\\\|/, '|')
            }

It’s a bit complex to explain, but this also doesn’t work, because your expression will match \\|, but it shouldn’t. See:
https://github.com/micromark/micromark-extension-gfm-table/blob/a16f719d781d24555c94b57bae65327d4c9279c6/html.js#L129

This is what worries me the most, because it is not clear what to test (the current use case or also other different options for using inline code in table cell eg).

This thread has a couple of examples, so at least those. Some more examples can be found here: https://github.com/micromark/micromark-extension-gfm-table/blame/a16f719d781d24555c94b57bae65327d4c9279c6/test/input.md#L169

Therefore, I used positive lookahead to avoid such issues (\\| should be \|, not / in case /\\\|/). It isn't wrong, is it?

Your regex still matches the literal markdown \\|, encoded in JavaScript as '\\\\|', because it doesn’t match the first slash, and does math the second slash plus pipe. Whereas according to GitHub, the slash slash matches, and the pipe then exits the cell.

I linked a couple fixtures in my previous comment. If we’re going to try and solve this, it would be good to test several edge cases!

Hi all! I’m going to close this as it landed on the main, the (now default) branch we’re using to gear up to the MDX@2 release.

The reason is so that it’s more clear which issues still need to be solved already and don’t require further work.

Expect another next release soonish, which might include it, or maybe it’s already released as a beta!

For more info, see https://github.com/mdx-js/mdx/issues/1041.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

loganmccaul picture loganmccaul  ·  13Comments

ChristianMurphy picture ChristianMurphy  ·  13Comments

johno picture johno  ·  53Comments

kylegach picture kylegach  ·  22Comments

pedrolamas picture pedrolamas  ·  14Comments