Gitea: Markdown links with whitespace can no longer be resolved in Wiki

Created on 10 Feb 2020  ·  7 Comments  ·  Source: go-gitea/gitea

  • Gitea version (or commit ref): 1.11.0
  • Git version: -
  • Operating system: -
  • Database (use [x]):

    • [x] PostgreSQL

    • [x] MySQL

    • [ ] MSSQL

    • [ ] SQLite

  • Can you reproduce the bug at https://try.gitea.io:

    • [x] Yes (provide example URL)

    • [ ] No

    • [ ] Not relevant

  • Log gist: -

Description

Markdown links to other wiki pages are no longer resolved if they contain whitespace.
https://try.gitea.io/3red3x/wikitest/wiki/Home
https://try.gitea.io/3red3x/wikitest/wiki/Development-database-snapshot

Screenshots

Raw link:
chrome_2020-02-10_22-59-45
On Gitea 1.10.2:
chrome_2020-02-10_22-59-34
On Gitea 1.11.0:
firefox_2020-02-10_23-00-04

reviewenot-a-bug

All 7 comments

I have an identical ticket #10156

In my example I use only two Markdown pages inside a directory with a space in its name, and when we follow the link between the two pages, Gitea changes the space (%20) to a + sign.

That's a different issue: in your case the entities are handled improperly when the link is constructed, but the link does get parsed. This issue tracks the case where Markdown links are not parsed or constructed at all when they contain a space and are a relative name rather than a URL.
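
To illustrate the %20 versus + distinction discussed above, here is a minimal sketch using Python's standard `urllib.parse`. It is not Gitea's code, and the file name is a made-up example; it only shows that the two encodings are not interchangeable in a URL path.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Hypothetical wiki file inside a directory whose name contains a space.
name = "dir name/page two.md"

# Path-style percent-encoding: spaces become %20, "/" is kept.
path_encoded = quote(name)

# Form-style encoding: spaces become "+", "/" becomes %2F.
form_encoded = quote_plus(name)

# Decoding a path leaves "+" as a literal plus sign, so a link built with
# form-style encoding no longer points at the name containing a space.
decoded_as_path = unquote("dir+name")
decoded_as_form = unquote_plus("dir+name")
```

Here `path_encoded` is `dir%20name/page%20two.md` and `form_encoded` is `dir+name%2Fpage+two.md`; only form-style decoding turns `+` back into a space.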

FWIW this is intentional as it is not allowed in the commonmark spec:

https://spec.commonmark.org/0.29/#link-destination

a sequence of zero or more characters between an opening < and a closing > that contains no line breaks or unescaped < or > characters, or

a nonempty sequence of characters that does not start with <, does not include ASCII space or control characters, and includes parentheses only if (a) they are backslash-escaped or (b) they are part of a balanced pair of unescaped parentheses. (Implementations may impose limits on parentheses nesting to avoid performance issues, but at least three levels of nesting should be supported.)

And we now use a new markdown library that follows that spec (as a feature).

You can find similar issues for other projects that follow commonmark (or use a library that does). So it's a regression in the sense it used to work when we used a library that didn't follow a standard, but if part of the goal of switching is to follow a standard then I suppose it is intentional now and a breaking change.

As per the spec, you can keep the page names with spaces; you just need to enclose the link destinations in <> brackets:

https://spec.commonmark.org/0.29/#example-486

So in your example it would need to look like:

[Development database snapshot](<Development database snapshot>)
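
For wikis with many such links, the rewrite could be automated. The following is a rough sketch, not part of Gitea: the function name `wrap_spaced_destinations` and both regular expressions are assumptions made for illustration. It handles only simple inline links, and it does not skip code spans or fenced code blocks.

```python
import re

# Simple inline links such as [label](destination). Deliberately limited:
# no nested brackets or parentheses, no reference-style links.
INLINE_LINK = re.compile(r"\[([^\]\n]*)\]\(([^()\n]*)\)")

# A destination followed by a quoted title, e.g. (page "Title"), is already valid.
HAS_TITLE = re.compile(r"""^\S+\s+(".*"|'.*')$""")


def wrap_spaced_destinations(markdown: str) -> str:
    """Wrap link destinations that contain spaces in <...>."""

    def fix(match: re.Match) -> str:
        label, dest = match.group(1), match.group(2).strip()
        # Leave the link alone if it has no space, is already wrapped,
        # or the space only separates the destination from a title.
        if " " not in dest or dest.startswith("<") or HAS_TITLE.match(dest):
            return match.group(0)
        return f"[{label}](<{dest}>)"

    return INLINE_LINK.sub(fix, markdown)
```

Calling it on `[Development database snapshot](Development database snapshot)` yields the bracketed form shown above; links without spaces pass through unchanged.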

Related: #10291

This could probably be closed as a breaking change that came with switching to a markdown parser that follows a particular spec (which we listed as a reason for switching to it). The comment above is the new "right" way to use short links with spaces according to CommonMark.

Agreed, but it should be documented as a breaking change.

Closing as this was the expected behavior. Docs will be updated to clarify the case in #10223.
