The Markdown stdlib has a lot of bugs (see e.g. https://github.com/JuliaLang/julia/issues?q=is%3Aopen+is%3Aissue+label%3Amarkdown where I labelled the ones I found from searching "markdown").
I wonder if we should just replace it with CommonMark.jl which seems to follow the spec better(?).
cc @MichaelHatherly @mortenpi
I think replacing the parser would, in principle, be great (both to get a better parser and to follow the CommonMark spec). However, just swapping out the parser would be breaking:
I am wondering if we'd need to create a system to support the two parsers and ASTs simultaneously? Maybe a macro + Compat.jl solution to tag a module to indicate that it should use the new parsers? And when fetching docstrings, it would fetch the existing AST by default, but could be converted to the new one?
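To make the module-tagging idea concrete, here is a minimal sketch of how a macro could record which parser a module wants. Every name in it (`ParserRegistry`, `@use_markdown_parser`, `parser_for`, the `:legacy` and `:commonmark` tags) is hypothetical and exists neither in Base nor in CommonMark.jl. It also ignores details such as precompilation and the actual AST conversion.

```julia
# Hypothetical sketch: none of these names exist in Base, Markdown, or CommonMark.jl.
module ParserRegistry

export @use_markdown_parser, parser_for

# Maps a module to the parser its docstrings should be read with.
const PARSER_CHOICE = IdDict{Module,Symbol}()

# Tag the module in which the macro is expanded.
macro use_markdown_parser(name::Symbol)
    return :(setindex!($PARSER_CHOICE, $(QuoteNode(name)), $(__module__)))
end

# Untagged modules fall back to the existing parser, so nothing changes for them.
parser_for(m::Module) = get(PARSER_CHOICE, m, :legacy)

end # module ParserRegistry

# A module that has not opted in.
module OldStyle
end

# A module that opts in to the new parser.
module NewStyle
using ..ParserRegistry
@use_markdown_parser commonmark
end

using .ParserRegistry
parser_for(OldStyle)  # :legacy
parser_for(NewStyle)  # :commonmark
```

Defaulting to the old parser is what would keep the change non-breaking under this assumption: only modules that opt in would see the new AST.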
> which seems to follow the spec better(?).
It currently passes all of the CommonMark spec (https://github.com/MichaelHatherly/CommonMark.jl/blob/8706bda516a053fc55643478b1208680847c6afb/test/runtests.jl#L4-L15), so it's a fair bit more compliant, but as with all specs there are probably some untested corner cases that haven't been exercised properly.
A proper public API for AST manipulation is also needed before making this a stdlib. Currently you have to dig into internal fields to do anything useful. It could possibly make use of AbstractTrees.jl, if that provides enough of an interface (I haven't looked into that).
With regards to actually replacing it, Morten is definitely right:
> I am wondering if we'd need to create a system to support the two parsers and ASTs simultaneously?
Definitely doable.
With all that said, here's a possible multi-stage replacement plan:
@domluna is using it in https://github.com/domluna/JuliaFormatter.jl, though I believe it's disabled by default there? So long as that package is being used by a good cross-section of users, we may pick up some good edge cases that parse incorrectly.
It's used for formatting docstrings, which is not on by default. It's also used to format Markdown files and the Julia code inside those files, which is also off by default. Having it on by default would probably cause a bit of confusion.
Once you all come up with a plan, I can help with stdlib surgery which is super annoying but also very uninteresting.
> I'm not planning on disappearing
Fool me once... ;)