Marked: Token type `paragraph` becomes `text` in master

Created on 2 May 2018 · 4 Comments · Source: markedjs/marked

Marked version: 0.3.19 master

Markdown flavor: CommonMark ?

Test Script

'use strict';

const marked = require('marked');

const md = `
A Paragraph.

> A blockquote

`;

const tokens = marked.lexer(md);
console.log(tokens);

const html = marked.parser(tokens);
console.log(`\n${html}`);

Expectation

The output from the old version, 0.3.6:

[ { type: 'paragraph', text: 'A Paragraph.' },
  { type: 'blockquote_start' },
  { type: 'paragraph', text: 'A blockquote' },
  { type: 'blockquote_end' },
  links: {} ]

<p>A Paragraph.</p>
<blockquote>
<p>A blockquote</p>
</blockquote>

Result

The output from the 0.3.19 master:

[ { type: 'text', text: 'A Paragraph.' },
  { type: 'space' },
  { type: 'blockquote_start' },
  { type: 'text', text: 'A blockquote' },
  { type: 'blockquote_end' },
  links: {} ]

<p>A Paragraph.</p>
<blockquote>
<p>A blockquote</p>
</blockquote>

Is this change intended?

L1 - broken

All 4 comments

v0.3.19 actually outputs the correct type; it is master that outputs 'text' instead of 'paragraph'.

looks like the culprit is https://github.com/markedjs/marked/commit/d08039e1f2cc87cd3ec9232f82deb538d956d3a9#diff-81ab0a5aa39b7a91951fc28d2d151496R36

Just adding the question mark back (the ? after the \n in the paragraph rule) seems to fix this issue. I'm not sure if it introduces new issues.

@Feder1co5oave was that a mistake or is there a reason the question mark was removed?

looks like all tests pass when I just add the question mark back

I added a PR that adds back the question mark and adds a test for this issue #1248

@vsemozhetbyt thanks for reporting this 💯

@UziTech it made sense to me at the time, because the types of token matched by the negative lookahead that follows (these are the ones capable of interrupting paragraphs) should be on a new line anyway, but it was a naive change.

The matching of one \n is implicit from the leading [^\n]+, which eats up everything up to the first newline. So in theory the ? could be stripped away, but in practice it cannot, because of the + quantifier at the end of that group.

paragraph: /^([^\n]+(?:\n(?!hr|heading|lheading| {0,3}>|<\/?(?:tag)(?: +|\\n|\/?>)|<(?:script|pre|style|!--))[^\n]+)+)/,
//                                                                                                             this ^

In the case of a one-line paragraph like the one in the reported example, the required newline (followed by some other [^\n]+ text on a new line) cannot be found, so the whole rule fails to match.
So you either put back the ? or you change the final + to a *; the grammar should be the same either way (the latter makes more sense to me at the moment).
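
To make the difference between the variants concrete, here is a minimal sketch in plain JavaScript. The three regexes are simplified stand-ins for the paragraph rule (the lookahead keeps only the blockquote alternative), and the names withOptionalNewline, withRequiredNewline, withStarQuantifier and firstParagraph are made up for this illustration; none of them are part of marked.

```javascript
'use strict';

// Simplified stand-ins for the paragraph rule discussed above.
// Only the blockquote alternative of the negative lookahead is kept.
const withOptionalNewline = /^([^\n]+(?:\n?(?! {0,3}>)[^\n]+)+)/; // \n? and trailing +
const withRequiredNewline = /^([^\n]+(?:\n(?! {0,3}>)[^\n]+)+)/;  // \n and trailing +
const withStarQuantifier = /^([^\n]+(?:\n(?! {0,3}>)[^\n]+)*)/;   // \n and trailing *

// Hypothetical helper: returns the paragraph text the rule captures, or null.
function firstParagraph(rule, src) {
  const match = rule.exec(src);
  return match ? match[1] : null;
}

const oneLine = 'A Paragraph.\n\n> A blockquote\n';
const twoLines = 'line one\nline two\n\n> A blockquote\n';

console.log(firstParagraph(withOptionalNewline, oneLine)); // 'A Paragraph.'
console.log(firstParagraph(withRequiredNewline, oneLine)); // null
console.log(firstParagraph(withStarQuantifier, oneLine));  // 'A Paragraph.'

// Multi-line paragraphs are captured by all three variants.
console.log(firstParagraph(withRequiredNewline, twoLines)); // 'line one\nline two'
```

With the required newline and the trailing +, a one-line paragraph produces no match at all, which is consistent with the lexer falling through to a 'text' token in the report.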

Also, I just noticed that the \\n after (?:tag) should really be a \n, it was a copy&paste mishap from the html rule above.
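
The \\n versus \n point can be checked in isolation. The sketch below uses <div as a stand-in for the tag placeholder, and all variable names are hypothetical. It assumes the html rule is assembled from a string, where the doubled backslash is needed; in a regex literal the same spelling matches a literal backslash followed by n.

```javascript
'use strict';

// In a regex literal, \\n means "a backslash followed by the letter n".
const literalWithDoubleBackslash = /^<div(?: +|\\n|\/?>)/;
// In a regex literal, \n means "a newline character".
const literalWithNewline = /^<div(?: +|\n|\/?>)/;
// In a string passed to RegExp, \\n is what produces \n in the pattern.
const fromString = new RegExp('^<div(?: +|\\n|\\/?>)');

const tagThenNewline = '<div\nclass="x">';
const tagThenBackslashN = '<div\\nclass="x">'; // contains a real backslash

console.log(literalWithDoubleBackslash.test(tagThenNewline));    // false
console.log(literalWithNewline.test(tagThenNewline));            // true
console.log(fromString.test(tagThenNewline));                    // true
console.log(literalWithDoubleBackslash.test(tagThenBackslashN)); // true
```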
