Pandoc: provide an option to strip comments

Created on 22 Nov 2015 · 12 Comments · Source: jgm/pandoc

As partially discussed in issue #2535, it would be extremely helpful to have an option to strip comments from generated documents.

@jgm suggests --strip-comments as the name of the option.

Markdown reader

All 12 comments

See also #1926.

Another alternative would be to strip specially marked comments by default. For example, comments beginning `<!---|` might be stripped, while others were treated as raw HTML and passed through. Would this be helpful? Would it be better than a command-line option?
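
To make the marker idea concrete, here is a minimal Python sketch that works on the Markdown source as plain text, before it reaches pandoc. It assumes the `<!---|` opener proposed in this thread; `strip_marked_comments` is a hypothetical helper name, not something pandoc provides.

```python
import re

# Assumption: "private" comments open with the marker "<!---|" and close with "-->".
# Ordinary "<!-- ... -->" comments do not match and are left alone.
MARKED_COMMENT = re.compile(r"<!---\|.*?-->", re.DOTALL)


def strip_marked_comments(text):
    """Remove only the specially marked comments from Markdown source text."""
    return MARKED_COMMENT.sub("", text)


sample = "Intro <!---| draft note --> text <!-- kept as raw HTML --> end."
print(strip_marked_comments(sample))
```

Because this is a text-level pass, it would also touch marked comments that appear inside code blocks; handling that properly would need support in the reader itself.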

Actually, were you aware that you can include comments in pandoc's Markdown using YAML metadata blocks? These _won't_ be passed through to the output in any way:

    Here's my document.

    ---
    # Comment here. Invisible
    # in the output
    ...

    the rest.

> See also #1926.
>
> Another alternative would be to strip specially marked comments by default. For example, comments beginning `<!---|` might be stripped, while others were treated as raw HTML and passed through. Would this be helpful? Would it be better than a command-line option?

A third kind of comment would be less helpful than removing comments with a --strip-comments option. (Although I think --keep-comments should be the alternative option and not the default.)

> Actually, were you aware that you can include comments in pandoc's Markdown using YAML metadata blocks? These _won't_ be passed through to the output in any way:

Well, these are much more complex comments and they don’t allow inline comments.

  1. I would much prefer to have special comments; comments intended to end up as comments in the final doc are a thing, and are distinct to me from comments that should not end up in the final doc.
  2. I would like the option of either processing the special comments or not. My system has tools that use pandoc to transform the document for internal use (i.e. normalize links), and also tools that transform the document for output for end users; the two kinds of comments have distinct behaviour in those cases.
  3. The original use case I have here is converting TikiWiki (not TWiki, confusingly) markup to markdown; TikiWiki has both comments that end up in the rendered HTML and comments that do not. The problem with making a TikiWiki Reader that handles the latter is, well, you can't; since YAML comment blocks have no representation in Native, there's no way to make a Reader that stores those comments for rendering into markdown. So both kinds of comments need to exist in Native, so they can be emitted into markdown (or any other languages that support both types of comments).

I would also like the moon and stars, please! :D Sorry, I know that's probably a lot of work, but figured I might as well throw my thoughts in. The YAML blocks cover most of my purposes, with some care to avoid losing them during processing.

I've been trying to find a way to not output HTML comments contained in the input markdown, when converting from markdown to markdown_strict or markdown to html.

I have also tried with filters, but couldn't find one that worked.
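
For reference, one way such a filter might look: a sketch that walks pandoc's JSON AST (what `pandoc -t json` emits) using only the Python standard library and drops raw HTML elements that look like comments. It assumes the Markdown reader represents HTML comments as `RawInline` or `RawBlock` nodes with format `html`; the function names are made up for illustration.

```python
import json


def is_html_comment(node):
    """True for a RawInline/RawBlock node with format "html" whose text starts an HTML comment."""
    if not isinstance(node, dict) or node.get("t") not in ("RawInline", "RawBlock"):
        return False
    fmt, text = node["c"]
    return fmt == "html" and text.lstrip().startswith("<!--")


def strip_comments(node):
    """Return a copy of an AST fragment with HTML comment nodes removed."""
    if isinstance(node, list):
        return [strip_comments(child) for child in node if not is_html_comment(child)]
    if isinstance(node, dict):
        return {key: strip_comments(value) for key, value in node.items()}
    return node


# Hand-written, simplified AST for illustration only.
doc = {
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "Visible"},
            {"t": "RawInline", "c": ["html", "<!-- inline note -->"]},
        ]},
        {"t": "RawBlock", "c": ["html", "<!-- block note -->"]},
    ]
}
print(json.dumps(strip_comments(doc)))
```

To run it as a real filter, a script would read the JSON document from stdin, pass it through `strip_comments`, and write the result to stdout, then be invoked with `pandoc --filter`. A paragraph that contained nothing but a comment is left behind as an empty `Para` in this sketch.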

You can use a YAML block with YAML comments:

    ---
    # My comment here
    # more here
    ...

It gets lost if you import into Pandoc, make changes, and then write
it out again, which makes it useless for my purposes.

What about inline comments made in the middle of a sentence? This is quite common when drafting, especially with several people involved at different times.

There are several ways to go. Perhaps we could get feedback here.

  1. Provide a special way to mark HTML comments in Markdown as "do not include as raw HTML in the output." I think I once suggested using `|`, as in `<!-- | do not include -->`.

  2. Provide a command-line flag --omit-html-comments.

  3. Provide a format extension +strip_html_comments, so you could do --from markdown+strip_html_comments.

I just noticed that, with the dev version of pandoc, there is already a way to include comments. The dev version allows you to include arbitrary "raw" code, marked with a format; it will only appear in formats that support that format. E.g. `\emph{hi}`{=latex} gives you raw latex.

Well, just use a nonexistent format, like comment!

    This is some text. `And this is an invisible
    comment.`{=comment}  More text.

The invisible comment will appear in the native and json versions of the output, but won't appear in any other formats.

    % pandoc
    This is some text. `And this is an invisible
    comment.`{=comment}  More text.
    ^D
    <p>This is some text.  More text.</p>
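
Since these raw elements survive in the JSON output, a small script could still pull them out, for example to list the notes left in a draft. A sketch, assuming the comment shows up in the JSON AST as a `RawInline` node with format `comment`; the sample AST is hand-written and simplified rather than real pandoc output, and `collect_raw` is a hypothetical helper.

```python
def collect_raw(node, fmt="comment"):
    """Yield the text of every RawInline/RawBlock node whose format equals fmt."""
    if isinstance(node, dict):
        if node.get("t") in ("RawInline", "RawBlock") and node["c"][0] == fmt:
            yield node["c"][1]
        else:
            for value in node.values():
                yield from collect_raw(value, fmt)
    elif isinstance(node, list):
        for child in node:
            yield from collect_raw(child, fmt)


# Hand-written, simplified AST for illustration only.
doc = {
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "Text."},
            {"t": "RawInline", "c": ["comment", "And this is an invisible comment."]},
            {"t": "RawInline", "c": ["latex", "\\emph{hi}"]},
        ]}
    ]
}
for note in collect_raw(doc):
    print(note)
```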

Hi John,

Thank you for looking at this issue. I believe Option 2 is the best for
most users.

↪ John MacFarlane / September 17, 2017 20:14:

> There are several ways to go. Perhaps we could get feedback here.
>
>   1. Provide a special way to mark HTML comments in Markdown as "do not include as raw HTML in the output." I think I once suggested using |; <!-- | do not include -->.

HTML comments are already inconvenient to write; adding “|” makes them
very, very difficult.

I am also not sure how useful it is to separate normal comments that
would end up in the output from special comments that wouldn't. This
seems to be a very special use case better addressed with filters and
adding <div class="comments"></div>.

>   2. Provide a command-line flag --omit-html-comments.

This looks like the ideal option for people who draft in markdown (and
use HTML comments) and then want to publish in html or markdown but, as
it is for a publication, do not want to include comments.

>   3. Provide a format extension +strip_html_comments, so you could do --from markdown+strip_html_comments.
>
> I just noticed that, with the dev version of pandoc, there is already a way to include comments. The dev version allows you to include arbitrary "raw" code, marked with a format; it will only appear in formats that support that format. E.g. `\emph{hi}`{=latex} gives you raw latex.
>
> Well, just use a nonexistent format, like comment!
>
> This is some text. `and this is an invisible
> comment`{=comment}.  More text.
>
> The invisible comment will appear in the native and json versions of the output, but won't appear in any other formats.

This looks quite complicated, and it would only work for people who
write specifically with pandoc in mind. So it will not help people who
write “normal” markdown with comments.

Thank you!
