Notebook: missing markdown features: image size and colspan

Created on 15 Feb 2016 · 17 Comments · Source: jupyter/notebook

So I'm trying to write a book in a notebook. I'd like the content to render in the notebook and also work with nbconvert. Currently nbconvert / pandoc strips all HTML when converting to LaTeX, which leads me to writing markdown.

However, markdown is fairly limited as far as I can tell. The two things I currently struggle with are image size and "complex" tables:

- There doesn't seem to be a way to specify the size of an image in markdown.
- There doesn't seem to be a way to do colspan or rowspan in markdown.

Please correct me if I'm wrong.

To be able to write "complex" documents like books, it would be great to either extend the markdown, or improve the handling of html when converting. I'm not sure which route is more promising.

See
https://github.com/ipython/ipython/issues/3503
and
https://github.com/jupyter/nbconvert/issues/241

amueller

👍3

Most helpful comment

I have exactly the same problem as @amueller with image sizes. I ended up writing a simple script with the following functions to convert between the ![](){width=..} and <img></img> syntaxes.

import re

# Pandoc-style image with an attribute block: ![alt](src){width=.. height=..}
re_mdimg = re.compile(r"(!\[([^\]]*)\]\(([^\)]*)\)\{(.+?)\})")
# Either <img ... /> or <img ...>alt</img>
re_imgtag = re.compile(r"(<img ([^>]*?)(?:/>|>(.*?)</img>))")


def mdimg_to_imgtag(para):
    """Rewrite ![alt](src){width=.. height=..} as an <img> tag with an inline style."""
    for (match, tag, link, opts) in re_mdimg.findall(para):
        w = re.search(r"width\s*=\s*(\S*)", opts)
        h = re.search(r"height\s*=\s*(\S*)", opts)
        width = "width: %s;" % w.group(1) if w else ""
        height = "height: %s;" % h.group(1) if h else ""
        style_opts = " style=\"%s%s\"" % (width, height) if w or h else ""

        if tag:
            # Keep the alt text between the tags so imgtag_to_mdimg can recover it.
            para = para.replace(
                match,
                "<img src=\"%s\"%s>%s</img>" % (link, style_opts, tag))
        else:
            para = para.replace(
                match,
                "<img src=\"%s\"%s />" % (link, style_opts))
    return para


def imgtag_to_mdimg(para):
    """Rewrite <img> tags as ![alt](src){width=.. height=..}."""
    for (match, opts, tag) in re_imgtag.findall(para):
        src = re.search(r"""src\s*=\s*["'](\S*?)["']""", opts)
        if not src:
            # No quoted src attribute: leave the tag alone instead of crashing.
            continue
        # Accept width="50%", width: 50%; and an unquoted value at the end of the tag.
        w = re.search(r"""width\s*[=:]\s*["']?(\S+?)(?:["'; ]|$)""", opts)
        h = re.search(r"""height\s*[=:]\s*["']?(\S+?)(?:["'; ]|$)""", opts)
        attrs = []
        if w:
            attrs.append("width=%s" % w.group(1))
        if h:
            attrs.append("height=%s" % h.group(1))
        para = para.replace(
            match,
            "![%s](%s){%s}" % (tag, src.group(1), " ".join(attrs)))
    return para

randy3k on 20 Feb 2016

👍2

All 17 comments

multimarkdown looks interesting btw: http://fletcherpenney.net/multimarkdown/

amueller on 15 Feb 2016

Hm so it looks like image sizing was added to pandoc recently, so it will work with nbconvert, but it doesn't look like it works in the notebook itself: https://github.com/jgm/pandoc/commit/244cd5644b44f43722530379138bd7bb9cbace9b

amueller on 15 Feb 2016

It appears that there's no mention of that syntax in the latest CommonMark spec.

takluyver on 15 Feb 2016

We could add a custom renderer to marked. Not sure if the notebook already is messing with marked or if you're trying to stay away from that: https://github.com/chjj/marked/issues/339

amueller on 15 Feb 2016

does CommonMark include tables at all?

amueller on 15 Feb 2016

Sample notebook: https://gist.github.com/fb654d5e66aed5db2ed9 demonstrating the new pandoc image attributes syntax.

A PDF produced with jupyter nbconvert --to=pdf: Untitled.pdf. In it, the image comes out at the reduced size.

In the HTML version, the size set in the image attributes is not applied.

TomAugspurger on 15 Feb 2016

I think the only place where we're messing with marked is to prevent it from parsing latex-y math expressions, so that MathJax can do its stuff with them after the markdown is rendered.

takluyver on 15 Feb 2016

I'm asking pandoc for help here: https://github.com/jgm/pandoc/issues/2716

having html be interpreted by pandoc when using nbconvert would solve my problems.

amueller on 15 Feb 2016

@minrk proposed to do markdown -> html -> target to get around the pandoc restriction.

amueller on 15 Feb 2016

Actually, that won't work: pandoc can't render colspan tables. If you try to convert an HTML table with colspan to LaTeX, the table layout is discarded. Without a colspan, the table is rendered correctly.
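Since a table with merged cells loses its layout on the way to LaTeX, it can help to find such tables before converting. Below is a minimal sketch using only the standard library; `SpanFinder` and `find_spans` are hypothetical names made up for this illustration, not part of pandoc or nbconvert.

```python
from html.parser import HTMLParser


class SpanFinder(HTMLParser):
    """Collect table cells that use colspan or rowspan (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.spans = []  # one (tag, attribute, value) tuple per spanning cell

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag in ("td", "th"):
            for name, value in attrs:
                if name in ("colspan", "rowspan") and value not in (None, "", "1"):
                    self.spans.append((tag, name, value))


def find_spans(html):
    """Return the spanning cells found in an HTML fragment."""
    finder = SpanFinder()
    finder.feed(html)
    finder.close()
    return finder.spans


plain = "<table><tr><td>a</td><td>b</td></tr></table>"
merged = ("<table><tr><th colspan='2'>header</th></tr>"
          "<tr><td>a</td><td>b</td></tr></table>")

print(find_spans(plain))
print(find_spans(merged))
```

An empty result means the table has no merged cells; any entry marks a cell worth checking by hand in the converted output.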

amueller on 15 Feb 2016

I'm struggling with exactly this issue right now. I can use an <img> tag to resize an image in markdown, but I would like to be able to print to PDF via LaTeX, and all the tags get stripped completely :(

ianhbell on 29 Apr 2016

👍1

@randy3k where/how do you implement this script when converting to pdf?

soamaven on 29 Jun 2016

@soamaven
You can export the notebook as .tex or .md, then apply the functions and then compile the latex via pandoc/latex.
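That workflow could be sketched roughly as follows. This is an illustration under assumptions, not randy3k's actual script: `img_to_pandoc` is a simplified stand-in for the thread's `imgtag_to_mdimg` (it only handles a quoted `src` and an optional quoted `width`), and `convert_file` plus all file names are hypothetical.

```python
import re
from pathlib import Path

# Matches <img ...> and <img ... />; group 1 holds the attribute text.
IMG_TAG = re.compile(r'<img\s+([^>]*?)\s*/?>')


def img_to_pandoc(text):
    """Rewrite <img src="..." width="..."> as ![](src){width=...}."""
    def repl(m):
        attrs = m.group(1)
        src = re.search(r'src\s*=\s*["\']([^"\']*)["\']', attrs)
        if not src:
            return m.group(0)  # leave tags without a quoted src untouched
        width = re.search(r'width\s*=\s*["\']([^"\']*)["\']', attrs)
        suffix = "{width=%s}" % width.group(1) if width else ""
        return "![](%s)%s" % (src.group(1), suffix)
    return IMG_TAG.sub(repl, text)


def convert_file(src_path, dst_path, convert=img_to_pandoc):
    """Read an exported Markdown file, apply `convert`, and write the result."""
    text = Path(src_path).read_text(encoding="utf-8")
    Path(dst_path).write_text(convert(text), encoding="utf-8")


print(img_to_pandoc('Intro <img src="figs/plot.png" width="50%" /> outro'))
```

With a notebook called book.ipynb (a made-up name), the steps would be: jupyter nbconvert --to markdown book.ipynb, then convert_file("book.md", "book_fixed.md"), then pandoc book_fixed.md -o book.pdf.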

randy3k on 29 Jun 2016

Has there been any action on the basic issue of allowing image scaling directly in markdown?

cekees on 6 Apr 2017

I don't think so, since that would require a different dialect of markdown. Depending on what your target is, you might be able to use custom nbconvert preprocessors to achieve this, though.
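To give an idea of what such a processing step would do, here is a minimal sketch that works directly on the notebook's JSON structure with the standard library. It is not the nbconvert preprocessor API; `map_markdown_cells` and `add_width` are hypothetical names, and inside nbconvert the same per-cell logic would live in a custom preprocessor instead.

```python
import copy
import json


def map_markdown_cells(notebook, convert):
    """Return a copy of `notebook` with `convert` applied to each markdown cell."""
    result = copy.deepcopy(notebook)
    for cell in result.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # A notebook file stores source as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        cell["source"] = convert(source)
    return result


# Tiny in-memory notebook standing in for json.load() on a real .ipynb file.
nb = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Title\n", "![](fig.png)"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": "print('![](fig.png)')"},
    ],
}


def add_width(text):
    """Hypothetical converter: attach a pandoc width attribute to one image."""
    return text.replace("![](fig.png)", "![](fig.png){width=50%}")


converted = map_markdown_cells(nb, add_width)
print(json.dumps(converted["cells"][0], indent=2))
```

Only markdown cells are rewritten; code cells and the original notebook dict are left unchanged.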

amueller on 6 Apr 2017

The problem with using an HTML tag is that it doesn't support the new attachment type, where you can drag and drop or paste an image from the clipboard into a markdown cell. The HTML tag only works with local image files and links to images.
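One possible workaround, sketched here under assumptions, is to write a cell's attachments out as ordinary files so that they can be referenced like any other local image. The sketch assumes the nbformat 4 layout, where a markdown cell's "attachments" entry maps a file name to a dict of MIME type to base64 data and the source refers to it as (attachment:name). `extract_attachments` is a hypothetical helper, not a notebook or nbconvert function.

```python
import base64
import re
import tempfile
from pathlib import Path

# Matches the target of an image reference such as ![alt](attachment:plot.png)
ATTACHMENT_REF = re.compile(r"\(attachment:([^)\s]+)\)")


def extract_attachments(cell, out_dir):
    """Write a cell's attachments to `out_dir` and return the rewritten source."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    attachments = cell.get("attachments", {})
    for name, bundle in attachments.items():
        if not bundle:
            continue
        # Use the first stored representation of the attachment.
        data = next(iter(bundle.values()))
        (out_dir / name).write_bytes(base64.b64decode(data))

    source = cell.get("source", "")
    if isinstance(source, list):
        source = "".join(source)

    def repl(m):
        name = m.group(1)
        if name not in attachments:
            return m.group(0)  # unknown attachment: leave the reference as is
        return "(%s)" % (out_dir / name).as_posix()

    return ATTACHMENT_REF.sub(repl, source)


payload = b"not really a PNG"  # placeholder bytes, not a valid image
cell = {
    "cell_type": "markdown",
    "metadata": {},
    "attachments": {"plot.png": {"image/png": base64.b64encode(payload).decode()}},
    "source": "![plot](attachment:plot.png)",
}

with tempfile.TemporaryDirectory() as tmp:
    new_source = extract_attachments(cell, Path(tmp) / "images")
    saved = (Path(tmp) / "images" / "plot.png").read_bytes()

print(new_source)
```

Once the image is a file on disk, the reference can be given a size with either of the syntaxes discussed in this thread.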